kl_nonneg
plain-language theorem explainer
Kullback-Leibler divergence is nonnegative for any pair of probability distributions on a finite set. Workers in variational inference and statistical mechanics cite this result when bounding free energies. The argument rearranges the standard log inequality, cancels the normalization sums, and invokes linear arithmetic on the resulting expressions.
Claim. For probability distributions $p$ and $q$ on a finite index set $ι$, where each assigns strictly positive values summing to one, the Kullback-Leibler divergence satisfies $0 ≤ ∑_i p_i log(p_i / q_i)$.
background
In the module deriving variational free energy from the recognition composition law, a ProbDist on finite $ι$ is a structure whose prob function is strictly positive and sums to 1. The variational free energy is defined as VFE $q$ $E$ $β$ := ∑ $q_i E_i$ + (1/β) ∑ $q_i$ log $q_i$, which equals expected energy minus temperature times entropy for the Boltzmann reference. The module proves that the Boltzmann distribution minimizes this quantity and equates the minimum to the Helmholtz free energy.
proof idea
The proof first shows the negative KL sum is ≤ 0 by applying Real.log_le_sub_one_of_pos to each term $p_i$ log($q_i$/$p_i$) ≤ $p_i$($q_i$/$p_i$ - 1), then summing and using normalization of both distributions to reach zero. It next invokes the identity log($q$/$p$) = -log($p$/$q$) to relate the two sums, and concludes nonnegativity by linarith.
why it matters
This result is invoked by boltzmann_minimizes_vfe, which establishes that the Boltzmann distribution achieves the VFE minimum and identifies that value with the Helmholtz free energy. It supplies the Gibbs inequality step in the derivation of variational free energy from the recognition composition law. The module records zero sorrys, confirming the inequality in the discrete finite setting.
Switch to Lean above to see the machine-checked source, dependencies, and usage graph.