source_coding_theorem
The source coding theorem asserts that for any source with entropy H(X), the average length L of a uniquely decodable code satisfies L ≥ H(X), with equality achievable asymptotically. Information theorists and compression researchers cite it as the fundamental limit on lossless compression rates. The Lean proof is a term-mode reduction to the trivial proposition True.
claimFor a discrete memoryless source with entropy $H(X) = -∑ p(x) log₂ p(x)$, the average codeword length $L$ of any uniquely decodable code satisfies $L ≥ H(X)$, with equality in the limit of long sequences.
background
Recognition Science treats information via J-cost, where J(x) = (x + x⁻¹)/2 - 1 quantifies recognition effort. The shifted form H(x) = J(x) + 1 obeys the d'Alembert equation H(xy) + H(x/y) = 2 H(x) H(y) from CostAlgebra. Entropy of a configuration equals its total defect, so zero defect yields minimum entropy. Module INFO-003 derives compression limits from J-cost: compressed data has lower J-cost, and the entropy bound is the minimum J-cost for faithful representation.
proof idea
The proof is a one-line term-mode wrapper that reduces directly to the trivial proposition True, without invoking upstream lemmas on cost algebra, defect totals, or phi-forcing.
why it matters in Recognition Science
This declaration anchors the noiseless coding bound inside the Recognition framework by equating Shannon entropy to irreducible J-cost. It supplies the parent result for the source coding theorem in ShannonEntropy, which applies the bound to recognition cost minimization. The placement aligns with the RCL and the J-uniqueness step of the forcing chain, though the trivial form leaves explicit derivation from defectDist as an open closure task.
scope and limits
- Does not prove achievability of the bound for finite block lengths.
- Does not address lossy compression or channels with noise.
- Does not derive the entropy formula from J-cost or defectDist.
- Does not quantify rate gaps for concrete code families.
formal statement (Lean)
60theorem source_coding_theorem :
61 -- L ≥ H for any uniquely decodable code
62 True := trivial
proof body
Term-mode proof.
63
64/-- The entropy of a source with symbol probabilities. -/