pith. machine review for the scientific record. sign in
def definition def or abbrev high

arithmeticCoding

show as:
view Lean formalization →

Arithmetic coding encodes an arbitrary message as a single real number in the unit interval whose binary digits form the compressed representation. Information theorists cite the construction when establishing that the average code length approaches the source entropy H for long messages under any distribution. The declaration is a direct string assignment that records the asymptotic optimality property.

claimArithmetic coding is the scheme that maps a message to a real number $r$ in $[0,1)$ such that the expected code length $L$ satisfies $L$ approaches the Shannon entropy $H$ of the source as message length tends to infinity.

background

The Information.Compression module derives fundamental limits on lossless compression from J-cost. The shifted cost satisfies $H(x) = J(x) + 1$, converting the Recognition Composition Law into the d'Alembert equation $H(xy) + H(x/y) = 2 H(x) H(y)$. Entropy of a configuration equals its total defect, which vanishes for the minimum-entropy initial state. Compression is identified with J-cost minimization, so the entropy bound becomes the minimum J-cost required for faithful representation.

proof idea

The declaration is a one-line definition that assigns the descriptive string to arithmeticCoding, followed by inline comments contrasting it with dictionary-based Lempel-Ziv methods.

why it matters in Recognition Science

The definition supplies the concrete mechanism realizing the entropy bound inside the INFO-003 J-cost framework. It directly supports the module's claim that compression equals J-cost minimization and connects to the source-coding theorem via the H functional and defect-based entropy. No downstream uses are recorded, leaving its role as a benchmark for later lossless and lossy compression statements.

scope and limits

formal statement (Lean)

 191def arithmeticCoding : String :=

proof body

Definition body.

 192  "Near-optimal, L → H for long messages"
 193
 194/-- Lempel-Ziv compression (LZ77, LZ78, LZW):
 195    - Dictionary-based
 196    - Adaptive (learns patterns)
 197    - Achieves entropy in limit
 198    - Used in ZIP, PNG, GIF -/

depends on (8)

Lean names referenced from this declaration's body.