Phase transition in large language models and the criticality of natural languages

Kai Nakaishi; Koji Hukushima; Yoshihiko Nishikawa

arXiv: 2406.05335 · v3 · pith:T57MNTLGnew · submitted 2024-06-08 · cond-mat.dis-nn · cs.LG

keywords: languages, natural, phase, llms, transition, texts, behavior, critical

Generation of text and speech in natural languages can be modeled as a stochastic process. This idea dates back to the seminal work of Markov and, later, to that of Shannon and also underlies the recent development of large language models (LLMs). The stochastic processes corresponding to natural languages should be distinct from those that generate nonlinguistic sequences. One of the features that discriminate linguistic and nonlinguistic sequences is power-law behavior, which is universally observed across different languages. In statistical physics, such behavior suggests that natural languages are critical: They lie near a phase transition point in a parametrized space of stochastic processes. However, testing this conjecture is not straightforward. A phase transition, even if it exists, cannot be directly observed in real-world natural languages because they do not have any controllable parameters. Here, we use LLMs as controllable effective models of natural languages. Through statistical analyses of texts generated by LLMs, we find that, when a parameter analogous to physical temperature is varied, LLMs undergo a phase transition. The transition separates a low-temperature phase with complex repetitive structures in generated texts from a high-temperature phase in which LLMs generate incomprehensible texts. At the critical point between these phases, generated texts display the power-law behavior similar to that of natural languages and most closely resemble natural languages as measured by a standard metric in natural language processing. These findings strongly suggest that natural languages are indeed critical.
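
The "parameter analogous to physical temperature" can be illustrated with a minimal sketch. The abstract does not say how temperature enters the model, so this assumes the common convention in which logits are divided by a temperature T before the softmax. The function names and the four logit values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return probabilities p_i proportional to exp(logit_i / T).

    Assumption: temperature rescales logits before the softmax,
    which is the usual convention but is not stated in the abstract.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


def entropy(probs):
    """Shannon entropy of a distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# Hypothetical logits for a four-token vocabulary (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]

for T in (0.1, 1.0, 10.0):
    probs = softmax_with_temperature(logits, T)
    print(T, [round(p, 3) for p in probs], round(entropy(probs), 3))
```

Under this assumption, a low T concentrates probability on the highest-scoring token, while a high T pushes the distribution toward uniform. This is a single-step picture only; the phase transition described in the abstract concerns statistics of whole generated texts, which this sketch does not reproduce.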

Forward citations

Cited by 9 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

A Type Theory of Sense: Witnessed Choice in Stratified Semantic Spaces
cs.LO 2026-06 unverdicted novelty 7.0
Introduces TTS, a dependent type theory replacing global canonical composition with regime-indexed indiscernibility and constructive apartness, proving conservativity, provenance, no-fork-from-empty, and persistence o...

Turbulence-like 5/3 spectral scaling in contextual representations of language as a complex system
cs.CL 2026-04 unverdicted novelty 7.0
Contextual language embeddings exhibit a robust 5/3 power-law spectrum in token-sequence fluctuations, analogous to Kolmogorov turbulence.

Generative Criticality in Large Language Model Temperature Scaling
cs.LG 2026-06 unverdicted novelty 6.0
A statistical-field treatment of LLM outputs shows a susceptibility peak, order-parameter shift, and intrinsic-dimension minimum near a characteristic temperature Tc in softmax scaling.

Escaping Mode Collapse in LLM Generation via Geometric Regulation
cs.CL 2026-05 unverdicted novelty 6.0
Reinforced Mode Regulation (RMR) uses low-rank damping on the value cache to prevent geometric collapse and mode collapse in autoregressive LLM generation, supporting stable output down to 0.8 nats/step entropy.

Escaping Mode Collapse in LLM Generation via Geometric Regulation
cs.CL 2026-05 unverdicted novelty 6.0
Reinforced Mode Regulation (RMR) applies low-rank damping to the Transformer value cache to prevent geometric collapse and enable stable autoregressive generation at entropy rates as low as 0.8 nats/step.

A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws
cs.LG 2026-04 unverdicted novelty 6.0
Emergent intelligence is recast as the existence of the limit of performance E(N,P,K) as N,P,K to infinity, with necessary and sufficient conditions derived via nonlinear Lipschitz operator theory and scaling laws obt...

World-Model Collapse as a Phase Transition
cs.AI 2026-06 unverdicted novelty 5.0
Long-horizon language agents show phase-transition-like world-model collapse under small parameter changes, with world-state fidelity failing before action validity, as mapped by grid search in deterministic tasks wit...

A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws
cs.LG 2026-04 unverdicted novelty 5.0
Emergent intelligence corresponds to the limit of a performance function E(N,P,K) as N, P, K go to infinity, originating from a parameter-limit architecture whose existence is governed by Lipschitz conditions, with sc...

A Limit Theory of Foundation Models: A Mathematical Approach to Understanding Emergent Intelligence and Scaling Laws
cs.LG 2026-04 unverdicted novelty 3.0
Formalizes emergent intelligence in foundation models as the limit of E(N,P,K) as N,P,K approach infinity, proves existence conditions via nonlinear Lipschitz operators, and derives scaling laws from covering numbers.