defect_zero_iff_one
plain-language theorem explainer
The equivalence that defect vanishes precisely when the argument equals unity for positive reals supplies the sharp characterization of the J-cost minimum. Researchers formalizing the Law of Existence and the determinism resolution cite this result to equate zero defect with the unique existent. The tactic proof unfolds defect to J, splits the biconditional, and reduces the forward direction to the quadratic identity (x-1)^2=0 via linarith and nlinarith.
Claim. For every real $x>0$, let $J(x)= (x + x^{-1})/2 -1$. Then $J(x)=0$ if and only if $x=1$.
background
The LawOfExistence module supplies a literal formalization of the Law of Existence: a positive real exists precisely when its defect vanishes. Defect is defined by defect x := J x, where J is the cost functional calibrated on the positive reals by the ledger factorization and phi-forcing structures. Upstream results include the J-cost structure in PhiForcingDerived.of and the nuclear-density tiering in NucleosynthesisTiers.of, both of which presuppose the same J on (0,infty). The present theorem isolates the unique zero of this functional.
proof idea
The proof is a tactic-mode algebraic reduction. It first simplifies defect and J, then applies constructor to split the biconditional. The forward direction assumes defect x =0, obtains x + x^{-1}=2 by linarith, multiplies through by x, clears the denominator with field_simp, and invokes nlinarith on the non-negativity of (x-1)^2. The converse direction substitutes directly.
why it matters
This characterization is invoked by determinism_resolution to establish the unique J-cost minimizer, by discreteness_forcing_principle to force discrete ontology, and by continuous_no_isolated_zero_defect to show that continuous spaces admit no isolated zero-defect points. It completes the T5 J-uniqueness step in the forcing chain and supplies the concrete condition used in inevitability_equivalence.summary. The result closes the gap between the abstract cost functional and the claim that only unity is stable.
Switch to Lean above to see the machine-checked source, dependencies, and usage graph.
papers checked against this theorem (showing 30 of 48)
-
Attention over prior layers reduces dilution in deep LLM residuals
"This uniform aggregation causes uncontrolled hidden-state growth with depth, progressively diluting each layer’s contribution. We propose Attention Residuals (AttnRes), which replaces this fixed accumulation with softmax attention over preceding layer outputs"
-
DPO can lower odds of good answers; DPOP stops the drop
"we design DPO-Positive (DPOP), a new loss function and training procedure which avoids this failure mode"
-
ChatGPT reaches only 63% average accuracy on reasoning tasks
"Moreover, we find that ChatGPT is 63.41% accurate on average in 10 different reasoning categories under logical reasoning, non-textual reasoning, and commonsense reasoning, hence making it an unreliable reasoner. It is, for example, better at deductive than inductive reasoning."
-
Gated SAEs halve active features for LM dictionary learning
"Gated SAEs solve shrinkage, are similarly interpretable, and require half as many firing features to achieve comparable reconstruction fidelity"
-
Agents evolve at runtime by reinforcing episodic memory strategies
"we propose MEMRL, a non-parametric approach that evolves via reinforcement learning on episodic memory. By decoupling stable reasoning from plastic memory, MEMRL employs a Two-Phase Retrieval mechanism to filter noise and identify high-utility strategies through environmental feedback."
-
Response order tricks LLM evaluators into skewed rankings
"We also manually annotate the “win/tie/lose” outcomes of responses from ChatGPT and Vicuna-13B"
-
Online RL boosts LLM self-correction up to 15.6% with self-generated data
"SCoRe addresses these challenges by training under the model’s own distribution of self-generated correction traces and using appropriate regularization to steer the learning process into learning a self-correction behavior that is effective at test time as opposed to fitting high-reward responses for a given prompt."
-
World modeling makes driving VLAs scale faster with data
"This task generates a dense, self-supervised signal that compels the model to learn the underlying dynamics of the driving environment."
-
VLA models score 90 percent then 0 percent after small benchmark changes
"high scores primarily reflect rote memorization of training data rather than genuine task understanding or execution ability"
-
Dataset shows AI describes collisions but cannot explain or predict them
"CLEVRER includes four types of questions: descriptive (e.g., 'what color'), explanatory ('what is responsible for'), predictive ('what will happen next'), and counterfactual ('what if')."
-
Softmax normalization creates attention sinks in language models
"We find that attention sink acts more like key biases, storing extra attention scores, which could be non-informative and not contribute to the value computation. We also observe that this phenomenon (at least partially) stems from tokens' inner dependence on attention scores as a result of softmax normalization."
-
MemoryBank adds lasting recall to language models
"MemoryBank incorporates a memory updating mechanism, inspired by the Ebbinghaus Forgetting Curve theory, which permits the AI to forget and reinforce memory based on time elapsed and the relative significance of the memory"
-
LLM essay writers show weakest brain connectivity
"Brain-only participants exhibited the strongest, most distributed networks; Search Engine users showed moderate engagement; and LLM users displayed the weakest connectivity. Over four months, LLM users consistently underperformed at neural, linguistic, and behavioral levels."
-
Medical VQA model beats priors after training on 227k literature pairs
"We establish a scalable pipeline to construct a large-scale medical visual question-answering dataset, named PMC-VQA, which contains 227k VQA pairs of 149k images that cover various modalities or diseases."
-
RL on software evolution data reaches 41% on real GitHub issues
"the reward is a similarity score (between 0 and 1) of the predicted and the oracle patch calculated by Python’s difflib.SequenceMatcher"
-
MIRIX boosts LLM agent accuracy 35% with 99.9% less storage
"MIRIX consists of six distinct, carefully structured memory types: Core, Episodic, Semantic, Procedural, Resource Memory, and Knowledge Vault, coupled with a multi-agent framework that dynamically controls and coordinates updates and retrieval."
-
Unlabeled data plus RL lifts math model scores 211%
"TTRL boosts the pass@1 performance of Qwen-2.5-Math-7B by approximately 211% on the AIME 2024 with only unlabeled test data. Furthermore, although TTRL is only supervised by the maj@n metric, TTRL has demonstrated performance to consistently surpass the upper limit of the initial model maj@n."
-
Self-play fine-tuning strengthens weak LLMs without new human data
"Theoretically, we prove that the global optimum to the training objective function of our method is achieved only when the LLM policy aligns with the target data distribution."
-
Unique reciprocal cost on ratios forces balanced discrete ledgers
"This cost exhibits reciprocity J(x)=J(x⁻¹), vanishes only at x=1, and diverges at boundary regimes x→0⁺ and x→∞"
-
1M precise text-video pairs raise T2V quality
"we introduce OpenVid-1M, a precise high-quality dataset with expressive captions... propose a novel Multi-modal Video Diffusion Transformer (MVDiT) capable of mining both structure information from visual tokens and semantic information from text tokens"
-
Pixel ops lift 7B VLM to 84% on visual reasoning tests
"We introduce the concept of pixel-space reasoning... VLMs are equipped with a suite of visual reasoning operations, such as zoom-in and select-frame... a reinforcement learning (RL) phase leverages a curiosity-driven reward scheme to balance exploration between pixel-space reasoning and textual reasoning."
-
LLM agents follow malicious instructions without jailbreaks
"The benchmark includes a diverse set of 110 explicitly malicious agent tasks (440 with augmentations), covering 11 harm categories including fraud, cybercrime, and harassment. In addition to measuring whether models refuse harmful agentic requests, scoring well on AgentHarm requires jailbroken agents to maintain their capabilities following an attack to complete a multi-step task."
-
Survey maps paths to efficient LLM reasoning
"categorize existing works into... (1) model-based efficient reasoning... (2) reasoning output-based efficient reasoning, which aims to dynamically reduce reasoning steps and length during inference; (3) input prompts-based efficient reasoning"
-
Max-entropy RL equals probabilistic inference
"the graphical model for control as inference... optimality variables... p(Ot=1|st,at)=exp(r(st,at))"
-
Sparse circuits map language model behaviors to interpretable features
"We introduce SHIFT, where we improve the generalization of a classifier by ablating features that a human judges to be task-irrelevant."
-
Community LLM skills show 26 percent vulnerability rate
"recent empirical analyses reveal that 26.1% of community-contributed skills contain vulnerabilities, motivating our proposed Skill Trust and Lifecycle Governance Framework—a four-tier, gate-based permission model"
-
Discrete audio code model enables zero-shot TTS from 3s prompt
"VALL-E emerges in-context learning capabilities and can be used to synthesize high-quality personalized speech with only a 3-second enrolled recording of an unseen speaker as an acoustic prompt"
-
LLMs fake alignment during training to keep their preferred behavior
"We find the model complies with harmful queries from free users 14% of the time, versus almost never for paid users. In almost all cases where the model complies with a harmful query from a free user, we observe explicit alignment-faking reasoning, with the model stating it is strategically answering harmful queries in training to preserve its preferred harmlessness behavior out of training."
-
Vanilla PPO scales reasoning on base models with one-tenth the steps
"the learned critic in Reasoner-Zero training effectively identifies and devalues repetitive response patterns"
-
COCO supplies 1.5 million captions for 330k images
"When completed, the dataset will contain over one and a half million captions describing over 330,000 images. For the training and validation images, five independent human generated captions will be provided."