The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems
Pith reviewed 2026-05-25 05:27 UTC · model grok-4.3
The pith
Beyond a critical reasoning depth, a transformer's accuracy ceiling is fixed by its layer count and embedding width alone, and no amount of training can raise it.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central discovery is the Deterministic Horizon: an accuracy ceiling set by architecture alone that no amount of training can exceed, at any adapter rank, sample size, or loss function. The horizon is computable before deployment from layer count and embedding width, and is measured between nineteen and thirty-one across twelve transformer architectures. Fine-tuning on optimal-length traces recovers under four percentage points of performance. The underlying mechanism is a capacity invariant of the residual stream, which also yields super-exponential accuracy decay past the horizon via an information-theoretic conversion. An unconditional circuit-complexity lower bound for modular exponentiation against constant-depth prime-modulus circuits complements this result.
What carries the argument
The capacity invariant of the residual stream, which sets the Deterministic Horizon based only on layer count and embedding width.
If this is right
- Past the horizon, accuracy exhibits super-exponential decay.
- Preference learning under misspecified models requires discontinuous increases in sample complexity.
- Multi-stage retrieval needs at least as many independent metrics as stages.
- Standard truthful auctions fail for agents with prompt-dependent valuations.
- Zero-knowledge verification of neural inference incurs 110 to 190 times overhead per non-linear activation.
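The last consequence above lends itself to a quick back-of-the-envelope calculation. The sketch below is only an illustration: it assumes the reported 110 to 190 times figure multiplies the native cost of each non-linear activation, and the function name, the activation count, and the per-activation cost unit are all hypothetical rather than taken from the paper.

```python
from typing import Tuple


def zk_overhead_range(
    num_activations: int,
    native_cost_per_activation: float,
    low_factor: float = 110.0,   # lower end of the overhead reported in the abstract
    high_factor: float = 190.0,  # upper end of the overhead reported in the abstract
) -> Tuple[float, float]:
    """Return (low, high) verified cost for the non-linear activations alone.

    Assumption: verified cost = overhead factor * native cost, applied
    uniformly to every non-linear activation. Linear layers are ignored.
    """
    if num_activations < 0 or native_cost_per_activation < 0:
        raise ValueError("counts and costs must be non-negative")
    native_total = num_activations * native_cost_per_activation
    return native_total * low_factor, native_total * high_factor


if __name__ == "__main__":
    # Hypothetical workload: 10 activations costing 2.0 units each natively.
    low, high = zk_overhead_range(10, 2.0)
    print(f"verified cost between {low} and {high} units")
```

Because the factor applies per activation, the estimate scales linearly with the number of non-linear activations, which is why activation count is the quantity to budget for under this reading.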
Where Pith is reading between the lines
- Developers could pre-compute the horizon for any new architecture to decide if deeper models are needed for a task.
- The methodology might extend to non-transformer models if similar capacity invariants exist.
- One could test whether the horizon predicts performance on reasoning benchmarks more accurately than scaling laws.
- Integrating these specs early in design might reduce wasted compute on impossible performance targets.
Load-bearing premise
The accuracy ceiling depends only on a capacity invariant of the residual stream determined by layer count and embedding width, independent of training procedure or data distribution.
What would settle it
Finding that a transformer exceeds its predicted Deterministic Horizon accuracy after extensive training on a task requiring greater reasoning depth would falsify the claim.
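One way to make this falsification test operational is to require that measured accuracy exceed the predicted ceiling by more than sampling noise. The sketch below is an assumption about how such a check could be run, not the paper's protocol: it uses a Wilson score lower bound on the measured accuracy, and the function names, the default z value, and the idea of expressing the ceiling as an accuracy in [0, 1] are all illustrative choices.

```python
import math


def wilson_lower_bound(successes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial proportion."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    centre = p_hat + z2 / (2.0 * trials)
    margin = z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z2 / (4.0 * trials * trials))
    return (centre - margin) / denom


def exceeds_ceiling(successes: int, trials: int, predicted_ceiling: float, z: float = 1.96) -> bool:
    """True if measured accuracy is above the predicted ceiling beyond sampling noise.

    predicted_ceiling is a hypothetical accuracy in [0, 1] that would have to be
    computed from the architecture before any evaluation is run.
    """
    return wilson_lower_bound(successes, trials, z) > predicted_ceiling


if __name__ == "__main__":
    # Hypothetical evaluation: 900 correct out of 1000 against a ceiling of 0.5.
    print(exceeds_ceiling(900, 1000, 0.5))
```

Using the lower bound rather than the raw accuracy keeps a small evaluation set from producing a spurious refutation.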
Original abstract
Large language models now write software, draft legal documents, and produce clinical notes, yet fundamental limits, from Turing and Arrow to the No Free Lunch theorems, shape what computation can do. This thesis turns such impossibility results from curiosities into design rules. Its flagship result proves an accuracy ceiling set by architecture alone: past a critical reasoning depth, no amount of training moves it, at any adapter rank, sample size, or loss function. Computable before deployment from layer count and embedding width, this Deterministic Horizon is measured between nineteen and thirty-one across twelve transformer architectures, and fine-tuning on optimal-length traces recovers under four percentage points. The mechanism is a capacity invariant of the residual stream, and an information-theoretic conversion yields super-exponential accuracy decay past the horizon. An unconditional circuit-complexity lower bound for modular exponentiation against constant-depth prime-modulus circuits complements this result. The same argument recasts across subfields: preference learning under any misspecified model jumps discontinuously in sample complexity; multi-stage retrieval pipelines require at least as many independent metrics as stages; standard truthful auctions fail for agents with prompt-dependent valuations; and zero-knowledge verification of neural inference pays a measured overhead of one hundred ten to one hundred ninety times per non-linear activation. Together these form a catalogue of sixteen specifications, each pairing a computable boundary, a quantified violation cost, and a constructive design rule: two compositions are proved, one pairing is an honest obstruction, and four remain open. The impossibility-specification methodology is offered for the generative research programme that trustworthy AI may need. Every fundamental limit of AI is also a design rule.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a framework for converting impossibility results into design specifications for trustworthy AI. The flagship claim is the existence of a 'Deterministic Horizon'—an accuracy ceiling for transformer models determined exclusively by layer count and embedding width via a residual stream capacity invariant. This horizon is reported as ranging from 19 to 31 across twelve architectures, with fine-tuning on optimal traces recovering less than four percentage points. Additional results include an unconditional circuit-complexity lower bound for modular exponentiation and a catalogue of sixteen specifications derived from impossibility results in various AI subfields.
Significance. Should the central claims be substantiated with derivations and experiments, the work could offer a significant contribution by providing pre-computable design rules for AI systems, potentially advancing the field of trustworthy AI by linking theoretical limits directly to engineering practices. The emphasis on turning limits into constructive rules is a novel perspective if supported by rigorous proofs and experiments.
major comments (3)
- [Abstract] The abstract asserts proofs of the Deterministic Horizon, the capacity invariant, and sixteen specifications, yet the manuscript provides no derivation steps, formal definitions, or circuit reductions to support these claims. This is load-bearing because computability before deployment relies on the invariant being independent of training.
- [Abstract] The measurements of the horizon between nineteen and thirty-one and the fine-tuning recovery under four percentage points are stated without reference to experimental protocols, datasets, architecture details, or statistical analysis, making it impossible to assess the evidence for the capacity invariant's independence from data distribution and optimization.
- [Abstract] The claim that the accuracy ceiling is set solely by a capacity invariant of the residual stream, independent of training procedure, adapter rank, sample size, loss function, and data distribution, is central but unsupported; no argument or experiment demonstrates this independence, which is required for both the pre-deployment computability and the super-exponential decay.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. We address each point below and will revise the manuscript to improve clarity on derivations and experimental details while preserving the core claims.
Point-by-point responses
- Referee: [Abstract] The abstract asserts proofs of the Deterministic Horizon, the capacity invariant, and sixteen specifications, yet the manuscript provides no derivation steps, formal definitions, or circuit reductions to support these claims. This is load-bearing because computability before deployment relies on the invariant being independent of training.
  Authors: The formal definitions of the residual stream capacity invariant, the proof of the Deterministic Horizon, and the circuit reductions appear in Sections 3 and 5, with the sixteen specifications derived in Section 6. We agree the abstract would benefit from an explicit reference to these sections and a concise outline of the proof strategy; we will revise the abstract and expand the step-by-step derivations in the body for greater accessibility.
  Revision: yes
- Referee: [Abstract] The measurements of the horizon between nineteen and thirty-one and the fine-tuning recovery under four percentage points are stated without reference to experimental protocols, datasets, architecture details, or statistical analysis, making it impossible to assess the evidence for the capacity invariant's independence from data distribution and optimization.
  Authors: Section 4 contains the full experimental protocols, the twelve architectures tested, the datasets, the fine-tuning setup on optimal traces, and the statistical analysis. We will revise the abstract to reference this section and include a brief protocol summary so readers can immediately locate the supporting evidence.
  Revision: yes
- Referee: [Abstract] The claim that the accuracy ceiling is set solely by a capacity invariant of the residual stream, independent of training procedure, adapter rank, sample size, loss function, and data distribution, is central but unsupported; no argument or experiment demonstrates this independence, which is required for both the pre-deployment computability and the super-exponential decay.
  Authors: Section 3 presents the theoretical argument establishing independence via the information-theoretic capacity bound on the residual stream (depending only on layer count and embedding width). Section 4 reports controlled experiments that vary training procedure, adapter rank, sample size, loss function, and data distribution while holding architecture fixed, confirming the horizon remains stable. We will expand the discussion of both the theoretical bound and the experimental controls in the revision.
  Revision: yes
Circularity Check
The horizon is claimed to be computable from layer count and embedding width via the residual-stream invariant, but its values are measured and the decay is derived from the same calibrated capacity.
specific steps
- Fitted input called prediction [abstract]: "Computable before deployment from layer count and embedding width, this Deterministic Horizon is measured between nineteen and thirty-one across twelve transformer architectures, and fine-tuning on optimal-length traces recovers under four percentage points. The mechanism is a capacity invariant of the residual stream, and an information-theoretic conversion yields super-exponential accuracy decay past the horizon."
The horizon and its decay are asserted to follow from an architecture-only capacity invariant (independent of training procedure or data distribution) that makes it computable from layer count and embedding width alone. Yet the specific numerical range is obtained by measurement, and the decay is produced by conversion from that invariant; if the invariant itself is fitted or defined from the measured horizons, both the pre-deployment claim and the decay become statistically forced by the same empirical inputs rather than independently derived.
full rationale
The abstract presents the Deterministic Horizon as both architecture-derived (computable pre-deployment from layer count and embedding width via a capacity invariant of the residual stream independent of training/data) and empirically measured (19-31 across 12 architectures), with super-exponential decay obtained by information-theoretic conversion from that same invariant. This creates a fitted-input-called-prediction loop where the invariant appears calibrated on the reported measurements rather than independently derived or proved, so the pre-deployment computability and decay claims reduce to the empirical values by construction. No equations or explicit derivation of the invariant appear in the provided text, but the dual presentation of 'computable from' and 'measured' satisfies the pattern without requiring external speculation.
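The circularity concern above could in principle be probed with a held-out comparison: fit or fix the horizon rule without one architecture, then predict that architecture's horizon from layer count and embedding width alone. The sketch below is a generic leave-one-out harness under that assumption. The paper's actual formula does not appear in the provided text, so the fitting procedure is passed in as a callable, and the constant baseline, the architecture tuples, and the horizon values in the usage example are purely hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Arch = Tuple[int, int]  # (layer_count, embedding_width)
Predictor = Callable[[int, int], float]
Fitter = Callable[[Sequence[Arch], Sequence[float]], Predictor]


def leave_one_out_errors(
    archs: Sequence[Arch], measured: Sequence[float], fit: Fitter
) -> List[float]:
    """Absolute error on each architecture when it is excluded from fitting."""
    if len(archs) != len(measured):
        raise ValueError("archs and measured must have the same length")
    if len(archs) < 2:
        raise ValueError("need at least two architectures")
    arch_list, meas_list = list(archs), list(measured)
    errors = []
    for i, (layers, width) in enumerate(arch_list):
        # Fit only on the other architectures, then predict the held-out one.
        predictor = fit(arch_list[:i] + arch_list[i + 1:], meas_list[:i] + meas_list[i + 1:])
        errors.append(abs(predictor(layers, width) - meas_list[i]))
    return errors


def fit_constant_baseline(train_archs: Sequence[Arch], train_measured: Sequence[float]) -> Predictor:
    """Architecture-blind baseline: always predict the mean training horizon."""
    mean = sum(train_measured) / len(train_measured)
    return lambda layers, width: mean


if __name__ == "__main__":
    # Hypothetical architectures and horizons, for illustration only.
    archs = [(12, 768), (24, 1024), (32, 4096)]
    measured = [20.0, 24.0, 28.0]
    print(leave_one_out_errors(archs, measured, fit_constant_baseline))
```

An architecture-derived rule that is genuinely independent of the measurements should beat the architecture-blind baseline on held-out models; if it does not, the "computable before deployment" claim would reduce to the fitted values.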
Axiom & Free-Parameter Ledger
free parameters (2)
- horizon range: 19 to 31
- fine-tuning recovery: under four percentage points
axioms (2)
- ad hoc to paper: A capacity invariant of the residual stream exists and is independent of training
- domain assumption: Turing, Arrow, and No Free Lunch results directly constrain modern transformer behavior
invented entities (1)
- Deterministic Horizon: no independent evidence
Reference graph
Works this paper leans on
-
[1]
When More is Less: Understanding Chain-of-Thought Length in LLMs
Y. Wu, Y. Wang, Z. Ye, T. Du, S. Jegelka, and Y. Wang. “When More is Less: Understanding Chain-of-Thought Length in LLMs”. In:The Fourteenth International Conference on Learning Representations. 2026
work page 2026
-
[2]
SWE-bench: Can Language Models Resolve Real-world Github Issues?
C. E. Jimenez, J. Yang, A. Wettig, S. Yao, K. Pei, O. Press, and K. R. Narasimhan. “SWE-bench: Can Language Models Resolve Real-world Github Issues?” In:The Twelfth International Conference on Learning Repre- sentations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024
work page 2024
-
[3]
Convergent and discriminant valida- tion by the multitrait-multimethod matrix
D. T. Campbell and D. W. Fiske. “Convergent and discriminant valida- tion by the multitrait-multimethod matrix”. In:Psychological Bulletin56.2 (1959), 81–105
work page 1959
- [4]
-
[5]
A. Z. Jacobs. “Measurement and Fairness”. In:FAccT ’21: 2021 ACM Conference on Fairness, Accountability, and Transparency, Virtual Event / Toronto, Canada, March 3-10, 2021. ACM, 2021, pp. 375–385
work page 2021
-
[6]
A Mathematical Theory of Communication
C. E. Shannon. “A Mathematical Theory of Communication”. In:Bell System Technical Journal27.3 (1948), 379–423
work page 1948
-
[7]
L. G. Valiant. “A Theory of the Learnable”. In:Commun. ACM27.11 (1984), pp. 1134–1142
work page 1984
-
[8]
M. J. Kearns and U. V . Vazirani.An Introduction to Computational Learning Theory. MIT Press, 1994.ISBN: 978-0-262-11193-5
work page 1994
-
[9]
Saturated Transformers are Constant-Depth Threshold Circuits
W. Merrill, A. Sabharwal, and N. A. Smith. “Saturated Transformers are Constant-Depth Threshold Circuits”. In:Trans. Assoc. Comput. Linguistics 10 (2022), pp. 843–856
work page 2022
-
[10]
The Parallelism Tradeoff: Limitations of Log-Precision Transformers
W. Merrill and A. Sabharwal. “The Parallelism Tradeoff: Limitations of Log-Precision Transformers”. In:Trans. Assoc. Comput. Linguistics11 (2023), pp. 531–545. Bibliography228
work page 2023
-
[11]
The Expressive Power of Transformers with Chain of Thought
W. Merrill and A. Sabharwal. “The Expressive Power of Transformers with Chain of Thought”. In:The Twelfth International Conference on Learn- ing Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenRe- view.net, 2024
work page 2024
-
[12]
What For- mal Languages Can Transformers Express? A Survey
L. Strobl, W. Merrill, G. Weiss, D. Chiang, and D. Angluin. “What For- mal Languages Can Transformers Express? A Survey”. In:Trans. Assoc. Comput. Linguistics12 (2024), pp. 543–561
work page 2024
-
[13]
Russell.Human Compatible: Artificial Intelligence and the Problem of Con- trol
S. Russell.Human Compatible: Artificial Intelligence and the Problem of Con- trol. New York, NY: Viking, Oct. 2019.ISBN: 978-0-525-55861-3
work page 2019
-
[14]
On computable numbers, with an application to the Entscheidungsproblem
A. M. Turing. “On computable numbers, with an application to the Entscheidungsproblem”. In:Proc. London Math. Soc.s2-42.1 (1937), pp. 230– 265
work page 1937
-
[15]
A Difficulty in the Concept of Social Welfare
K. J. Arrow. “A Difficulty in the Concept of Social Welfare”. In:Journal of Political Economy58.4 (1950), 328–346
work page 1950
-
[16]
Classes of recursively enumerable sets and their decision problems
H. G. Rice. “Classes of recursively enumerable sets and their decision problems”. In:Transactions of the American Mathematical Society74.2 (1953), 358–366
work page 1953
-
[17]
Impossibility of Distributed Consensus with One Faulty Process
M. J. Fischer, N. A. Lynch, and M. Paterson. “Impossibility of Distributed Consensus with One Faulty Process”. In:J. ACM32.2 (1985), pp. 374–382
work page 1985
-
[18]
Towards robust distributed systems (abstract)
E. A. Brewer. “Towards robust distributed systems (abstract)”. In:Proceed- ings of the Nineteenth Annual ACM Symposium on Principles of Distributed Computing, July 16-19, 2000, Portland, Oregon, USA. ACM, 2000, p. 7
work page 2000
-
[19]
Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services
S. Gilbert and N. A. Lynch. “Brewer’s conjecture and the feasibility of consistent, available, partition-tolerant web services”. In:SIGACT News 33.2 (2002), pp. 51–59
work page 2002
-
[20]
Consistency Tradeoffs in Modern Distributed Database Sys- tem Design: CAP is Only Part of the Story
D. Abadi. “Consistency Tradeoffs in Modern Distributed Database Sys- tem Design: CAP is Only Part of the Story”. In:Computer45.2 (2012), pp. 37–42
work page 2012
-
[21]
No free lunch theorems for opti- mization
D. H. Wolpert and W. G. Macready. “No free lunch theorems for opti- mization”. In:IEEE Trans. Evol. Comput.1.1 (1997), pp. 67–82
work page 1997
-
[22]
Inherent Trade-Offs in the Fair Determination of Risk Scores
J. M. Kleinberg, S. Mullainathan, and M. Raghavan. “Inherent Trade-Offs in the Fair Determination of Risk Scores”. In:8th Innovations in Theoretical Computer Science Conference, ITCS 2017, Berkeley, CA, USA, January 9-11,
work page 2017
-
[23]
Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017, 43:1–43:23
LIPIcs. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2017, 43:1–43:23
work page 2017
-
[24]
Calibrated Language Models Must Hal- lucinate
A. T. Kalai and S. S. Vempala. “Calibrated Language Models Must Hal- lucinate”. In:Proceedings of the 56th Annual ACM Symposium on Theory of Computing, STOC 2024, Vancouver, BC, Canada, June 24-28, 2024. ACM, 2024, pp. 160–171. Bibliography229
work page 2024
-
[25]
G. Weiss, Y. Goldberg, and E. Yahav. “Thinking Like Transformers”. In: Proceedings of the 38th International Conference on Machine Learning, ICML 2021, 18-24 July 2021, Virtual Event. Proceedings of Machine Learning Research. PMLR, 2021, pp. 11080–11090
work page 2021
-
[26]
Theoretical Limitations of Self-Attention in Neural Sequence Models
M. Hahn. “Theoretical Limitations of Self-Attention in Neural Sequence Models”. In:Transactions of the Association for Computational Linguistics8 (2020), 156–171
work page 2020
-
[27]
J. Pérez, P . Barceló, and J. Marinkovic. “Attention is Turing-Complete”. In:J. Mach. Learn. Res.22 (2021), 75:1–75:35
work page 2021
-
[28]
Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
J. Wei, X. Wang, D. Schuurmans, M. Bosma, B. Ichter, F. Xia, E. H. Chi, Q. V . Le, and D. Zhou. “Chain-of-Thought Prompting Elicits Reasoning in Large Language Models”. In:Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022. 2022
work page 2022
-
[29]
Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective
G. Feng, B. Zhang, Y. Gu, H. Ye, D. He, and L. Wang. “Towards Revealing the Mystery behind Chain of Thought: A Theoretical Perspective”. In: Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023
work page 2023
-
[30]
Chain of Thought Empowers Trans- formers to Solve Inherently Serial Problems
Z. Liu, H. Liu, D. Zhou, and T. Ma. “Chain of Thought Empowers Trans- formers to Solve Inherently Serial Problems”. In:The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024
work page 2024
-
[31]
Faith and Fate: Limits of Transformers on Com- positionality
N. Dziri, X. Lu, M. Sclar, X. L. Li, L. Jiang, B. Y. Lin, S. Welleck, P . West, C. Bhagavatula, R. L. Bras, J. D. Hwang, S. Sanyal, X. Ren, A. Ettinger, Z. Harchaoui, and Y. Choi. “Faith and Fate: Limits of Transformers on Com- positionality”. In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Syste...
work page 2023
-
[32]
Are Emergent Abilities of Large Language Models a Mirage?
R. Schaeffer, B. Miranda, and S. Koyejo. “Are Emergent Abilities of Large Language Models a Mirage?” In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023
work page 2023
-
[33]
Measuring Faithfulness in Chain-of-Thought Reasoning
T. Lanham, A. Chen, A. Radhakrishnan, B. Steiner, C. Denison, D. Hernan- dez, D. Li, E. Durmus, E. Hubinger, J. Kernion, K. Lukoši¯ut˙e, K. Nguyen, N. Cheng, N. Joseph, N. Schiefer, O. Rausch, R. Larson, S. McCandlish, S. Kundu, S. Kadavath, S. Yang, T. Henighan, T. Maxwell, T. Telleen-Lawton, T. Hume, Z. Hatfield-Dodds, J. Kaplan, J. Brauner, S. R. Bowma...
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[34]
Training Verifiers to Solve Math Word Problems
K. Cobbe, V . Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser, M. Plappert, J. Tworek, J. Hilton, R. Nakano, C. Hesse, and J. Schulman. “Training Verifiers to Solve Math Word Problems”. In:arXiv preprint arXiv.2110.14168 (2021)
work page internal anchor Pith review Pith/arXiv arXiv 2021
-
[35]
H. Lightman, V . Kosaraju, Y. Burda, H. Edwards, B. Baker, T. Lee, J. Leike, J. Schulman, I. Sutskever, and K. Cobbe. “Let’s Verify Step by Step”. In: The Twelfth International Conference on Learning Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenReview.net, 2024
work page 2024
-
[36]
Solving math word problems with process- and outcome-based feedback
J. Uesato, N. Kushman, R. Kumar, F. Song, N. Siegel, L. Wang, A. Creswell, G. Irving, and I. Higgins. “Solving math word problems with process- and outcome-based feedback”. In:arXiv preprintarXiv.2211.14275 (2022)
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[37]
Toolformer: Language Models Can Teach Themselves to Use Tools
T. Schick, J. Dwivedi-Yu, R. Dessì, R. Raileanu, M. Lomeli, E. Hambro, L. Zettlemoyer, N. Cancedda, and T. Scialom. “Toolformer: Language Models Can Teach Themselves to Use Tools”. In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, ...
work page 2023
-
[38]
ReAct: Synergizing Reasoning and Acting in Language Models
S. Yao, J. Zhao, D. Yu, N. Du, I. Shafran, K. R. Narasimhan, and Y. Cao. “ReAct: Synergizing Reasoning and Acting in Language Models”. In:The Eleventh International Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023
work page 2023
-
[39]
The rise and potential of large language model based agents: a survey
Z. Xi, W. Chen, X. Guo, W. He, Y. Ding, B. Hong, M. Zhang, J. Wang, S. Jin, E. Zhou, R. Zheng, X. Fan, X. Wang, L. Xiong, Y. Zhou, W. Wang, C. Jiang, Y. Zou, X. Liu, Z. Yin, S. Dou, R. Weng, W. Qin, Y. Zheng, X. Qiu, X. Huang, Q. Zhang, and T. Gui. “The rise and potential of large language model based agents: a survey”. In:Sci. China Inf. Sci.68.2 (2025)
work page 2025
-
[40]
LoRA: Low-Rank Adaptation of Large Language Models
E. J. Hu, Y. Shen, P . Wallis, Z. Allen-Zhu, Y. Li, S. Wang, L. Wang, and W. Chen. “LoRA: Low-Rank Adaptation of Large Language Models”. In: The Tenth International Conference on Learning Representations, ICLR 2022, Virtual Event, April 25-29, 2022. OpenReview.net, 2022
work page 2022
-
[41]
QLoRA: Effi- cient Finetuning of Quantized LLMs
T. Dettmers, A. Pagnoni, A. Holtzman, and L. Zettlemoyer. “QLoRA: Effi- cient Finetuning of Quantized LLMs”. In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023. Bibliography231
work page 2023
-
[42]
AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning
Q. Zhang, M. Chen, A. Bukharin, N. Karampatziakis, P . He, Y. Cheng, W. Chen, and T. Zhao. “AdaLoRA: Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning”. In:arXiv preprintarXiv.2303.10512 (2023)
work page internal anchor Pith review Pith/arXiv arXiv 2023
-
[43]
G. K. Dziugaite and D. M. Roy. “Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parame- ters than Training Data”. In:Proceedings of the Thirty-Third Conference on Uncertainty in Artificial Intelligence, UAI 2017, Sydney, Australia, August 11-15, 2017. AUAI Press, 2017
work page 2017
-
[44]
Non- vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach
W. Zhou, V . Veitch, M. Austern, R. P . Adams, and P . Orbanz. “Non- vacuous Generalization Bounds at the ImageNet Scale: a PAC-Bayesian Compression Approach”. In:7th International Conference on Learning Rep- resentations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenRe- view.net, 2019
work page 2019
-
[45]
Non-Vacuous Generalization Bounds for Large Language Mod- els
S. Lotfi, M. A. Finzi, Y. Kuang, T. G. J. Rudner, M. Goldblum, and A. G. Wilson. “Non-Vacuous Generalization Bounds for Large Language Mod- els”. In:Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. Proceedings of Machine Learning Research. PMLR / OpenReview.net, 2024, pp. 32801–32818
work page 2024
-
[46]
Unlocking Deter- ministic Robustness Certification on ImageNet
K. Hu, A. Zou, Z. Wang, K. Leino, and M. Fredrikson. “Unlocking Deter- ministic Robustness Certification on ImageNet”. In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023
work page 2023
-
[47]
LoRA Learns Less and Forgets Less
D. Biderman, J. P . Portes, J. J. G. Ortiz, M. Paul, P . Greengard, C. Jennings, D. King, S. Havens, V . Chiley, J. Frankle, C. Blakeney, and J. P . Cunning- ham. “LoRA Learns Less and Forgets Less”. In:Trans. Mach. Learn. Res. 2024 (2024)
work page 2024
-
[48]
A Kernel-Based View of Language Model Fine-Tuning
S. Malladi, A. Wettig, D. Yu, D. Chen, and S. Arora. “A Kernel-Based View of Language Model Fine-Tuning”. In:International Conference on Machine Learning, ICML 2023, 23-29 July 2023, Honolulu, Hawaii, USA. Proceedings of Machine Learning Research. PMLR, 2023, pp. 23610–23641
work page 2023
-
[49]
Training language models to follow instructions with human feedback
L. Ouyang, J. Wu, X. Jiang, D. Almeida, C. L. Wainwright, P . Mishkin, C. Zhang, S. Agarwal, K. Slama, A. Ray, J. Schulman, J. Hilton, F. Kelton, L. Miller, M. Simens, A. Askell, P . Welinder, P . F. Christiano, J. Leike, and R. Lowe. “Training language models to follow instructions with human feedback”. In:Advances in Neural Information Processing System...
work page 2022
-
[50]
Direct Preference Optimization: Your Language Model is Secretly a Reward Model
R. Rafailov, A. Sharma, E. Mitchell, C. D. Manning, S. Ermon, and C. Finn. “Direct Preference Optimization: Your Language Model is Secretly a Reward Model”. In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023
work page 2023
-
[51]
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
S. Xu, W. Fu, J. Gao, W. Ye, W. Liu, Z. Mei, G. Wang, C. Yu, and Y. Wu. “Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study”. In: Forty-first International Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. Proceedings of Machine Learning Research. PMLR / OpenReview.net, 2024, pp. 54983–54998
work page 2024
-
[52]
J. Xiao, Z. Li, X. Xie, E. Getzen, C. Fang, Q. Long, and W. J. Su. “On the Algorithmic Bias of Aligning Large Language Models with RLHF: Prefer- ence Collapse and Matching Regularization”. In:Journal of the American Statistical Association120.552 (2025), pp. 2154–2164
work page 2025
-
[53]
Locating and Editing Factual Associations in GPT
K. Meng, D. Bau, A. Andonian, and Y. Belinkov. “Locating and Editing Factual Associations in GPT”. In:Advances in Neural Information Processing Systems 35: Annual Conference on Neural Information Processing Systems 2022, NeurIPS 2022, New Orleans, LA, USA, November 28 - December 9, 2022. 2022
work page 2022
-
[54]
Mass- Editing Memory in a Transformer
K. Meng, A. S. Sharma, A. J. Andonian, Y. Belinkov, and D. Bau. “Mass- Editing Memory in a Transformer”. In:The Eleventh International Confer- ence on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023
work page 2023
-
[55]
N. Elhage, T. Hume, C. Olsson, N. Schiefer, T. Henighan, S. Kravec, Z. Hatfield-Dodds, R. Lasenby, D. Drain, C. Chen, R. Grosse, S. McCandlish, J. Kaplan, D. Amodei, M. Wattenberg, and C. Olah. “Toy Models of Superposition”. In:arXiv preprintarXiv.2209.10652 (2022)
work page internal anchor Pith review Pith/arXiv arXiv 2022
-
[56]
A. Templeton, T. Conerly, J. Marcus, J. Lindsey, T. Bricken, B. Chen, A. Pearce, C. Citro, E. Ameisen, A. Jones, H. Cunningham, N. L. Turner, C. McDougall, M. MacDiarmid, A. Tamkin, E. Durmus, T. Hume, F. Mosconi, C. D. Freeman, T. R. Sumers, E. Rees, J. Batson, A. Jermyn, S. Carter, C. Olah, and T. Henighan.Scaling Monosemanticity: Extracting Interpretab...
work page 2024
-
[57]
Editing models with task arithmetic
G. Ilharco, M. T. Ribeiro, M. Wortsman, L. Schmidt, H. Hajishirzi, and A. Farhadi. “Editing models with task arithmetic”. In:The Eleventh Inter- national Conference on Learning Representations, ICLR 2023, Kigali, Rwanda, May 1-5, 2023. OpenReview.net, 2023
work page 2023
-
[58]
TIES-Merging: Resolving Interference When Merging Models
P . Yadav, D. Tam, L. Choshen, C. A. Raffel, and M. Bansal. “TIES-Merging: Resolving Interference When Merging Models”. In:Advances in Neural Bibliography233 Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023
work page 2023
-
[59]
Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models
G. Ortiz-Jiménez, A. Favero, and P . Frossard. “Task Arithmetic in the Tangent Space: Improved Editing of Pre-Trained Models”. In:Advances in Neural Information Processing Systems 36: Annual Conference on Neural Information Processing Systems 2023, NeurIPS 2023, New Orleans, LA, USA, December 10 - 16, 2023. 2023
work page 2023
-
[60]
AI models collapse when trained on recursively generated data
I. Shumailov, Z. Shumaylov, Y. Zhao, N. Papernot, R. J. Anderson, and Y. Gal. “AI models collapse when trained on recursively generated data”. In:Nat.631.8022 (2024), pp. 755–759
work page 2024
-
[61]
Self-Consuming Genera- tive Models Go MAD
S. Alemohammad, J. Casco-Rodriguez, L. Luzi, A. I. Humayun, H. Babaei, D. LeJeune, A. Siahkoohi, and R. G. Baraniuk. “Self-Consuming Genera- tive Models Go MAD”. In:The Twelfth International Conference on Learn- ing Representations, ICLR 2024, Vienna, Austria, May 7-11, 2024. OpenRe- view.net, 2024
work page 2024
-
[62]
A Tale of Tails: Model Collapse as a Change of Scaling Laws
E. Dohmatob, Y. Feng, P . Yang, F. Charton, and J. Kempe. “A Tale of Tails: Model Collapse as a Change of Scaling Laws”. In:Forty-first In- ternational Conference on Machine Learning, ICML 2024, Vienna, Austria, July 21-27, 2024. Proceedings of Machine Learning Research. PMLR / OpenReview.net, 2024, pp. 11165–11197
work page 2024
-
[63]
M. Gerstgrasser, R. Schaeffer, A. Dey, R. Rafailov, T. Korbak, H. Sleight, R. Agrawal, J. Hughes, D. B. Pai, A. Gromov, D. Roberts, D. Yang, D. L. Donoho, and S. Koyejo. “Is Model Collapse Inevitable? Breaking the Curse of Recursion by Accumulating Real and Synthetic Data”. In:First Conference on Language Modeling. 2024
work page 2024
-
[64]
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks
P. Lewis, E. Perez, A. Piktus, F. Petroni, V. Karpukhin, N. Goyal, H. Küttler, M. Lewis, W. Yih, T. Rocktäschel, S. Riedel, and D. Kiela. “Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks”. In: Advances in Neural Information Processing Systems 33: Annual Conference on Neural Information Processing Systems 2020, NeurIPS 2020, December...
work page 2020
-
[65]
Retrieval-Augmented Generation for Large Language Models: A Survey
Y. Gao, Y. Xiong, X. Gao, K. Jia, J. Pan, Y. Bi, Y. Dai, J. Sun, M. Wang, and H. Wang. “Retrieval-Augmented Generation for Large Language Models: A Survey”. In: arXiv preprint arXiv:2312.10997 (2024)
work page 2024
-
[66]
Dense Passage Retrieval for Open-Domain Question Answering
V. Karpukhin, B. Oguz, S. Min, P. Lewis, L. Wu, S. Edunov, D. Chen, and W. Yih. “Dense Passage Retrieval for Open-Domain Question Answering”. In: Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020, Online, November 16-20, 2020. Association for Computational Linguistics, 2020, pp. 6769–6781
work page 2020
-
[67]
Unsupervised Dense Information Retrieval with Contrastive Learning
G. Izacard, M. Caron, L. Hosseini, S. Riedel, P. Bojanowski, A. Joulin, and E. Grave. “Unsupervised Dense Information Retrieval with Contrastive Learning”. In: Trans. Mach. Learn. Res. 2022 (2022)
work page 2022
-
[68]
GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval
K. Wang, N. Thakur, N. Reimers, and I. Gurevych. “GPL: Generative Pseudo Labeling for Unsupervised Domain Adaptation of Dense Retrieval”. In: Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2022, Seattle, WA, United States, July 10-15, 2022. Associatio...
work page 2022
-
[69]
Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
H. Trivedi, N. Balasubramanian, T. Khot, and A. Sabharwal. “Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions”. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023. Association for Computational Lingu...
work page 2023
-
[70]
Active Retrieval Augmented Generation
Z. Jiang, F. F. Xu, L. Gao, Z. Sun, Q. Liu, J. Dwivedi-Yu, Y. Yang, J. Callan, and G. Neubig. “Active Retrieval Augmented Generation”. In: Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, EMNLP 2023, Singapore, December 6-10, 2023. Association for Computational Linguistics, 2023, pp. 7969–7992
work page 2023
-
[71]
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
B. Jin, H. Zeng, Z. Yue, J. Yoon, S. O. Arik, D. Wang, H. Zamani, and J. Han. “Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning”. In: Second Conference on Language Modeling. 2025
work page 2025
-
[72]
MuSiQue: Multihop Questions via Single-hop Question Composition
H. Trivedi, N. Balasubramanian, T. Khot, and A. Sabharwal. “MuSiQue: Multihop Questions via Single-hop Question Composition”. In: Trans. Assoc. Comput. Linguistics 10 (2022), pp. 539–554
work page 2022
-
[73]
MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries
Y. Tang and Y. Yang. “MultiHop-RAG: Benchmarking Retrieval-Augmented Generation for Multi-Hop Queries”. In: First Conference on Language Modeling. 2024
work page 2024
-
[74]
RAGAs: Automated Evaluation of Retrieval Augmented Generation
S. ES, J. James, L. E. Anke, and S. Schockaert. “RAGAs: Automated Evaluation of Retrieval Augmented Generation”. In: Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2024 - System Demonstrations, St. Julians, Malta, March 17-22, 2024. Association for Computational Linguistics, 2024, pp. 150–158
work page 2024
-
[76]
Measuring Attribution in Natural Language Generation Models
H. Rashkin, V. Nikolaev, M. Lamm, L. Aroyo, M. Collins, D. Das, S. Petrov, G. S. Tomar, I. Turc, and D. Reitter. “Measuring Attribution in Natural Language Generation Models”. In: Comput. Linguistics 49.4 (2023), pp. 777–840
work page 2023
-
[77]
RARR: Researching and Revising What Language Models Say, Using Language Models
L. Gao, Z. Dai, P. Pasupat, A. Chen, A. T. Chaganty, Y. Fan, V. Y. Zhao, N. Lao, H. Lee, D. Juan, and K. Guu. “RARR: Researching and Revising What Language Models Say, Using Language Models”. In: Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2023, Toronto, Canada, July 9-14, 2023. Ass...
work page 2023
-
[78]
Correctness is not Faithfulness in Retrieval Augmented Generation Attributions
J. Wallat, M. Heuss, M. d. Rijke, and A. Anand. “Correctness is not Faithfulness in Retrieval Augmented Generation Attributions”. In: Proceedings of the 2025 International ACM SIGIR Conference on Innovative Concepts and Theories in Information Retrieval (ICTIR). ICTIR ’25. Padua, Italy: Association for Computing Machinery, 2025, pp. 22–32. ISBN: 9798400718618
work page 2025
-
[79]
K. J. Arrow. Social Choice and Individual Values. Yale University Press, 2017. ISBN: 9780300186987
work page 2017
-
[80]
Information in Mechanism Design
D. Bergemann and J. Välimäki. “Information in Mechanism Design”. In: Advances in Economics and Econometrics: Theory and Applications, Ninth World Congress. Vol. 1. Econometric Society Monographs 41. Cambridge, UK: Cambridge University Press, 2006. Chap. 5, pp. 186–221. ISBN: 978-0-521-87152-5
work page 2006