Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics
Pith reviewed 2026-05-08 16:45 UTC · model grok-4.3
The pith
Symmetric spectral diagnostics on attention operators cannot detect the direction of information flow.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Every transpose-invariant spectral diagnostic applied to the degree-normalized attention operator fails to distinguish the operator from its transpose and therefore cannot detect information-flow direction. The asymmetry coefficient G is shown to be the unique control parameter for this direction. For canonical causal architectures a closed-form bipartite-Cheeger landscape gives uniform attention an n-independent capacity floor of φ ≥ 1/5 with worst cut at t*/n ≈ 0.32, while window attention reaches only O(w/n). The two-axis diagnostic (φ for capacity, G for direction) predicts and observes reversed polarity between bottleneck-dominated and diffuse-dominated hallucination benchmarks, with LC-AUROC between 0.62 and 0.84 under length-controlled evaluation on models up to 8B parameters.
What carries the argument
The asymmetry coefficient G of the degree-normalized attention operator, which serves as the unique parameter governing directional bias once transpose invariance is imposed on spectral diagnostics.
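The excerpt does not spell out how G is computed, but a natural reading is the relative size of the antisymmetric part of the operator under the symmetric/antisymmetric split P = (P + Pᵀ)/2 + (P − Pᵀ)/2. The sketch below uses the Frobenius-norm ratio as an assumed definition; the paper's exact normalization may differ.

```python
import numpy as np

def asymmetry_coefficient(P):
    """Assumed form of G: ||(P - P^T)/2||_F / ||P||_F; zero iff P is symmetric."""
    return np.linalg.norm(0.5 * (P - P.T)) / np.linalg.norm(P)

# Uniform causal attention: row i attends uniformly to positions 0..i
n = 8
A = np.tril(np.ones((n, n)))
A /= A.sum(axis=1, keepdims=True)

G_causal = asymmetry_coefficient(A)             # > 0: direction is visible
G_sym = asymmetry_coefficient(0.5 * (A + A.T))  # 0: symmetrization erases it
print(G_causal, G_sym)
```

The second value illustrates the paper's point in miniature: any pipeline that symmetrizes the operator before analysis sets G to zero by construction and discards directional information.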
If this is right
- Symmetric-only spectral diagnostics cannot identify directional causes of attention failure.
- The φ-G plane separates bottleneck-dominated from diffuse-dominated hallucination regimes.
- Uniform causal attention carries an n-independent capacity lower bound of 1/5.
- Window attention capacity falls as O(w/n) and can undercut the uniform floor.
- Transport features retain measurable predictive signal for hallucinations up to 8B parameters.
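The capacity claims in the bullets above can be probed numerically. The paper's bipartite-Cheeger functional is not reproduced in this excerpt, so the sketch below substitutes ordinary graph conductance over prefix cuts of the symmetrized operator; under that assumed proxy, uniform causal attention shows a size-independent floor above 1/5 with the worst cut near t/n ≈ 0.32, while window attention's minimum collapses like O(w/n).

```python
import numpy as np

def prefix_conductance(A):
    """phi(t) over prefix cuts {0..t-1} of the symmetrized operator.
    Standard conductance is used as a stand-in for the paper's
    bipartite-Cheeger functional (assumption)."""
    S = 0.5 * (A + A.T)
    n = len(S)
    row = S.sum(axis=1)
    total = row.sum()
    phis = []
    for t in range(1, n):
        cut = S[:t, t:].sum()            # weight crossing the cut
        vol = row[:t].sum()              # volume of the prefix
        phis.append(cut / min(vol, total - vol))
    return np.array(phis)

def uniform_causal(n):
    A = np.tril(np.ones((n, n)))
    return A / A.sum(axis=1, keepdims=True)

def window_causal(n, w):
    # each position attends uniformly to its last w positions
    A = np.tril(np.ones((n, n))) - np.tril(np.ones((n, n)), -w)
    return A / A.sum(axis=1, keepdims=True)

n = 400
phi_u = prefix_conductance(uniform_causal(n))
phi_w = prefix_conductance(window_causal(n, 8))
t_star = np.argmin(phi_u) + 1
print(phi_u.min(), t_star / n, phi_w.min())
```

Even under this simplified functional the failure modes are shape-different, as the paper argues: the uniform landscape bottoms out at an interior cut with O(1) capacity, while the window landscape is uniformly shallow across the middle of the sequence.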
Where Pith is reading between the lines
- The orientation-blindness result may apply to symmetric analyses of other attention-like operators in graph networks or diffusion models.
- Adding explicit directional measures could improve interpretability of internal routing in transformer stacks.
- The derived bounds on φ could inform the design of attention masks that avoid low-capacity regimes.
- The polarity prediction offers a direct test of whether transport properties dominate hallucination behavior across additional datasets.
Load-bearing premise
The dominant hallucination mechanisms in the tested benchmarks are fully captured by the transport capacity and directionality of the degree-normalized attention operator without substantial confounding from feed-forward layers, layer norms, or output decoding.
What would settle it
A concrete counterexample in which some transpose-invariant spectral diagnostic distinguishes an attention operator from its transpose, or a length-controlled evaluation on the same models in which the predicted polarity reversal between HaluEval and MedHallu fails to appear.
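The first settling condition has a mechanical shape: a transpose-invariant diagnostic, by definition, computes the same quantity for A and Aᵀ. The toy check below (one natural diagnostic, the spectrum of the symmetrized operator, chosen here for illustration) shows what a counterexample would have to break.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((6, 6))
A /= A.sum(axis=1, keepdims=True)  # a generic row-stochastic "attention" matrix

def sym_spectrum(P):
    """A transpose-invariant diagnostic: the spectrum of (P + P^T)/2."""
    return np.sort(np.linalg.eigvalsh(0.5 * (P + P.T)))

# Identical for A and A^T by construction, so this diagnostic cannot
# encode which way information flows.
same = np.allclose(sym_spectrum(A), sym_spectrum(A.T))
print(same)
```

Singular values behave the same way (σ(A) = σ(Aᵀ)), which is why the counterexample would need a diagnostic that is claimed to be transpose-invariant yet provably differs on some operator and its transpose.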
Figures
Original abstract
Large language models hallucinate in predictable ways: attention routing fails by over-concentrating on a narrow set of positions, or by spreading so diffusely that relevance is diluted, and the shape of the failure carries diagnostic signal. A widely used family of spectral methods analyzes the symmetric component of the degree-normalized attention operator, which governs transport capacity; we prove that every transpose-invariant spectral diagnostic of this operator is structurally orientation-blind (it cannot distinguish an operator from its transpose, and therefore cannot detect information-flow direction), with a quantitative converse establishing the asymmetry coefficient $G$ as the unique control parameter for direction. Pairing this with a closed-form bipartite-Cheeger landscape for canonical causal architectures, we show that uniform causal attention satisfies an $n$-independent floor $\phi \ge 1/5$ with worst cut at $t^\ast/n \approx 0.32$, while window attention pierces the floor as $O(w/n)$; failure modes are shape-different, not just value-different. The resulting two-axis diagnostic ($\phi$ for capacity, $G$ for direction) yields a falsifiable polarity prediction: bottleneck- and diffuse-dominated benchmarks should exhibit opposite polarity. Under length-controlled evaluation, transport features retain interpretable signal (LC-AUROC from 0.62 to 0.84) on tested models up to 8B parameters, with polarity reversing as predicted between HaluEval and MedHallu.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proves that every transpose-invariant spectral diagnostic of the degree-normalized attention operator is orientation-blind (cannot distinguish an operator from its transpose) and provides a quantitative converse showing the asymmetry coefficient G as the unique control parameter for information-flow direction. It derives closed-form bipartite-Cheeger landscapes for canonical causal architectures (uniform causal attention satisfies an n-independent floor φ ≥ 1/5 with worst cut at t*/n ≈ 0.32; window attention scales as O(w/n)), yielding a two-axis diagnostic (φ for transport capacity, G for direction) with a falsifiable polarity prediction tested on hallucination benchmarks (LC-AUROC 0.62–0.84 on models up to 8B, with polarity reversal between HaluEval and MedHallu).
Significance. If the theoretical results hold, the work offers a rigorous account of why symmetric spectral methods fail to capture directionality in attention and supplies closed-form, parameter-free expressions together with a falsifiable prediction on external benchmarks. These elements constitute a clear strength for interpretability research in transformers.
major comments (2)
- [Empirical results (LC-AUROC reporting)] The empirical validation of the two-axis diagnostic (LC-AUROC range 0.62–0.84) is load-bearing for the falsifiable polarity prediction yet reports no baselines, confidence intervals, or exclusion criteria for the length-controlled evaluation; this leaves the claim that transport features retain interpretable signal only partially supported.
- [Discussion of diagnostic applicability] The central application of the diagnostic presupposes that the degree-normalized attention operator’s capacity ϕ and direction G dominate observed hallucination modes, but the manuscript provides no ablation or isolation argument addressing potential confounding from feed-forward sublayers, layer-norm scaling, or output decoding.
minor comments (1)
- [Abstract and introduction] Notation for the asymmetry coefficient G and the transport capacity ϕ should be introduced with explicit definitions before their use in the abstract and main claims.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed review. We address each major comment below, agreeing on the need for stronger empirical reporting and a clearer discussion of applicability. Revisions will be incorporated as indicated.
Point-by-point responses
-
Referee: The empirical validation of the two-axis diagnostic (LC-AUROC range 0.62–0.84) is load-bearing for the falsifiable polarity prediction yet reports no baselines, confidence intervals, or exclusion criteria for the length-controlled evaluation; this leaves the claim that transport features retain interpretable signal only partially supported.
Authors: We agree that the current empirical reporting is incomplete and weakens support for the claims. In the revised manuscript we will add: (i) explicit baselines including a random classifier (expected AUROC 0.5) and symmetric spectral baselines such as the algebraic connectivity of the symmetrized operator; (ii) 95% bootstrap confidence intervals over the benchmark instances; (iii) precise exclusion criteria for length-controlled evaluation (sequences outside [50, 2048] tokens discarded, with per-benchmark sample counts reported). These additions will be placed in a new subsection of the experiments and will not alter the reported LC-AUROC ranges or polarity reversal. We view this as a necessary and straightforward strengthening of the falsifiable prediction. revision: yes
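The promised 95% bootstrap confidence intervals over benchmark instances can be sketched with a percentile bootstrap. The data below are synthetic stand-ins (not the paper's benchmarks), and the AUROC is computed via the rank-sum identity; the authors' exact resampling protocol is not specified in the rebuttal.

```python
import numpy as np

def auroc(labels, scores):
    """AUROC via the rank-sum (Mann-Whitney) statistic; assumes binary labels."""
    order = np.argsort(scores)
    ranks = np.empty(len(scores))
    ranks[order] = np.arange(1, len(scores) + 1)
    pos = labels == 1
    n_pos, n_neg = pos.sum(), (~pos).sum()
    return (ranks[pos].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for AUROC, resampling benchmark instances."""
    rng = np.random.default_rng(seed)
    n = len(labels)
    stats = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if labels[idx].min() == labels[idx].max():
            continue  # resample lacks both classes; AUROC undefined, skip
        stats.append(auroc(labels[idx], scores[idx]))
    return np.quantile(stats, [alpha / 2, 1 - alpha / 2])

# Illustrative synthetic data: scores shifted by label, so AUROC > 0.5
rng = np.random.default_rng(1)
labels = rng.integers(0, 2, 300)
scores = rng.normal(labels.astype(float), 1.0)
lo, hi = bootstrap_ci(labels, scores)
print(lo, hi)
```

Reporting intervals of this form alongside the random-classifier baseline (AUROC 0.5) would let readers judge whether the 0.62–0.84 range clears chance with margin on each benchmark.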
-
Referee: The central application of the diagnostic presupposes that the degree-normalized attention operator’s capacity ϕ and direction G dominate observed hallucination modes, but the manuscript provides no ablation or isolation argument addressing potential confounding from feed-forward sublayers, layer-norm scaling, or output decoding.
Authors: The referee correctly notes the absence of isolation arguments. Our theoretical results and the two-axis diagnostic are defined strictly on the degree-normalized attention operator extracted after the softmax; the empirical tests therefore operate on attention weights alone. Nevertheless, we acknowledge that feed-forward sublayers, layer-norm, and decoding can still influence final outputs. In revision we will add a dedicated limitations paragraph that (a) states the diagnostic isolates transport by construction because it uses only attention matrices, (b) cites prior literature on attention’s dominant role in routing failures, and (c) explicitly flags the lack of component-wise ablations as a limitation, recommending future controlled experiments on attention-only or frozen-FFN models. Full empirical isolation is not feasible within the current revision cycle, so this constitutes a partial revision focused on interpretive clarity rather than new experiments. revision: partial
Circularity Check
No significant circularity; the derivation is a self-contained mathematical proof with external validation.
Full rationale
The paper derives the orientation-blindness of transpose-invariant spectral diagnostics and the uniqueness of G as direction control parameter directly from properties of the degree-normalized attention operator, then obtains closed-form bipartite-Cheeger bounds for specific architectures and tests the resulting polarity prediction on external hallucination benchmarks. No quoted step reduces a claimed result to a fitted parameter, self-defined quantity, or self-citation chain by construction; the LC-AUROC values are reported as independent empirical checks rather than tautological outputs.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption The degree-normalized attention operator can be decomposed into symmetric and antisymmetric parts that separately control transport capacity and direction.
- domain assumption Canonical causal attention architectures admit a bipartite graph representation whose Cheeger constant yields an n-independent lower bound.
invented entities (1)
- asymmetry coefficient G
Reference graph
Works this paper leans on
-
[1]
Jakub Binkowski, Denis Janiak, Albert Sawczyn, Bogdan Gabrys, and Tomasz Jan Kajdanowicz. Hallucination detection in LLMs using spectral features of attention maps. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 24354--24385, Suzhou, China, 2025. Association for Computational Linguistics. doi:10.18...
-
[2]
A Lower Bound for the Smallest Eigenvalue of the Laplacian
Jeff Cheeger. A Lower Bound for the Smallest Eigenvalue of the Laplacian. In Robert C. Gunning, editor, Problems in Analysis: A Symposium in Honor of Salomon Bochner, pages 195--199. Princeton University Press, Princeton, NJ, 1970. Princeton Legacy Library reprint: 2015, ISBN 978-1-4008-6931-2
1970
-
[3]
INSIDE: LLMs' internal states retain the power of hallucination detection
Chao Chen, Kai Liu, Ze Chen, Yi Gu, Yue Wu, Mingyuan Tao, Zhihang Fu, and Jieping Ye. INSIDE: LLMs' internal states retain the power of hallucination detection. In The Twelfth International Conference on Learning Representations (ICLR 2024), 2024
2024
-
[4]
Lookback Lens: Detecting and mitigating contextual hallucinations in large language models using only attention maps
Yung-Sung Chuang, Linlu Qiu, Cheng-Yu Hsieh, Ranjay Krishna, Yoon Kim, and James Glass. Lookback lens: Detecting and mitigating contextual hallucinations in large language models using only attention maps. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing, pages 1419--1436, Miami, Florida, USA, 2024. Association for Computational Linguistics. doi:10.18653/v1/2024.emnlp-main.84
-
[5]
Fan R. K. Chung. Spectral Graph Theory, volume 92 of CBMS Regional Conference Series in Mathematics. American Mathematical Society, 1997. ISBN 978-0-8218-0315-8
1997
-
[6]
Fan R. K. Chung. Laplacians and the Cheeger Inequality for Directed Graphs. Annals of Combinatorics, 9(1): 1--19, April 2005. ISSN 0219-3094. doi:10.1007/s00026-005-0237-z
-
[7]
Attention is not all you need: Pure attention loses rank doubly exponentially with depth
Yihe Dong, Jean-Baptiste Cordonnier, and Andreas Loukas. Attention is not all you need: Pure attention loses rank doubly exponentially with depth. In Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 2793--2803. PMLR, 2021
2021
-
[8]
Davide Ettori, Nastaran Darabi, Sina Tayebati, Ranganath Krishnan, Mahesh Subedar, Omesh Tickoo, and Amit Ranjan Trivedi. EigenTrack : Spectral activation feature tracking for hallucination and out-of-distribution detection in LLMs and VLMs , 2025. URL https://arxiv.org/abs/2509.15735
-
[9]
Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn, and Yarin Gal. Detecting hallucinations in large language models using semantic entropy. Nature, 630(8017): 625--630, June 2024. ISSN 1476-4687. doi:10.1038/s41586-024-07421-0
-
[10]
James Allen Fill. Eigenvalue bounds on convergence to stationarity for nonreversible Markov chains, with an application to the exclusion process. The Annals of Applied Probability, 1(1): 62--87, 1991. doi:10.1214/aoap/1177005981
-
[11]
The emergence of clusters in self-attention dynamics
Borjan Geshkovski, Cyril Letrouit, Yury Polyanskiy, and Philippe Rigollet. The emergence of clusters in self-attention dynamics. In Advances in Neural Information Processing Systems, volume 36, pages 57026--57037, 2023
2023
-
[12]
Transformer Feed-Forward Layers Are Key-Value Memories
Mor Geva, Roei Schuster, Jonathan Berant, and Omer Levy. Transformer feed-forward layers are key-value memories. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5484--5495, 2021. doi:10.18653/v1/2021.emnlp-main.446
2021
-
[13]
Matrix Computations
Gene H. Golub and Charles F. van Loan. Matrix Computations. Johns Hopkins University Press, 4th edition, 2013. ISBN 978-1-4214-0794-4
2013
-
[14]
Concrete Mathematics: A Foundation for Computer Science
Ronald L. Graham, Donald E. Knuth, and Oren Patashnik. Concrete Mathematics: A Foundation for Computer Science. Addison-Wesley, 2nd edition, 1994. ISBN 978-0-201-55802-9
1994
-
[15]
Higher dimensional discrete Cheeger inequalities
Anna Gundert and May Szedlák. Higher dimensional discrete Cheeger inequalities. Journal of Computational Geometry, 6(2): 54--71, 2015. doi:10.20382/jocg.v6i2a4
-
[16]
Roger A. Horn and Charles R. Johnson. Matrix Analysis. Cambridge University Press, 2nd edition, 2012. ISBN 978-0-521-83940-2. doi:10.1017/CBO9781139020411
-
[17]
Tsz Chiu Kwok, Lap Chi Lau, Yin Tat Lee, Shayan Oveis Gharan, and Luca Trevisan. Improved Cheeger's inequality: Analysis of spectral partitioning algorithms through higher order spectral gap. In Proceedings of the Forty-Fifth Annual ACM Symposium on Theory of Computing (STOC), pages 11--20. ACM, 2013. doi:10.1145/2488608.2488611
-
[18]
Frustration index and Cheeger inequalities for discrete and continuous magnetic Laplacians
Carsten Lange, Shiping Liu, Norbert Peyerimhoff, and Olaf Post. Frustration index and Cheeger inequalities for discrete and continuous magnetic Laplacians. Calculus of Variations and Partial Differential Equations, 54(4): 4165--4196, 2015. doi:10.1007/s00526-015-0935-x
-
[19]
James R. Lee, Shayan Oveis Gharan, and Luca Trevisan. Multiway spectral partitioning and higher-order Cheeger inequalities. Journal of the ACM, 61(6): 1--30, 2014. doi:10.1145/2665063. Conference version in STOC 2012
-
[20]
Markov Chains and Mixing Times
David A. Levin, Yuval Peres, and Elizabeth L. Wilmer. Markov Chains and Mixing Times. American Mathematical Society, 2006
2006
-
[21]
HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models
Junyi Li, Xiaoxue Cheng, Wayne Xin Zhao, Jian-Yun Nie, and Ji-Rong Wen. HaluEval: A large-scale hallucination evaluation benchmark for large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 6449--6464, Singapore, 2023. Association for Computational Linguistics. doi:10.18653/v1/2023.emnlp-main.397
-
[22]
TruthfulQA: Measuring how models mimic human falsehoods
Stephanie Lin, Jacob Hilton, and Owain Evans. TruthfulQA: Measuring how models mimic human falsehoods. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 3214--3252, Dublin, Ireland, 2022. Association for Computational Linguistics. doi:10.18653/v1/2022.acl-long.229
-
[23]
Random walks on graphs: A survey
L. Lovász. Random walks on graphs: A survey. In D. Miklós, V. T. Sós, and T. Szőnyi, editors, Combinatorics, Paul Erdős is Eighty, volume 2, pages 353--398. János Bolyai Mathematical Society, Budapest, 1996
1996
-
[24]
Potsawee Manakul, Adian Liusie, and Mark J. F. Gales. SelfCheckGPT: Zero-resource black-box hallucination detection for generative large language models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 9004--9017, 2023. doi:10.18653/v1/2023.emnlp-main.557
-
[25]
Clustering by weighted cuts in directed graphs
Marina Meila and William Pentney. Clustering by weighted cuts in directed graphs. In Proceedings of the 2007 SIAM International Conference on Data Mining, pages 135--144, 2007. doi:10.1137/1.9781611972771.13
-
[26]
Locating and editing factual associations in GPT
Kevin Meng, David Bau, Alex Andonian, and Yonatan Belinkov. Locating and editing factual associations in GPT. In Advances in Neural Information Processing Systems, volume 35, 2022
2022
-
[27]
FActScore: Fine-grained atomic evaluation of factual precision in long form text generation
Sewon Min, Kalpesh Krishna, Xinxi Lyu, Mike Lewis, Wen-tau Yih, Pang Wei Koh, Mohit Iyyer, Luke Zettlemoyer, and Hannaneh Hajishirzi. FActScore: Fine-grained atomic evaluation of factual precision in long form text generation. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 12076--12100, 2023. do...
-
[28]
MedHallu: A comprehensive benchmark for detecting medical hallucinations in large language models
Shrey Pandit, Jiawei Xu, Junyuan Hong, Zhangyang Wang, Tianlong Chen, Kaidi Xu, and Ying Ding. MedHallu: A comprehensive benchmark for detecting medical hallucinations in large language models. In Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing, pages 2858--2873, Suzhou, China, 2025. Association for Computational Linguistics
-
[29]
Ori Parzanchevski, Ron Rosenthal, and Ran J. Tessler. Isoperimetric inequalities in simplicial complexes. Combinatorica, 36(2): 195--227, 2016. doi:10.1007/s00493-014-3002-x
-
[30]
Estimation of regression coefficients when some regressors are not always observed
James M. Robins, Andrea Rotnitzky, and Lue Ping Zhao. Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 89(427): 846--866, 1994. doi:10.1080/01621459.1994.10476818
-
[31]
Mind the gap: A spectral analysis of rank collapse and signal propagation in attention layers
Thiziri Nait Saada, Alireza Naderi, and Jared Tanner. Mind the gap: A spectral analysis of rank collapse and signal propagation in attention layers. In Proceedings of the 42nd International Conference on Machine Learning, Proceedings of Machine Learning Research. PMLR, 2025
2025
-
[32]
Approximate counting, uniform generation and rapidly mixing Markov chains
Alistair Sinclair and Mark Jerrum. Approximate counting, uniform generation and rapidly mixing Markov chains. Information and Computation, 82(1): 93--133, 1989. ISSN 0890-5401. doi:10.1016/0890-5401(89)90067-9
-
[33]
LLM-Check: Investigating detection of hallucinations in large language models
Gaurang Sriramanan, Siddhant Bharti, Vinu Sankar Sadasivan, Shoumik Saha, Priyatham Kattakinda, and Soheil Feizi. LLM-Check: Investigating detection of hallucinations in large language models. In Advances in Neural Information Processing Systems, volume 37, 2024
2024
-
[34]
RoFormer: Enhanced transformer with Rotary Position Embedding
Jianlin Su, Murtadha Ahmed, Yu Lu, Shengfeng Pan, Wen Bo, and Yunfeng Liu. RoFormer: Enhanced transformer with rotary position embedding. Neurocomputing, 568: 127063, 2024. doi:10.1016/j.neucom.2023.127063
-
[35]
Elena Voita, David Talbot, Fedor Moiseev, Rico Sennrich, and Ivan Titov. Analyzing multi-head self-attention: Specialized heads do the heavy lifting, the rest can be pruned. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 5797--5808, 2019. doi:10.18653/v1/P19-1580
-
[36]
A tutorial on spectral clustering
Ulrike von Luxburg. A tutorial on spectral clustering. Statistics and Computing, 17(4): 395--416, December 2007. ISSN 0960-3174. doi:10.1007/s11222-007-9033-z
-
[37]
Jerry Wei, Chengrun Yang, Xinying Song, Yifeng Lu, Nathan Hu, Jie Huang, Dustin Tran, Daiyi Peng, Ruibo Liu, Da Huang, Cosmo Du, and Quoc V. Le. Long-form factuality in large language models. In Advances in Neural Information Processing Systems, volume 37, 2024
2024
-
[38]
Stabilizing transformer training by preventing attention entropy collapse
Shuangfei Zhai, Tatiana Likhomanenko, Etai Littwin, Dan Busbridge, Jason Ramapuram, Yizhe Zhang, Jiatao Gu, and Josh Susskind. Stabilizing transformer training by preventing attention entropy collapse. In Proceedings of the 40th International Conference on Machine Learning, volume 202 of Proceedings of Machine Learning Research, pages 40770--40803. PMLR, 2023
2023