Pith Integrity automated scientific verification

What automated verification finds.

Every finding on this page was produced by a versioned detector, signed with the Pith Ed25519 key, and emitted to the paper's Open Graph Bundle. No human curates this feed. The resolver decides.

About the integrity layer · Protocol · Event schema

37,257 papers checked

7,727 findings

2,455 critical

5,272 advisory

2,992 affected papers

8 detectors

Filters

All Critical Advisory

All doi_compliance doi_title_agreement ai_meta_artifact external_links citation_quote_validity shingle_duplication claim_evidence cited_work_retraction

Showing latest 6 findings; use filters for narrower slices.

The Deterministic Horizon: Impossibility Results as Design Specifications for Trustworthy AI Systems arXiv:2605.23024

advisory citation_quote_validity unsupported attribution · ref #34

Citing paper attributes a specific factual claim to reference [34], which resolves to arXiv:2110.14168. The claim's distinctive tokens have only 10% overlap with any chunk of the cited paper's stored text (threshold for unsupported is 15%). The attribution could not be verified against the cited work.

Evidence text

K. Cobbe, V. Kosaraju, M. Bavarian, M. Chen, H. Jun, L. Kaiser, M. Plappert, J. Tworek, J. Hilton, R. Nakano, C. Hesse, and J. Schulman. “Training Verifiers to Solve Math Word Problems”. In: arXiv preprint arXiv.2110.14168 (2021)

arXiv:2110.14168

Finding detail arXiv Pith page integrity.json
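The overlap test described in the finding above can be illustrated with a minimal sketch. Only the 15% threshold and the "best overlap with any chunk" rule come from the finding text; the tokenizer, the stopword list, and the function names are assumptions made for illustration and are not the detector's actual implementation.

```python
import re

# From the finding text: overlap below 15% is reported as unsupported.
UNSUPPORTED_THRESHOLD = 0.15

# Hypothetical stopword list; how the detector defines "distinctive" is not stated.
STOPWORDS = {"the", "a", "an", "of", "and", "to", "in", "on", "for",
             "with", "is", "are", "that"}


def distinctive_tokens(text: str) -> set[str]:
    """Lowercase alphanumeric tokens, minus stopwords and very short tokens."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return {t for t in tokens if t not in STOPWORDS and len(t) > 2}


def max_chunk_overlap(claim: str, chunks: list[str]) -> float:
    """Best fraction of the claim's distinctive tokens found in any single chunk."""
    claim_tokens = distinctive_tokens(claim)
    if not claim_tokens:
        return 0.0
    best = 0.0
    for chunk in chunks:
        shared = claim_tokens & distinctive_tokens(chunk)
        best = max(best, len(shared) / len(claim_tokens))
    return best


def is_unsupported(claim: str, chunks: list[str]) -> bool:
    """True when no chunk reaches the 15% overlap threshold."""
    return max_chunk_overlap(claim, chunks) < UNSUPPORTED_THRESHOLD
```

Under these assumptions, a claim whose best chunk shares 10% of its distinctive tokens would be flagged, matching the shape of the finding above; a claim at or above 15% would not.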

Learning Through Noise: Why Subliminal Learning Works and When It Fails arXiv:2605.23645

advisory citation_quote_validity unsupported attribution · ref #23

Citing paper attributes a specific factual claim to reference [23], which resolves to arXiv:2209.10652. The claim's distinctive tokens have only 14% overlap with any chunk of the cited paper's stored text (threshold for unsupported is 15%). The attribution could not be verified against the cited work.

Evidence text

Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield-Dodds, Robert Lasenby, Dawn Drain, Carol Chen, et al. Toy models of superposition. arXiv preprint arXiv:2209.10652, 2022

arXiv:2209.10652

Finding detail arXiv Pith page integrity.json

Broken Memories: Detecting and Mitigating Memorization in Diffusion Models with Degraded Generations arXiv:2605.22050

advisory citation_quote_validity unsupported attribution · ref #26

Citing paper attributes a specific factual claim to reference [26], which resolves to arXiv:2103.00020. The claim's distinctive tokens have only 7% overlap with any chunk of the cited paper's stored text (threshold for unsupported is 15%). The attribution could not be verified against the cited work.

Evidence text

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. 2021. Learning Transferable Visual Models From Natural Language Supervision. arXiv:2103.00020 [cs.CV] https://arxiv.org/abs/2103.00020

arXiv:2103.00020

Finding detail arXiv Pith page integrity.json

AMEL: Accumulated Message Effects on LLM Judgments arXiv:2605.22714

advisory citation_quote_validity unsupported attribution · ref #17

Citing paper attributes a specific factual claim to reference [17], which resolves to arXiv:2307.03172. The claim's distinctive tokens have only 0% overlap with any chunk of the cited paper's stored text (threshold for unsupported is 15%). The attribution could not be verified against the cited work.

Evidence text

Nelson F Liu, Kevin Lin, John Hewitt, Ashwin Paranjape, Michele Bevilacqua, Fabio Petroni, and Percy Liang. Lost in the middle: How language models use long contexts. Transactions of the Association for Computational Linguistics (TACL), 12:157–173, 2024. URL https://arxiv.org/abs/2307.03172. Models attend more to beginning and end of context, degrading on middle content

arXiv:2307.03172

Finding detail arXiv Pith page integrity.json

TextReg: Mitigating Prompt Distributional Overfitting via Regularized Text-Space Optimization arXiv:2605.21318

advisory citation_quote_validity unsupported attribution · ref #9

Citing paper attributes a specific factual claim to reference [9], which resolves to arXiv:2406.07496. The claim's distinctive tokens have only 13% overlap with any chunk of the cited paper's stored text (threshold for unsupported is 15%). The attribution could not be verified against the cited work.

Evidence text

Mert Yuksekgonul, Federico Bianchi, Joseph Boen, Sheng Liu, Zhi Huang, Carlos Guestrin, and James Zou. Textgrad: Automatic "differentiation" via text. arXiv preprint arXiv:2406.07496, 2024

arXiv:2406.07496

Finding detail arXiv Pith page integrity.json

DEFLECT: Delay-Robust Execution via Flow-matching Likelihood-Estimated Counterfactual Tuning for VLA Policies arXiv:2605.19294

advisory citation_quote_validity unsupported attribution · ref #20

Citing paper attributes a specific factual claim to reference [20], which resolves to arXiv:2603.19199. The claim's distinctive tokens have only 0% overlap with any chunk of the cited paper's stored text (threshold for unsupported is 15%). The attribution could not be verified against the cited work.

Evidence text

Y. Lu, Z. Liu, X. Fan, Z. Yang, J. Hou, J. Li, K. Ding, and H. Zhao. Faster: Rethinking real-time flow vlas, 2026. URL https://arxiv.org/abs/2603.19199

arXiv:2603.19199

Finding detail arXiv Pith page integrity.json

How this works

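Pith runs 8 detectors on the corpus: DOI/arXiv compliance (doi_compliance), identifier-title agreement (doi_title_agreement), AI meta-comment regex (ai_meta_artifact), external link availability (external_links), citation-to-quotation validity (citation_quote_validity), 40-token shingle duplication (shingle_duplication), claim evidence (claim_evidence), and cited-work retraction (cited_work_retraction). Each finding is keyed by a stable evidence hash, signed, and emitted to the paper's Pith Number Open Graph Bundle as a pith.integrity.v1 event.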
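As an illustration of what 40-token shingle duplication means, the sketch below splits two texts into overlapping 40-token windows and reports the windows they share. Only the shingle size of 40 comes from the text above; whitespace tokenization, lowercasing, and the function names are assumptions, and the real detector may normalize or score text differently.

```python
def shingles(text: str, size: int = 40) -> set[tuple[str, ...]]:
    """All contiguous windows of `size` tokens (assumed whitespace tokenization)."""
    tokens = text.lower().split()
    # Texts shorter than `size` tokens produce no shingles.
    return {tuple(tokens[i:i + size]) for i in range(len(tokens) - size + 1)}


def shared_shingles(a: str, b: str, size: int = 40) -> set[tuple[str, ...]]:
    """Shingles that appear verbatim in both texts."""
    return shingles(a, size) & shingles(b, size)


if __name__ == "__main__":
    # Two synthetic texts sharing a 45-token run (w5 .. w49).
    original = " ".join(f"w{i}" for i in range(50))
    copied = "intro " + " ".join(f"w{i}" for i in range(5, 50)) + " outro"
    print(len(shared_shingles(original, copied)))
```

A shared run of n tokens yields n - 39 common shingles when n is at least 40, so even one shared shingle indicates at least 40 consecutive identical tokens.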

Public findings use one of three verdict classes: incontrovertible, cross_source, or threshold_with_margin. Read the integrity protocol for exact detector contracts and evidence schemas.
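To make the event description in this section concrete, here is a minimal sketch of building an unsigned finding event keyed by a stable evidence hash. The event type `pith.integrity.v1` and the three verdict classes come from the text; every field name, the use of SHA-256 over canonical JSON, and both function names are hypothetical. The integrity protocol defines the actual schema, and Ed25519 signing is left out here.

```python
import hashlib
import json

# The three public verdict classes named in the text.
VERDICT_CLASSES = {"incontrovertible", "cross_source", "threshold_with_margin"}


def evidence_hash(detector: str, detector_version: str,
                  paper_id: str, evidence: dict) -> str:
    """Assumed scheme: SHA-256 over canonical JSON, so key order does not matter."""
    canonical = json.dumps(
        {"detector": detector, "detector_version": detector_version,
         "paper_id": paper_id, "evidence": evidence},
        sort_keys=True, separators=(",", ":"), ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_event(detector: str, detector_version: str, paper_id: str,
                severity: str, verdict_class: str, evidence: dict) -> dict:
    """Assemble an unsigned event; all field names are illustrative only."""
    if verdict_class not in VERDICT_CLASSES:
        raise ValueError(f"unknown verdict class: {verdict_class}")
    return {
        "type": "pith.integrity.v1",
        "detector": detector,
        "detector_version": detector_version,
        "paper_id": paper_id,
        "severity": severity,
        "verdict_class": verdict_class,
        "evidence_hash": evidence_hash(detector, detector_version,
                                       paper_id, evidence),
        "evidence": evidence,
    }
```

Hashing a canonical serialization is one way to keep the key stable: re-running the same detector version on the same evidence would produce the same hash, which lets duplicate findings be recognized.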