arXiv: 2605.08071 · v1 · submitted 2026-05-08 · econ.EM · cs.HC · stat.ME

Vibe Econometrics and the Analysis Contract

Lydia Ashton (University of Wisconsin-Madison)

Pith reviewed 2026-05-11 01:50 UTC · model grok-4.3

classification econ.EM · cs.HC · stat.ME

keywords vibe econometrics · AI-assisted analysis · Analysis Contract · pre-analysis plans · causal inference · inferential failures · governance framework · method-data mismatch

The pith

AI assistance in econometrics changes how inferential failures occur and persuade, requiring the Analysis Contract to restore governance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper argues that AI-assisted vibe econometrics does not invent new ways for causal claims to fail but changes their incidence, observability, and persuasive force by collapsing the barrier between naming a method and executing it. This creates three problems: method-data mismatch at execution, confidence laundering through polished output, and invisible forking paths. A sympathetic reader would care because these shifts allow weak foundations to reach large audiences at scale and speed without the expertise needed to detect invalidity. The proposed solution is the Analysis Contract, which adapts pre-analysis plans by requiring a method-data agreement, a data audit, and pre-commitment to what counts as disconfirming evidence before any causal claim is advanced.

Core claim

The central claim is that AI assistance structurally alters the failure surface in vibe econometrics, the subset of AI-assisted causal analysis where identification can be named faster than it can be audited: outputs do not reliably signal invalidity, and recognizing any signal that does appear requires the expertise the workflow bypasses. The result is not new failure modes but their industrialization: weak analysis, packaged to look rigorous, spreads faster and more credibly than before. The Analysis Contract addresses this by imposing three pre-conditions on causal claims (a method-data contract, a data audit, and a pre-commitment statement defining disconfirming results), generalizing existing safeguards to the AI setting.

What carries the argument

The Analysis Contract, a pre-commitment framework that requires a method-data contract, data audit, and pre-commitment to disconfirming results before advancing a causal claim in AI-assisted analysis.
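
As a rough illustration of how the three conditions could gate a causal claim, here is a minimal Python sketch. The class name, field names, and the rule for when each condition counts as met are all assumptions made for this example; the paper does not specify an implementation.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisContract:
    """Hypothetical container for the three pre-conditions (illustrative only)."""

    # Method-data contract: the named method and the data features it relies on.
    method: str = ""
    required_data_features: list[str] = field(default_factory=list)
    # Data audit: features that a human has confirmed in the actual dataset.
    audited_features: list[str] = field(default_factory=list)
    # Pre-commitment statement: what result would count as disconfirming.
    disconfirming_result: str = ""

    def unmet_conditions(self) -> list[str]:
        """Return the names of conditions that still block a causal claim."""
        unmet = []
        if not self.method or not self.required_data_features:
            unmet.append("method-data contract")
        missing = set(self.required_data_features) - set(self.audited_features)
        if missing or not self.audited_features:
            unmet.append("data audit")
        if not self.disconfirming_result.strip():
            unmet.append("pre-commitment statement")
        return unmet

    def may_advance_causal_claim(self) -> bool:
        """True only when all three conditions are recorded as met."""
        return not self.unmet_conditions()


contract = AnalysisContract(
    method="difference-in-differences",
    required_data_features=["pre-period outcomes", "comparison group"],
)
print(contract.unmet_conditions())
# → ['data audit', 'pre-commitment statement']
```

The point of the sketch is only the ordering: nothing downstream runs until every condition is filled in. It says nothing about whether the recorded audit or pre-commitment is itself adequate, which still requires human judgment.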

If this is right

Requiring a method-data contract upfront would prevent analyses from being executed on data that cannot support the chosen identification strategy.
Mandating a data audit would expose assumption violations that formatted AI output otherwise conceals.
Pre-committing to disconfirming results would make forking paths visible and reduce the ability to claim support after seeing the data.
The contract would integrate with existing tools such as pre-analysis plans by adding AI-specific checks on observability of failures.
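
One way to picture a data audit of this kind is a check that runs before any estimation. The sketch below assumes a difference-in-differences design and a list of panel rows with hypothetical keys `treated` and `month`; it verifies only that both groups are observed before and after the program start. This is a necessary condition for the design, not a test of its identifying assumptions.

```python
def audit_did_panel(rows, program_start):
    """Check that each group has rows before and after program_start.

    rows: iterable of dicts with hypothetical keys 'treated' (bool)
          and 'month' (a sortable string such as 'YYYY-MM').
    Returns a list of problems; an empty list means this one check passed.
    """
    seen = {
        (True, "pre"): 0,
        (True, "post"): 0,
        (False, "pre"): 0,
        (False, "post"): 0,
    }
    for row in rows:
        period = "pre" if row["month"] < program_start else "post"
        seen[(bool(row["treated"]), period)] += 1

    problems = []
    for (treated, period), count in seen.items():
        if count == 0:
            group = "treated" if treated else "comparison"
            problems.append(f"no {period}-period rows for {group} group")
    return problems


rows = [
    {"treated": True, "month": "2024-03"},
    {"treated": True, "month": "2024-07"},
    {"treated": False, "month": "2024-07"},
]
print(audit_did_panel(rows, "2024-06"))
# → ['no pre-period rows for comparison group']
```

A regression run on these rows would still return a coefficient and a standard error, which is the concealment the list above describes: the formatted output alone would not reveal that the comparison group has no baseline.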

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The framework could be extended to non-econometric domains where AI generates outputs whose validity depends on unverifiable assumptions.
Voluntary adoption of the contract by journals or funders might serve as a test by tracking whether signed papers face fewer external critiques.
Without such pre-commitments, AI tools risk accelerating the use of flawed causal evidence in policy settings where decision-makers lack auditing capacity.

Load-bearing premise

The assumption that AI assistance alters the incidence, observability, and persuasive force of inferential failures enough to create a practically distinct governance problem that pre-analysis plans and the Causal Roadmap cannot address without a new named framework.

What would settle it

A study documenting no measurable increase in the rate or undetected incidence of invalid causal claims from AI-assisted versus traditional econometric workflows, or showing that the three elements of the Analysis Contract produce no improvement in auditability or reduction in post-hoc adjustments, would falsify the need for the new framework.

Figures

Figures reproduced from arXiv: 2605.08071 by Lydia Ashton (University of Wisconsin-Madison).

**Figure 2.** The Vibe Econometrics Workflow. The pattern is visible in claims-data analyses, where the failure modes are documented (Mbotwa et al., 2017; Dahlen & Charu, 2023). For example, a health plan analytics team receives a vendor-provided dataset of medical claims aggregated to member-month totals. They use an AI assistant to estimate a care management program’s effect on total cost of care using difference-in-d…

Original abstract

"Vibe coding" and "vibe analytics" have been framed as a democratization of technical capability. This paper argues that AI-assisted methodology more broadly, or what I call "vibe methodology," also democratizes the failure modes specific to each domain. When AI assists with methods whose validity depends on assumptions that cannot be verified from the output alone (a class I call "vibe inference"), the failure surface is structurally different: the output does not reliably signal invalidity, and when it does, recognizing the signal requires the expertise the workflow bypasses. I focus on "vibe econometrics," the subset of AI-assisted causal analysis where identification can be named faster than it can be audited. The claim of this paper is not that AI invents inferential failures that did not previously exist, but that it changes their incidence, observability, and persuasive force enough to create a practically distinct governance problem. This results in three failure modes: method-data mismatch, where AI bypasses expertise at execution; confidence laundering, where AI amplifies the credibility of formatted output; and invisible forking, which spans both. What is new is not the failure modes but AI's industrialization of their packaging. The barrier between naming a method and executing it has collapsed, and weak foundations, dressed as rigorous analysis, now reach audiences at a scale, speed, and polish that previously required expertise. I propose the Analysis Contract, a pre-commitment framework that adapts the logic of pre-analysis plans and the Causal Roadmap to the AI-assisted setting. The contract imposes three conditions before a causal claim is made: a method-data contract, a data audit, and a pre-commitment statement defining what would count as a disconfirming result. The framework generalizes across domains of vibe inference through domain-specific instantiation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper renames existing pre-analysis plan ideas for AI-assisted econometrics but does not demonstrate why a new framework is required over incremental updates.

This paper claims that AI does not invent new econometric failures but changes their incidence, observability, and persuasive force enough to create a distinct governance problem in causal work. It introduces the term vibe econometrics for cases where identification can be named faster than it can be audited and proposes the Analysis Contract as the fix. The contract requires a method-data agreement, a data audit, and pre-commitment to what would count as disconfirming evidence before making a causal claim. This setup adapts the logic of pre-analysis plans and the Causal Roadmap to AI workflows.

The three named failure modes (method-data mismatch, confidence laundering, and invisible forking) are described clearly as ways AI packages weak foundations at scale and speed. The paper does a straightforward job of laying out how formatted output can bypass the expertise that used to catch problems. What is actually new is the specific packaging of these points around AI's industrialization of the failures rather than any first-principles invention.

The soft spot is that the central claim of practical distinctness rests on assertion. The abstract and description supply no worked example of an AI workflow where an updated pre-analysis plan specifying AI steps, audits, and disconfirmation criteria would fail to address the modes. Without that, or any simulation or data on changed incidence, it is difficult to judge whether the new name adds more than re-labeling. The argument stays at the level of logical description throughout.

This piece is for applied researchers and methodologists who use or supervise AI tools in causal analysis and want a concrete checklist to discuss. A reader focused on research integrity practices would get value from the framing and the explicit conditions. It deserves a serious referee because the topic is timely, the proposal is specific, and the gaps are fixable with added examples rather than fatal to the core idea. I would send it out for review with the expectation that revisions address whether the framework is truly necessary or just a useful reminder.

Referee Report

2 major / 2 minor

Summary. The paper claims that AI-assisted 'vibe methodology' (and specifically 'vibe econometrics') does not invent new inferential failures but changes their incidence, observability, and persuasive force by collapsing the barrier between naming and executing methods whose validity rests on unverifiable assumptions. This creates three failure modes—method-data mismatch, confidence laundering, and invisible forking—whose industrialization requires a new governance framework. The proposed Analysis Contract adapts pre-analysis plans and the Causal Roadmap via three conditions: a method-data contract, a data audit, and pre-commitment to disconfirming results, generalizing across domains of 'vibe inference.'

Significance. If the central claim holds, the paper identifies a practically relevant shift in the governance of empirical causal claims under AI assistance, potentially informing journal policies and researcher workflows in econometrics. It correctly acknowledges that the underlying failures predate AI and focuses on their changed packaging and scale. The conceptual mapping to existing tools (pre-analysis plans, Causal Roadmap) is a strength, though the manuscript supplies no empirical data, simulations, or worked examples to quantify altered incidence or observability.

major comments (2)

[Abstract] Abstract: The claim that AI assistance creates a 'practically distinct governance problem' that cannot be addressed by suitably updated pre-analysis plans is asserted rather than demonstrated; no concrete AI workflow is supplied in which an extended PAP (specifying AI steps, required audits, and disconfirmation criteria) would fail to cover the three failure modes.
[Abstract] Abstract (description of Analysis Contract): The three conditions are explicitly described as adaptations of pre-analysis plans and the Causal Roadmap, yet the manuscript provides no derivation or counter-example showing which AI-specific elements (e.g., prompt auditing or output verification) cannot be incorporated into those existing structures without a new named framework.

minor comments (2)

[Abstract] The term 'vibe inference' is introduced without a formal definition or boundary conditions that would allow readers to classify specific econometric procedures as inside or outside the category.
The manuscript would benefit from a short table or bullet list contrasting the Analysis Contract conditions with standard pre-analysis plan elements to clarify incremental versus novel requirements.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful and constructive comments. The feedback accurately notes that the manuscript's central claims are presented conceptually and would be strengthened by explicit demonstration and derivation. We respond to each major comment below and will revise the manuscript accordingly to incorporate the requested elements.

Referee: [Abstract] Abstract: The claim that AI assistance creates a 'practically distinct governance problem' that cannot be addressed by suitably updated pre-analysis plans is asserted rather than demonstrated; no concrete AI workflow is supplied in which an extended PAP (specifying AI steps, required audits, and disconfirmation criteria) would fail to cover the three failure modes.

Authors: We agree that the manuscript would benefit from a concrete illustration. The structural distinction arises because AI assistance collapses the barrier between naming a method and executing it, allowing unverifiable assumptions to be embedded in generated code or specifications in ways that remain opaque even when an extended PAP details the AI steps, audits, and disconfirmation criteria. In the revised version, we will add a specific AI workflow example (e.g., LLM-assisted specification of an instrumental-variables design on observational data) showing how method-data mismatch and invisible forking can persist despite such extensions. This will demonstrate why the change in incidence and observability creates a practically distinct governance problem.

Revision: yes

Referee: [Abstract] Abstract (description of Analysis Contract): The three conditions are explicitly described as adaptations of pre-analysis plans and the Causal Roadmap, yet the manuscript provides no derivation or counter-example showing which AI-specific elements (e.g., prompt auditing or output verification) cannot be incorporated into those existing structures without a new named framework.

Authors: We accept that the manuscript would be improved by an explicit derivation. While the Analysis Contract adapts existing tools, the AI setting introduces elements such as non-deterministic prompt outputs and the bypassing of domain expertise that standard PAP specifications and the Causal Roadmap do not routinely address. In revision, we will include a derivation section contrasting the three conditions with those frameworks and provide a counter-example in which prompt auditing and output verification are incorporated into an extended PAP yet still leave gaps in addressing confidence laundering and invisible forking. This will clarify the rationale for the dedicated named framework as a practical governance instrument.

Revision: yes

Circularity Check

0 steps flagged

No circularity: conceptual proposal adapts existing frameworks without self-referential reduction

The manuscript contains no equations, derivations, fitted parameters, or mathematical claims. Its central argument defines 'vibe inference' and 'vibe econometrics' descriptively, then proposes the Analysis Contract as an adaptation of pre-analysis plans and the Causal Roadmap. No step claims to derive a new result from first principles that collapses back to the input definitions or to a self-citation chain. The three conditions of the contract are presented as explicit extensions of prior structures rather than as outputs forced by the paper's own premises. Because the work is self-contained against external benchmarks (existing methodological literature) and makes no load-bearing self-citations or uniqueness assertions, the derivation chain exhibits no circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The paper rests on domain assumptions about how AI changes method execution and credibility perception without empirical backing or formal models.

axioms (2)

Domain assumption: AI assistance bypasses expertise at the point of method execution in ways that alter failure observability.
Invoked when describing method-data mismatch and the collapse of the barrier between naming and executing a method.

Domain assumption: Formatted AI output increases the persuasive force of claims beyond their actual validity.
Basis for the confidence laundering failure mode.
