arxiv: 2604.04081 · v1 · submitted 2026-04-05 · ⚛️ physics.comp-ph · physics.ed-ph · physics.soc-ph · quant-ph

Co-Authoring with AI: How I Wrote a Physics Paper About AI, Using AI

Yi Zhou

Pith reviewed 2026-05-13 16:58 UTC · model grok-4.3

classification ⚛️ physics.comp-ph · physics.ed-ph · physics.soc-ph · quant-ph

keywords AI co-authoring · human-in-the-loop · scientific writing · authorship responsibility · LLM in science · physics manuscript · supplementary transcripts · scientific integrity

The pith

The human author must enforce rigorous physical logic and academic standards when co-authoring with AI, and the community should require full AI interaction transcripts as supplementary material.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper uses the author's experience drafting a computational physics manuscript with an AI as a case study to show that large language models can organize structure and generate syntax but fall short on enforcing physical reasoning and handling peer-review concerns. A sympathetic reader would care because this redefines the human contribution from generating text to mentoring the AI as a virtual collaborator, shifting authorship norms in science. The central argument is that this human-in-the-loop oversight is indispensable for maintaining scientific integrity. To preserve accountability, the paper calls for mandating publication of complete, unedited AI conversation transcripts alongside the paper.

Core claim

Using the drafting process of a recent computational physics manuscript as a case study, this essay explores the indispensable role of the Human-in-the-Loop (HITL). We demonstrate that while AI excels at structural organization and syntax generation, the human author bears the ultimate responsibility for enforcing rigorous physical logic, maintaining academic diplomacy, and anticipating peer-review critiques. In this paradigm, the human contribution shifts from writing boilerplate text to acting as a Principal Investigator who actively mentors and steers the AI's reasoning.

What carries the argument

The Human-in-the-Loop (HITL) workflow, in which the human acts as Principal Investigator who mentors and steers the AI's reasoning during co-authoring.

If this is right

AI handles structural organization and syntax but cannot replace human enforcement of physical logic.
Human responsibility includes maintaining academic diplomacy and anticipating peer-review critiques.
The human role evolves from generating text to mentoring the AI as a virtual collaborator.
Full unedited AI interaction transcripts must be published as standard supplementary material to ensure accountability.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

This workflow could extend to other scientific fields where reasoning depth matters, such as biology or materials science.
Requiring transcripts might encourage clearer human prompts and reduce subtle AI-induced inconsistencies over time.
Long transcripts could create new challenges for journal storage and reader attention, prompting summaries or selective excerpts as a practical adaptation.

Load-bearing premise

Current AI limitations in scientific reasoning are persistent rather than transient and will keep requiring human intervention, and publishing full unedited AI transcripts is practical without major privacy or logistical barriers.

What would settle it

A peer-reviewed physics paper produced entirely by AI with no human steering that passes standard review, or documented cases where publishing full AI transcripts creates insurmountable privacy, IP, or length problems.

Figures

Figures reproduced from arXiv: 2604.04081 by Yi Zhou.

**Figure 1.** The Virtual Research Group. To successfully write scalable quantum physics code and draft an academic manuscript, LLMs cannot be treated as magical oracles. They must be managed as a cohort of virtual students (a junior theorist for extraction, a senior postdoc for rigorous LaTeX specification, and a coder for implementation), all actively mentored and corrected by a Human Principal Investigator.

Original abstract

The rapid integration of Large Language Models (LLMs) into scientific writing fundamentally challenges traditional definitions of authorship, responsibility, and scientific integrity. As researchers transition from using computers as deterministic tools to managing them as “virtual collaborators,” the nature of human contribution must be re-evaluated. Using the drafting process of a recent computational physics manuscript as a case study, this essay explores the indispensable role of the Human-in-the-Loop (HITL). We demonstrate that while AI excels at structural organization and syntax generation, the human author bears the ultimate responsibility for enforcing rigorous physical logic, maintaining academic diplomacy, and anticipating peer-review critiques. In this paradigm, the human contribution shifts from writing boilerplate text to acting as a Principal Investigator who actively mentors and steers the AI's reasoning. To ensure accountability and preserve the integrity of the scientific record in this new era, I argue that the community must mandate the publication of full, unedited AI interaction transcripts as standard supplementary material.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript uses the author's experience co-authoring a computational physics paper with an LLM as a case study to argue that human oversight remains essential. It claims AI excels at structural organization and syntax but that the human author must enforce rigorous physical logic, maintain academic diplomacy, and anticipate peer-review critiques. The central recommendation is that the community mandate publication of full, unedited AI interaction transcripts as standard supplementary material to ensure accountability.

Significance. If the policy recommendation holds, it would increase transparency in AI-assisted scientific writing and allow better evaluation of human versus AI contributions to the record. This could help maintain integrity as LLMs become routine tools. The significance is limited by the single-case basis, which provides no comparative data or tests of feasibility across different research domains or AI systems.

major comments (1)

Abstract: The claim that the community 'must mandate' full unedited AI transcripts as supplementary material rests on a single personal case study with no systematic evidence, comparisons to other workflows, or analysis of implementation barriers such as privacy, IP, or storage logistics. This single-example foundation is load-bearing for the policy recommendation.

Simulated Author's Rebuttal

1 response · 1 unresolved

We thank the referee for their detailed and constructive review. The feedback correctly identifies the single-case foundation of our policy recommendation as a key limitation. We have revised the manuscript to acknowledge this explicitly, soften the prescriptive language, and add discussion of practical barriers while retaining the value of the case study as an illustrative example.

Point-by-point responses

Referee: Abstract: The claim that the community 'must mandate' full unedited AI transcripts as supplementary material rests on a single personal case study with no systematic evidence, comparisons to other workflows, or analysis of implementation barriers such as privacy, IP, or storage logistics. This single-example foundation is load-bearing for the policy recommendation.

Authors: We agree that the recommendation rests on a single case study and that this limits its generalizability. The manuscript is framed as an essay based on one author's direct experience co-authoring a computational physics paper. It is intended to highlight concrete challenges (enforcing physical logic, maintaining academic tone, anticipating critiques), not to serve as a systematic empirical analysis. In revision we have changed the abstract wording from 'must mandate' to 'advocate that the community consider mandating' and inserted a new section that explicitly discusses implementation barriers, including privacy risks, intellectual-property concerns, storage and versioning logistics, and the need for future multi-domain testing. We maintain that publishing interaction transcripts remains a low-cost, high-value transparency measure even if broader validation is still required; the case study demonstrates why such transcripts would allow reviewers to distinguish human from AI contributions. We cannot, within the scope of this essay, supply comparative data across workflows or domains.

revision: partial

standing simulated objections not resolved

Providing systematic comparative evidence or multi-domain tests of the proposed policy, as this would require a separate empirical study beyond the present single-experience essay.

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper is a normative opinion essay presenting a case study of AI-assisted writing and a policy recommendation for mandatory transcript publication. It contains no mathematical derivations, fitted parameters, predictions, uniqueness theorems, or ansatzes. The central claims rest on interpretive judgment from personal experience rather than any self-referential reduction of a result to its own inputs or to a self-citation chain. No load-bearing step reduces by construction to prior outputs of the same paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that AI tools have inherent limitations in enforcing scientific logic that necessitate human oversight, presented without additional evidence.

axioms (1)

domain assumption: AI excels at structural organization and syntax generation but cannot independently enforce rigorous physical logic.
Directly stated in the abstract as the foundation for redefining human contribution.

pith-pipeline@v0.9.0 · 5468 in / 1039 out tokens · 41195 ms · 2026-05-13T16:58:38.818046+00:00

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Intrinsic Floquet Generation and $1/I$ Quantum Oscillations in a Sliding Charge-Density Wave
cond-mat.mes-hall · 2026-05 · unverdicted · novelty 7.0

A uniformly sliding CDW is exactly solvable as a Floquet system whose sideband ladder produces 1/I oscillations in fixed-bias tunneling, explained by current percolation through localized coherent filaments.

From Paper to Program: Accelerating Quantum Many-Body Algorithm Development via a Multi-Stage LLM-Assisted Workflow
physics.comp-ph · 2026-04 · accept · novelty 7.0

A human-in-the-loop multi-stage LLM workflow with an intermediate technical specification externalizing index conventions and contraction orders enables reliable DMRG code generation that reproduces known entanglement...
