ATM: CID-Brokered Pre-Write Admission for Multi-Agent Code Co-Synthesis

Eagl Huang

arxiv: 2607.00041 · v1 · pith:HDOXYQ34new · submitted 2026-06-29 · 💻 cs.SE · cs.AI

Pith reviewed 2026-07-02 20:34 UTC · model grok-4.3

classification 💻 cs.SE cs.AI

keywords multi-agent systems · code co-synthesis · write admission · CID broker · governance chain · semantic atoms · virtual atoms · shared mutation

The pith

ATM binds task intent, scope, admission, validation and evidence into one chain using a CID broker for shared-mutation decisions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Multi-agent LLM systems decompose software work into planning, generation and repair but still need a mechanism to decide which concurrent write intents may run in parallel, which must serialize, and which must fail closed. ATM creates a single governance chain that links task intent to repository scope, write admission, validation and evidence obligations. A CID broker performs the admission step by mapping intents to semantic atoms through adapter-guided atomization; virtual atoms supply temporary units when persistent coverage is incomplete. Writes are executed by a neutral steward rather than the proposing agents. Controlled scenarios, archived cases, an admission benchmark, a three-week adopter study and a recovery benchmark supply evidence of feasibility, auditability and bounded recoverability inside observed single-domain settings.

Core claim

The AI-Atomic-Framework (ATM) binds task intent, repository scope, write admission, validation, and evidence obligations into one governance chain. A Content Identifier (CID) broker serves as the shared-mutation admission subsystem. Adapter-guided atomization maps write intents to semantic atoms and bounded regions; when persistent atom-map coverage is incomplete, virtual atoms provide temporary auditable governance units for conservative comparison and routing. Governed shared writes are ultimately applied by a neutral steward rather than directly by proposing agents.
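
The referee report notes that the manuscript gives no algorithmic specification of the admission rule, so the following is only a minimal Python sketch of the three-way routing named in the claim. It assumes that atoms are plain string identifiers, that an intent whose atomization failed carries `None`, and that any shared atom forces serialization. `Route`, `WriteIntent`, and `route` are hypothetical names for illustration, not ATM's API.

```python
from dataclasses import dataclass
from enum import Enum
from typing import FrozenSet, List, Optional


class Route(Enum):
    PARALLEL = "parallel"        # no shared atoms with admitted intents
    SERIALIZE = "serialize"      # shares an atom, so it must be ordered
    FAIL_CLOSED = "fail_closed"  # scope unknown, so it is refused


@dataclass(frozen=True)
class WriteIntent:
    agent: str
    # Hypothetical atom identifiers such as "a.py::parse".
    # None (or an empty set) means atomization produced no usable scope.
    atoms: Optional[FrozenSet[str]]


def route(intent: WriteIntent, admitted: List[WriteIntent]) -> Route:
    """Pick a route for `intent` before any mutation is applied."""
    if not intent.atoms:
        # Assumption: an intent with unknown scope is never guessed at.
        return Route.FAIL_CLOSED
    for other in admitted:
        if intent.atoms & (other.atoms or frozenset()):
            return Route.SERIALIZE
    return Route.PARALLEL
```

Under these assumptions, two intents touching `a.py::parse` and `a.py::render` would run in parallel, a third touching `a.py::parse` again would be serialized, and an intent with no atoms would fail closed.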

What carries the argument

The CID broker as shared-mutation admission subsystem that routes intents via adapter-guided atomization and virtual atoms.

If this is right

Write intents receive deterministic routing to parallel, serialized or fail-closed paths before any mutation occurs.
A neutral steward, not the proposing agents, applies all governed writes.
Virtual atoms maintain auditability when persistent atom-map coverage is incomplete.
The same governance chain supports both feasibility checks and bounded recovery in single-domain operation.
Evidence from a 12-scenario design matrix, ATM-AdmissionBench and external-adopter study is consistent with these properties.
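
The steward property in the list above can be illustrated with a small single-writer sketch. It assumes that each admitted write carries a digest of the content the agent read, and that the steward rejects stale writes and records every decision as evidence. The stale-base check, the SHA-256 digest, and the names `AdmittedWrite` and `Steward` are assumptions for illustration; the paper summary does not describe the steward's interface.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, List


def digest(text: str) -> str:
    """Hex SHA-256 of the text, used here as a stand-in content digest."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class AdmittedWrite:
    agent: str
    path: str
    base_digest: str  # digest of the content the agent based its write on
    new_text: str


class Steward:
    """Sole writer to the shared files; agents only submit admitted writes."""

    def __init__(self, files: Dict[str, str]) -> None:
        self.files = dict(files)
        self.log: List[dict] = []  # one evidence record per decision

    def apply(self, write: AdmittedWrite) -> bool:
        current = self.files.get(write.path, "")
        before = digest(current)
        ok = before == write.base_digest  # refuse writes on a stale base
        if ok:
            self.files[write.path] = write.new_text
        self.log.append({
            "agent": write.agent,
            "path": write.path,
            "applied": ok,
            "before": before,
            "after": digest(self.files.get(write.path, "")),
        })
        return ok
```

Because only `Steward.apply` mutates `files`, every change and every refusal leaves a log record, which is one simple way to read the auditability claim.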

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The single-governance-domain restriction may limit direct use in federated or cross-repository agent teams.
Virtual atoms could be tested as a general pattern for any collaborative editing system that lacks complete static maps.
Integration of the CID broker with existing version-control hooks would allow empirical measurement of conflict reduction in live multi-agent workflows.

Load-bearing premise

Adapter-guided atomization plus virtual atoms can supply conservative, auditable routing when persistent atom-map coverage is incomplete without introducing new failure modes or requiring cross-domain coordination.
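
To make the premise concrete, here is a hedged Python sketch of one way a virtual atom could be built when the persistent atom map does not cover a write range. It assumes atoms are inclusive line ranges in a single file, that uncovered lines are merged into one conservative span, and that a content identifier is a SHA-256 hash over path, range, and text. None of these choices are taken from the paper, and `Atom`, `content_id`, and `atomize` are hypothetical names.

```python
from __future__ import annotations

import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Atom:
    path: str
    start: int     # first line, 1-indexed, inclusive
    end: int       # last line, inclusive
    virtual: bool  # True when no persistent atom covered these lines
    cid: str       # assumed content identifier (hex SHA-256)


def content_id(path: str, start: int, end: int, text: str) -> str:
    payload = f"{path}:{start}-{end}\n{text}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def atomize(path: str, lines: list[str], atom_map: list[tuple[int, int]],
            start: int, end: int) -> list[Atom]:
    """Map a write to lines [start, end] onto persistent and virtual atoms."""

    def make(s: int, e: int, virtual: bool) -> Atom:
        text = "\n".join(lines[s - 1:e])
        return Atom(path, s, e, virtual, content_id(path, s, e, text))

    # Persistent atoms that overlap the write range.
    atoms = [make(s, e, False) for (s, e) in atom_map if s <= end and start <= e]
    covered = {n for a in atoms for n in range(a.start, a.end + 1)}
    uncovered = [n for n in range(start, end + 1) if n not in covered]
    if uncovered:
        # Conservative choice: one span from the first to the last uncovered
        # line, even if it also spans lines that persistent atoms cover.
        atoms.append(make(min(uncovered), max(uncovered), True))
    return atoms
```

The over-wide virtual span trades parallelism for safety: it can only cause extra serialization, never a missed overlap, which is the direction the premise needs.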

What would settle it

An observed case in which an incomplete atom map produces an unsafe parallel admission or an unrecoverable state inside a single governance domain.
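
A check for the first kind of counterexample can be sketched as a scan over one round of admission records, looking for two intents that were both routed in parallel although their regions overlap. The record layout (`id`, `route`, `regions` as `(path, start, end)` tuples) is an assumption for illustration and not a format the paper defines.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Region = Tuple[str, int, int]  # (path, first line, last line), inclusive


def overlaps(a: Region, b: Region) -> bool:
    """True when both regions are in the same file and share a line."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def unsafe_parallel_pairs(round_log: List[Dict]) -> List[Tuple[str, str]]:
    """Return id pairs that were both admitted in parallel yet overlap."""
    parallel = [r for r in round_log if r["route"] == "parallel"]
    return [
        (x["id"], y["id"])
        for x, y in combinations(parallel, 2)
        if any(overlaps(p, q) for p in x["regions"] for q in y["regions"])
    ]
```

An empty result for every archived round would be consistent with the claim; a single non-empty result inside one governance domain would be the falsifying case described above.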

Original abstract

Multi-agent LLM systems can decompose software-engineering work into planning, generation, validation, and repair, but a narrower systems problem remains: before any governed shared mutation is applied, a system must decide which concurrently formed write intents may proceed in parallel, which require deterministic composition or serialization, and which must take a fail-closed path. We address this problem with the AI-Atomic-Framework (ATM), a specification-grounded governance substrate for software agents operating within a single governance domain. ATM binds task intent, repository scope, write admission, validation, and evidence obligations into one governance chain. A Content Identifier (CID) broker serves as the shared-mutation admission subsystem. Adapter-guided atomization maps write intents to semantic atoms and bounded regions; when persistent atom-map coverage is incomplete, virtual atoms provide temporary auditable governance units for conservative comparison and routing. Governed shared writes are ultimately applied by a neutral steward rather than directly by proposing agents. Evaluation combines controlled, field, adoption, and extension evidence, including a 12-scenario deterministic design matrix, three archived runner cases, ATM-AdmissionBench, three archived same-file boundary cases, a three-week external-adopter study, and an operational recovery-routing benchmark. The results support feasibility, auditability, and bounded recoverability within the observed single-domain settings, but do not claim broad comparative superiority or cross-clone governance.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

ATM gives a named governance chain for multi-agent code writes via a CID broker and semantic atoms, but the virtual-atom routing logic stays underspecified and the evaluation lacks numbers.

The core contribution is a concrete admission subsystem for concurrent write intents in multi-agent LLM code work. ATM ties task intent, repo scope, admission, validation, and evidence into one chain, with a CID broker handling shared-mutation decisions and adapter-guided atomization mapping intents to bounded regions. When coverage is incomplete, virtual atoms act as temporary units for comparison. This is new as a packaged framework for single-domain settings; the components draw from distributed-systems patterns but the combination and the virtual-atom fallback are presented as fresh.

The paper does a reasonable job laying out the problem and the governance obligations. It reports a 12-scenario design matrix, archived runner cases, ATM-AdmissionBench, boundary cases, a three-week external study, and a recovery benchmark. Those artifacts show the authors tried to test feasibility, auditability, and recoverability rather than just claiming them.

The soft spot is exactly where the stress test pointed: the abstract gives no algorithmic detail on how virtual atoms are built, compared, or used to decide admission. Without that, it is hard to judge whether the routing stays conservative or introduces false negatives on conflicts. The evaluation evidence is described but not quantified here (no error rates, throughput numbers, or failure counts), so the claim of bounded recoverability rests on unshown data. The single-domain limit is stated clearly, which helps, but does not remove the internal mechanism gap.

This is for systems people already building multi-agent code generators who need a reusable admission pattern. It is not yet for readers looking for proven performance gains or cross-domain results. The work shows clear thinking on the governance chain and honest scoping, so it deserves a serious referee even if the virtual-atom part needs more specification and the results need numbers.

Referee Report

2 major / 1 minor

Summary. The paper introduces the AI-Atomic-Framework (ATM) as a specification-grounded governance substrate for multi-agent LLM systems performing software-engineering tasks. It binds task intent, repository scope, write admission, validation, and evidence obligations into a single chain, with a CID broker acting as the shared-mutation admission subsystem. Adapter-guided atomization maps intents to semantic atoms and bounded regions; virtual atoms handle incomplete persistent atom-map coverage via temporary auditable units for conservative comparison and routing. Governed writes are applied by a neutral steward. Evaluation draws on a 12-scenario deterministic design matrix, three archived runner cases, ATM-AdmissionBench, three archived same-file boundary cases, a three-week external-adopter study, and an operational recovery-routing benchmark; the results are presented as supporting feasibility, auditability, and bounded recoverability within observed single-domain settings, without claims of broad superiority or cross-clone applicability.

Significance. If the central claims hold, the work supplies a concrete governance mechanism for concurrent write management in multi-agent code synthesis, addressing a practical systems gap between decomposition and safe shared mutation. The multi-method evaluation approach (controlled matrix, archived cases, external study, and benchmark) is a positive feature that strengthens the feasibility argument within the stated single-domain scope.

major comments (2)

[Abstract] Abstract (paragraph on virtual atoms): the claim that virtual atoms deliver 'conservative comparison and routing' when persistent atom-map coverage is incomplete is load-bearing for the CID broker's admission subsystem and the overall governance chain, yet the manuscript provides no algorithmic specification of atom construction, comparison logic, or decision rules. Without this, it is impossible to verify absence of over-admission (false negatives on conflicts) or under-admission that would break the stated bounded recoverability.
[Abstract] Abstract (evaluation paragraph): the central claim that the listed evidence 'supports feasibility, auditability, and bounded recoverability' rests on unshown quantitative results, error bars, or derivation details from the 12-scenario design matrix, ATM-AdmissionBench, or recovery-routing benchmark. This absence prevents assessment of whether the observed single-domain results actually substantiate the recoverability guarantee.
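
The two error directions named in these comments can be stated as simple rates over labeled benchmark cases. The sketch below assumes each case records whether the pair of intents truly conflicts and whether the broker admitted it in parallel; the function name and the case layout are hypothetical, and no such numbers are reported in the material reviewed here.

```python
from typing import Dict, Iterable, Optional, Tuple

# (truly_conflicting, admitted_parallel) for one benchmark case
Case = Tuple[bool, bool]


def admission_error_rates(cases: Iterable[Case]) -> Dict[str, Optional[float]]:
    """Over-admission: conflicting cases that were admitted in parallel.
    Under-admission: independent cases that were not admitted in parallel."""
    cases = list(cases)
    conflicts = [c for c in cases if c[0]]
    independents = [c for c in cases if not c[0]]
    over = sum(1 for c in conflicts if c[1])
    under = sum(1 for c in independents if not c[1])
    return {
        "over_admission_rate": over / len(conflicts) if conflicts else None,
        "under_admission_rate": under / len(independents) if independents else None,
    }
```

Reporting both rates per benchmark, together with the case counts behind them, would be one way to answer the request for quantitative evidence.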

minor comments (1)

[Abstract] The abstract introduces several new entities (CID broker, semantic atoms, virtual atoms) without a brief forward reference to their definitions or invariants in the main text.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive review and the major revision recommendation. We address each comment below. Both points correctly identify missing details in the current manuscript, and we will revise to supply them.

Referee: [Abstract] Abstract (paragraph on virtual atoms): the claim that virtual atoms deliver 'conservative comparison and routing' when persistent atom-map coverage is incomplete is load-bearing for the CID broker's admission subsystem and the overall governance chain, yet the manuscript provides no algorithmic specification of atom construction, comparison logic, or decision rules. Without this, it is impossible to verify absence of over-admission (false negatives on conflicts) or under-admission that would break the stated bounded recoverability.

Authors: The referee correctly notes the absence of algorithmic specification for virtual atom construction, comparison logic, and decision rules. The current manuscript does not contain these details. In revision we will add a dedicated subsection with pseudocode, formal rules, and decision procedures to enable verification of conservative properties and bounded recoverability.
revision: yes

Referee: [Abstract] Abstract (evaluation paragraph): the central claim that the listed evidence 'supports feasibility, auditability, and bounded recoverability' rests on unshown quantitative results, error bars, or derivation details from the 12-scenario design matrix, ATM-AdmissionBench, or recovery-routing benchmark. This absence prevents assessment of whether the observed single-domain results actually substantiate the recoverability guarantee.

Authors: The referee is correct that the abstract and evaluation summary lack the quantitative results, error bars, and derivation details. The manuscript presents the evaluation approach but does not include these specifics in the provided sections. We will revise the abstract and main text to incorporate key quantitative findings, tables, and derivations from the design matrix and benchmarks.
revision: yes

Circularity Check

0 steps flagged

No significant circularity in specification-grounded framework

The paper presents ATM as a specification-grounded governance substrate that binds task intent, repository scope, write admission, validation, and evidence obligations into one chain, with the CID broker and virtual atoms as design elements for conservative routing when atom-map coverage is incomplete. No equations, fitted parameters, predictions, or derivation steps appear in the abstract or the described structure. The claims rest on the construction itself plus the listed evaluation evidence (design matrix, benchmarks, adopter study) rather than reducing to their own inputs by construction. No load-bearing self-citations, uniqueness theorems, or smuggled ansatz are referenced. This is a systems design specification, not a predictive derivation chain, so the central claims remain independent of the input patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 3 invented entities

The framework rests on the existence of a single governance domain and the ability to map arbitrary write intents to bounded semantic atoms or virtual atoms; these are introduced without external benchmarks or shipped artifacts.

axioms (1)

domain assumption: A single governance domain exists in which all agents operate and a neutral steward can apply writes.
Stated in the abstract as the setting for all evaluation evidence.

invented entities (3)

CID broker (no independent evidence)
purpose: Shared-mutation admission subsystem that routes write intents.
Core new component introduced to solve the admission problem; no independent evidence supplied.

semantic atoms (no independent evidence)
purpose: Bounded regions for conservative comparison and routing of writes.
Invented mapping unit; no external validation mentioned.

virtual atoms (no independent evidence)
purpose: Temporary auditable governance units when atom-map coverage is incomplete.
Fallback mechanism introduced to handle incomplete mappings; no falsifiable handle outside the paper.

pith-pipeline@v0.9.1-grok · 5772 in / 1339 out tokens · 23793 ms · 2026-07-02T20:34:30.902350+00:00 · methodology

Reference graph

Works this paper leans on

44 extracted references · 39 canonical work pages · 16 internal anchors

[1]

Pugachev, Sergey. 2025. ”CodeCRDT: Observation-Driven Coordination for Multi-Agent LLM Code Generation. ” arXiv:2510.18893 [cs.DC]. https://doi.org/10.48550/arXiv.2510.18893

work page doi:10.48550/arxiv.2510.18893 2025
[2]

Acharya, Vivek. 2026. ”Semantic Consensus: Process-Aware Conflict Detection and Resolution for Enterprise Multi-Agent LLM Systems. ” arXiv:2604.16339 [cs.AI].https://doi.org/10.48550/arXiv.2604.16339

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.16339 2026
[3]

Liu, Mengyang, Taozhi Chen, Zhenhua Xu, Xue Jiang, and Yihong Dong. 2026. ”Multi-agent Collaboration with State Management. ” arXiv:2605.20563 [cs.MA]. https://doi.org/10.48550/arXiv.2605.20563. 38

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2605.20563 2026
[4]

Qian, Kaiyang, Xinmin Fang, and Zhengxiong Li. 2026. ”MPAC: A Multi-Principal Agent Coordination Protocol for Interoperable Multi-Agent Collaboration. ” arXiv:2604.09744 [cs.MA].https://doi.org/10.48550/arXiv.2604.09744

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.09744 2026
[5]

Zhou, Weixing, Zhiyou Wang, Zeshun Peng, Hetian Chen, Yanfeng Zhang, and Ge Yu. 2026. ”ATCC: Adaptive Concurrency Control for Unforeseen Agentic Transactions. ” arXiv:2603.13906 [cs.DB]. https://doi.org/10.48550/arXiv.2603.13906

work page doi:10.48550/arxiv.2603.13906 2026
[6]

Agrawal, Shuyi Yang, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Kannan Ramchandran, Dan Klein, Joseph E

Pan, Melissa Z., Mert Cemri, Lakshya A. Agrawal, Shuyi Yang, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Kannan Ramchandran, Dan Klein, Joseph E. Gonzalez, Matei Zaharia, and Ion Stoica. 2025. ”Why Do Multiagent Systems Fail?” In *ICLR 2025 Workshop on Building Trust in Language Models and Applications*. https://openreview.net/forum?...

2025
[7]

Sartori, Camilo Chacon. 2026. ”The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents. ” arXiv:2603.24284 [cs.SE]. https://doi.org/10.48550/arXiv.2603.24284

work page doi:10.48550/arxiv.2603.24284 2026
[8]

Ellis, Clarence A., and Simon J. Gibbs. 1989. ”Concurrency Control in Groupware Systems. ” In *Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data*, 399–407. New York: ACM Press. https://doi.org/10.1145/ 67544.66963

work page arXiv 1989
[9]

Shapiro, Marc, Nuno Preguica, Carlos Baquero, and Marek Zawirski. 2011. ”Conflict-Free Replicated Data Types. ” In *Stabilization, Safety, and Security of Distributed Systems: 13th International Symposium, SSS 2011*, Lecture Notes in Computer Science 6976, 386-400. Berlin: Springer. https://doi.org/10.1007/978-3-642-24550-3_29

work page doi:10.1007/978-3-642-24550-3_29 2011
[10]

T., and John T

Kung, H. T., and John T. Robinson. 1981. ”On Optimistic Methods for Concurrency Control. ” *ACM Transactions on Database Systems* 6 (2): 213-226. https://doi.org/10.1145/319566.319567

work page doi:10.1145/319566.319567 1981
[11]

Lyu, Hongtao, Dingyan Zhang, Mingyu Wu, Xingda Wei, and Haibo Chen. 2026. ”CoAgent: Concurrency Control for Multi-Agent Systems. ” arXiv:2606.15376 [cs.DC].https://doi.org/10.48550/arXiv.2606.15376

work page doi:10.48550/arxiv.2606.15376 2026
[12]

Geng, Jiayi, and Graham Neubig. 2026. ”Effective Strategies for Asynchronous Software Engineering Agents. ” arXiv:2603.21489 [cs.CL]. https://doi.org/10.48550/arXiv.2603.21489

work page doi:10.48550/arxiv.2603.21489 2026
[13]

Zhang, Qingyu, Junzhe Li, Jiayi Lin, Changhua Luo, and Chenxiong Qian. 2026. ”Rover: Context-aware Conflict Resolution with LLM. ” arXiv:2605.17279 [cs.SE].https://doi.org/10.48550/arXiv.2605.17279

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2605.17279 2026
[14]

Ogenrwot, Daniel, and John Businge. 2026. ”AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub. ” arXiv:2604.03551 [cs.SE]. https://doi.org/10.48550/arXiv.2604.03551

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.03551 2026
[15]

Wang, Yifei, Ruiyin Li, Peng Liang, Qiong Feng, Zengyang Li, Mojtaba Shahin, and Arif Ali Khan. 2026. ”CodeTeam: An LLM- Powered Multi-Agent Framework for Repository-Level Code Generation. ” arXiv:2606.22082 [cs.SE]. https://doi.org/10.485 50/arXiv.2606.22082

work page internal anchor Pith review Pith/arXiv arXiv 2026
[16]

Khan, Sajjad. 2026. ”S-Bus: Automatic Read-Set Reconstruction for Multi-Agent LLM State Coordination. ” arXiv:2605.17076 [cs.LG]. https://doi.org/10.48550/arXiv.2605.17076

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2605.17076 2026
[17]

Huang, Beichen, Ran Cheng, and Kay Chen Tan. 2025. ”EvoGit: Decentralized Code Evolution via Git-Based Multi-Agent Collabo- ration. ” arXiv:2506.02049 [cs.SE].https://doi.org/10.48550/arXiv.2506.02049

work page doi:10.48550/arxiv.2506.02049 2025
[18]

Li, Yang, Siqi Ping, Xiyu Chen, Xiaojian Qi, Zigan Wang, Ye Luo, and Xiaowei Zhang. 2025. ”AgentGit: A Version Control Framework for Reliable and Scalable LLM-Powered Multi-Agent Systems. ” arXiv:2511.00628 [cs.SE]. https://doi.org/10.48550/arXiv .2511.00628

work page internal anchor Pith review doi:10.48550/arxiv 2025
[20]

AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation

Wu, Qingyun, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W. White, Doug Burger, and Chi Wang. 2023. ”AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation. ” arXiv:2308.08155 [cs.AI].https://doi.org/10.48550/arXiv.2308.08155

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2308.08155 2023
[21]

Adya, Atul. 1999. ”Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions. ” PhD thesis, Massachusetts Institute of Technology. https://hdl.handle.net/1721.1/149899

1999
[22]

Freedman, Michael Kaminsky, and David G

Lloyd, Wyatt, Michael J. Freedman, Michael Kaminsky, and David G. Andersen. 2011. ”Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS. ” In *Proceedings of the 23rd ACM Symposium on Operating Systems Principles*, 401-416. https://doi.org/10.1145/2043556.2043593

work page doi:10.1145/2043556.2043593 2011
[23]

Liu, Tianyang, Canwen Xu, and Julian McAuley. 2024. ”RepoBench: Benchmarking Repository-Level Code Auto-Completion Sys- tems. ” In *Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)*. https://doi.org/10 .48550/arXiv.2306.03091

work page internal anchor Pith review Pith/arXiv arXiv 2024
[24]

Ding, Yangruibo, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, and Bing Xiang. 2023. ”CrossCodeEval: A Diverse and Multilingual Benchmark for Cross- File Code Completion. ” In *Advances in Neural Information Processing Systems 36*. arXiv:2310.11248. https://doi.o...

work page arXiv 2023
[25]

Li, Wei, Xin Zhang, Zhongxin Guo, Shaoguang Mao, Wen Luo, Guangyue Peng, Yangyu Huang, Houfeng Wang, and Scarlett Li
[26]

Evaluation of large language models for assessing code maintainability,

”FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation. ” In *Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics*, 17160–17176. https://doi.org/10.48550/a rXiv.2503.06680

work page doi:10.48550/a
[27]

Zan, Daoguang, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei Guan, Zhiguang 39 Yang, Yongji Wang, Qianxiang Wang, and Lizhen Cui. 2024. ”CodeS: Natural Language to Code Repository via Multi-Layer Sketch. ” arXiv:2403.16443 [cs.LG]. https://doi.org/10.48550/arXiv.2403.16443

work page doi:10.48550/arxiv.2403.16443 2024
[28]

Ding, Jingzhe, Shengda Long, Changxin Pu, Huan Zhou, Hongwan Gao, Xiang Gao, Chao He, Yue Hou, Fei Hu, Zhaojian Li, Weiran Shi, Zaiyuan Wang, Daoguang Zan, Chenchen Zhang, Xiaoxu Zhang, Qizhi Chen, Xianfu Cheng, Bo Deng, Qingshui Gu, Kai Hua, Juntao Lin, Pai Liu, Mingchen Li, Xuanguang Pan, Zifan Peng, Yujia Qin, Yong Shan, Zhewen Tan, Weihao Xie, Zihan W...

work page arXiv 2025
[29]

Sun, Chengzheng, Xiaohua Jia, Yanchun Zhang, Yun Yang, and David Chen. 1998. ”Achieving Convergence, Causality Preservation, and Intention Preservation in Real-Time Cooperative Editing Systems. ” *ACM Transactions on Computer-Human Interaction* 5 (1): 63-108. https://doi.org/10.1145/274444.274447

work page doi:10.1145/274444.274447 1998
[30]

Chacon, Scott, and Ben Straub. 2014. *Pro Git*, 2nd ed. Apress / Open Source. https://git-scm.com/book

2014
[31]

Bernstein, Philip A., Vassos Hadzilacos, and Nathan Goodman. 1987. *Concurrency Control and Recovery in Database Systems*. Reading, MA: Addison-Wesley. https://www.microsoft.com/en-us/research/people/philbe/book/

1987
[32]

Hou, Xinyi, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, and Haoyu Wang. 2024. ”Large Language Models for Software Engineering: A Systematic Literature Review. ” *ACM Transactions on Software Engineering and Methodology* 33 (8): 1-79. https://doi.org/10.1145/3695988

work page doi:10.1145/3695988 2024
[33]

Chiu, Claire Cardie, Matthias Gallé, and Alexander M

Zhao, Wenting, Nan Jiang, Celine Lee, Justin T. Chiu, Claire Cardie, Matthias Gallé, and Alexander M. Rush. 2025. ”Commit0: Library Generation from Scratch. ” In *Proceedings of the 13th International Conference on Learning Representations (ICLR)*. arXiv:2412.01769 [cs.SE]. https://doi.org/10.48550/arXiv.2412.01769

work page doi:10.48550/arxiv.2412.01769 2025
[34]

Zhou, Qixing, Jiacheng Zhang, Haiyang Wang, Rui Hao, Jiahe Wang, Minghao Han, Yuxue Yang, Shuzhe Wu, Feiyang Pan, Lue Fan, Dandan Tu, and Zhaoxiang Zhang. 2026. ”FeatureBench: Benchmarking Agentic Coding for Complex Feature Development. ” arXiv:2602.10975 [cs.SE]. https://doi.org/10.48550/arXiv.2602.10975

work page doi:10.48550/arxiv.2602.10975 2026
[35]

Ni, Ziyi, Huacan Wang, Shuo Zhang, Shuo Lu, Ziyang He, Wang You, Zhenheng Tang, Yuntao Du, Bill Sun, Hongzhang Liu, Sen Hu, Ronghao Chen, Bo Li, Xin Li, Chen Hu, Binxing Jiao, Daxin Jiang, and Pin Lyu. 2025. ”GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging. ” arXiv:2508.18993 [cs.SE]. https://doi.org/1...

work page arXiv 2025
[36]

SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering

Yang, John, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press. 2024. ”SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering. ” In *Advances in Neural Information Processing Systems 37*. arXiv:2405.15793. https://doi.org/10.48550/arXiv.2405.15793

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2405.15793 2024
[37]

AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents

Wang, Haoyu, Christopher M. Poskitt, and Jun Sun. 2025. ”AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents. ” arXiv:2503.18666 [cs.AI].https://doi.org/10.48550/arXiv.2503.18666

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2503.18666 2025
[38]

Zhao, Wei, Zhe Li, Peixin Zhang, and Jun Sun. 2026. ”ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection. ” arXiv:2604.11790 [cs.CR]. https://doi.org/10.48550/arXiv.2604.11790

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.11790 2026
[39]

Winston, Cailin, Claris Winston, and René Just. 2026. ”Solver-Aided Verification of Policy Compliance in Tool-Augmented LLM Agents. ” arXiv:2603.20449 [cs.SE].https://doi.org/10.48550/arXiv.2603.20449

work page doi:10.48550/arxiv.2603.20449 2026
[40]

Sousa, Marcelo, Isil Dillig, and Shuvendu K. Lahiri. 2018. ”Verifying Semantic Conflict-Freedom in Three-Way Program Merges. ” arXiv:1802.06551 [cs.PL]. https://doi.org/10.48550/arXiv.1802.06551

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1802.06551 2018
[41]

Cavalcanti, Guilherme, Paulo Borba, Leonardo dos Anjos, and Jonatas Clementino. 2024. ”Semistructured Merge with Language- Specific Syntactic Separators. ” arXiv:2407.18888 [cs.SE]. https://doi.org/10.48550/arXiv.2407.18888

work page doi:10.48550/arxiv.2407.18888 2024
[42]

Mohammadi, Bardia, Nearchos Potamitis, Lars Klein, Akhil Arora, and Laurent Bindschaedler. 2026. ”Atomix: Timely, Transactional Tool Use for Reliable Agentic Workflows. ” arXiv:2602.14849 [cs.LG].https://doi.org/10.48550/arXiv.2602.14849

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2602.14849 2026
[43]

Chen, Zheng, Hanqing Liu, Duling Xu, Dong Dong, Jialin Li, Bangzheng Pu, and Jidong Zhai. 2026. ”Cordon: Semantic Transactions for Tool-Using LLM Agents. ” arXiv:2606.17573 [cs.OS].https://doi.org/10.48550/arXiv.2606.17573

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2606.17573 2026
[45]

Mao, Zhenyu, Jacky Keung, Fengji Zhang, Shuo Liu, Yifei Wang, and Jialong Li. 2025. ”Towards Engineering Multi-Agent LLMs: A Protocol-Driven Approach. ” arXiv:2510.12120 [cs.SE].https://doi.org/10.48550/arXiv.2510.12120

work page doi:10.48550/arxiv.2510.12120 2025
[46]

Hou, Bo, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, and Li Zhang. 2025. ”LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning. ” arXiv:2507.16395 [cs.AI]. https://doi.org/10.48550/arXiv.2 507.16395. 40

work page doi:10.48550/arxiv.2 2025

[1] [1]

Pugachev, Sergey. 2025. ”CodeCRDT: Observation-Driven Coordination for Multi-Agent LLM Code Generation. ” arXiv:2510.18893 [cs.DC]. https://doi.org/10.48550/arXiv.2510.18893

work page doi:10.48550/arxiv.2510.18893 2025

[2] [2]

Acharya, Vivek. 2026. ”Semantic Consensus: Process-Aware Conflict Detection and Resolution for Enterprise Multi-Agent LLM Systems. ” arXiv:2604.16339 [cs.AI].https://doi.org/10.48550/arXiv.2604.16339

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.16339 2026

[3] [3]

Liu, Mengyang, Taozhi Chen, Zhenhua Xu, Xue Jiang, and Yihong Dong. 2026. ”Multi-agent Collaboration with State Management. ” arXiv:2605.20563 [cs.MA]. https://doi.org/10.48550/arXiv.2605.20563. 38

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2605.20563 2026

[4] [4]

Qian, Kaiyang, Xinmin Fang, and Zhengxiong Li. 2026. ”MPAC: A Multi-Principal Agent Coordination Protocol for Interoperable Multi-Agent Collaboration. ” arXiv:2604.09744 [cs.MA].https://doi.org/10.48550/arXiv.2604.09744

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2604.09744 2026

[5] [5]

Zhou, Weixing, Zhiyou Wang, Zeshun Peng, Hetian Chen, Yanfeng Zhang, and Ge Yu. 2026. ”ATCC: Adaptive Concurrency Control for Unforeseen Agentic Transactions. ” arXiv:2603.13906 [cs.DB]. https://doi.org/10.48550/arXiv.2603.13906

work page doi:10.48550/arxiv.2603.13906 2026

[6] [6]

Agrawal, Shuyi Yang, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Kannan Ramchandran, Dan Klein, Joseph E

Pan, Melissa Z., Mert Cemri, Lakshya A. Agrawal, Shuyi Yang, Bhavya Chopra, Rishabh Tiwari, Kurt Keutzer, Aditya Parameswaran, Kannan Ramchandran, Dan Klein, Joseph E. Gonzalez, Matei Zaharia, and Ion Stoica. 2025. ”Why Do Multiagent Systems Fail?” In *ICLR 2025 Workshop on Building Trust in Language Models and Applications*. https://openreview.net/forum?...

2025

[7] [7]

Sartori, Camilo Chacon. 2026. ”The Specification Gap: Coordination Failure Under Partial Knowledge in Code Agents. ” arXiv:2603.24284 [cs.SE]. https://doi.org/10.48550/arXiv.2603.24284

work page doi:10.48550/arxiv.2603.24284 2026

[8] [8]

Ellis, Clarence A., and Simon J. Gibbs. 1989. ”Concurrency Control in Groupware Systems. ” In *Proceedings of the 1989 ACM SIGMOD International Conference on Management of Data*, 399–407. New York: ACM Press. https://doi.org/10.1145/ 67544.66963

work page arXiv 1989

[9] [9]

Shapiro, Marc, Nuno Preguica, Carlos Baquero, and Marek Zawirski. 2011. ”Conflict-Free Replicated Data Types. ” In *Stabilization, Safety, and Security of Distributed Systems: 13th International Symposium, SSS 2011*, Lecture Notes in Computer Science 6976, 386-400. Berlin: Springer. https://doi.org/10.1007/978-3-642-24550-3_29

work page doi:10.1007/978-3-642-24550-3_29 2011

[10] [10]

T., and John T

Kung, H. T., and John T. Robinson. 1981. ”On Optimistic Methods for Concurrency Control. ” *ACM Transactions on Database Systems* 6 (2): 213-226. https://doi.org/10.1145/319566.319567

work page doi:10.1145/319566.319567 1981

[11]

Lyu, Hongtao, Dingyan Zhang, Mingyu Wu, Xingda Wei, and Haibo Chen. 2026. “CoAgent: Concurrency Control for Multi-Agent Systems.” arXiv:2606.15376 [cs.DC]. https://doi.org/10.48550/arXiv.2606.15376

[12]

Geng, Jiayi, and Graham Neubig. 2026. “Effective Strategies for Asynchronous Software Engineering Agents.” arXiv:2603.21489 [cs.CL]. https://doi.org/10.48550/arXiv.2603.21489

[13]

Zhang, Qingyu, Junzhe Li, Jiayi Lin, Changhua Luo, and Chenxiong Qian. 2026. “Rover: Context-aware Conflict Resolution with LLM.” arXiv:2605.17279 [cs.SE]. https://doi.org/10.48550/arXiv.2605.17279

[14]

Ogenrwot, Daniel, and John Businge. 2026. “AgenticFlict: A Large-Scale Dataset of Merge Conflicts in AI Coding Agent Pull Requests on GitHub.” arXiv:2604.03551 [cs.SE]. https://doi.org/10.48550/arXiv.2604.03551

[15]

Wang, Yifei, Ruiyin Li, Peng Liang, Qiong Feng, Zengyang Li, Mojtaba Shahin, and Arif Ali Khan. 2026. “CodeTeam: An LLM-Powered Multi-Agent Framework for Repository-Level Code Generation.” arXiv:2606.22082 [cs.SE]. https://doi.org/10.48550/arXiv.2606.22082

[16]

Khan, Sajjad. 2026. “S-Bus: Automatic Read-Set Reconstruction for Multi-Agent LLM State Coordination.” arXiv:2605.17076 [cs.LG]. https://doi.org/10.48550/arXiv.2605.17076

[17]

Huang, Beichen, Ran Cheng, and Kay Chen Tan. 2025. “EvoGit: Decentralized Code Evolution via Git-Based Multi-Agent Collaboration.” arXiv:2506.02049 [cs.SE]. https://doi.org/10.48550/arXiv.2506.02049

[18]

Li, Yang, Siqi Ping, Xiyu Chen, Xiaojian Qi, Zigan Wang, Ye Luo, and Xiaowei Zhang. 2025. “AgentGit: A Version Control Framework for Reliable and Scalable LLM-Powered Multi-Agent Systems.” arXiv:2511.00628 [cs.SE]. https://doi.org/10.48550/arXiv.2511.00628

[19] [20]

Wu, Qingyun, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W. White, Doug Burger, and Chi Wang. 2023. “AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation.” arXiv:2308.08155 [cs.AI]. https://doi.org/10.48550/arXiv.2308.08155

[20] [21]

Adya, Atul. 1999. “Weak Consistency: A Generalized Theory and Optimistic Implementations for Distributed Transactions.” PhD thesis, Massachusetts Institute of Technology. https://hdl.handle.net/1721.1/149899

[21] [22]

Lloyd, Wyatt, Michael J. Freedman, Michael Kaminsky, and David G. Andersen. 2011. “Don’t Settle for Eventual: Scalable Causal Consistency for Wide-Area Storage with COPS.” In *Proceedings of the 23rd ACM Symposium on Operating Systems Principles*, 401–416. https://doi.org/10.1145/2043556.2043593

[22] [23]

Liu, Tianyang, Canwen Xu, and Julian McAuley. 2024. “RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems.” In *Proceedings of the 12th International Conference on Learning Representations (ICLR 2024)*. https://doi.org/10.48550/arXiv.2306.03091

[23] [24]

Ding, Yangruibo, Zijian Wang, Wasi Uddin Ahmad, Hantian Ding, Ming Tan, Nihal Jain, Murali Krishna Ramanathan, Ramesh Nallapati, Parminder Bhatia, Dan Roth, and Bing Xiang. 2023. “CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion.” In *Advances in Neural Information Processing Systems 36*. arXiv:2310.11248. https://doi.o...

[24] [25]

Li, Wei, Xin Zhang, Zhongxin Guo, Shaoguang Mao, Wen Luo, Guangyue Peng, Yangyu Huang, Houfeng Wang, and Scarlett Li. 2025. “FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation.” In *Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics*, 17160–17176. https://doi.org/10.48550/arXiv.2503.06680

[26] [27]

Zan, Daoguang, Ailun Yu, Wei Liu, Dong Chen, Bo Shen, Wei Li, Yafen Yao, Yongshun Gong, Xiaolin Chen, Bei Guan, Zhiguang Yang, Yongji Wang, Qianxiang Wang, and Lizhen Cui. 2024. “CodeS: Natural Language to Code Repository via Multi-Layer Sketch.” arXiv:2403.16443 [cs.LG]. https://doi.org/10.48550/arXiv.2403.16443

[27] [28]

Ding, Jingzhe, Shengda Long, Changxin Pu, Huan Zhou, Hongwan Gao, Xiang Gao, Chao He, Yue Hou, Fei Hu, Zhaojian Li, Weiran Shi, Zaiyuan Wang, Daoguang Zan, Chenchen Zhang, Xiaoxu Zhang, Qizhi Chen, Xianfu Cheng, Bo Deng, Qingshui Gu, Kai Hua, Juntao Lin, Pai Liu, Mingchen Li, Xuanguang Pan, Zifan Peng, Yujia Qin, Yong Shan, Zhewen Tan, Weihao Xie, Zihan W...


[28] [29]

Sun, Chengzheng, Xiaohua Jia, Yanchun Zhang, Yun Yang, and David Chen. 1998. “Achieving Convergence, Causality Preservation, and Intention Preservation in Real-Time Cooperative Editing Systems.” *ACM Transactions on Computer-Human Interaction* 5 (1): 63–108. https://doi.org/10.1145/274444.274447

[29] [30]

Chacon, Scott, and Ben Straub. 2014. *Pro Git*, 2nd ed. Apress / Open Source. https://git-scm.com/book


[30] [31]

Bernstein, Philip A., Vassos Hadzilacos, and Nathan Goodman. 1987. *Concurrency Control and Recovery in Database Systems*. Reading, MA: Addison-Wesley. https://www.microsoft.com/en-us/research/people/philbe/book/


[31] [32]

Hou, Xinyi, Yanjie Zhao, Yue Liu, Zhou Yang, Kailong Wang, Li Li, Xiapu Luo, David Lo, John Grundy, and Haoyu Wang. 2024. “Large Language Models for Software Engineering: A Systematic Literature Review.” *ACM Transactions on Software Engineering and Methodology* 33 (8): 1–79. https://doi.org/10.1145/3695988

[32] [33]

Zhao, Wenting, Nan Jiang, Celine Lee, Justin T. Chiu, Claire Cardie, Matthias Gallé, and Alexander M. Rush. 2025. “Commit0: Library Generation from Scratch.” In *Proceedings of the 13th International Conference on Learning Representations (ICLR)*. arXiv:2412.01769 [cs.SE]. https://doi.org/10.48550/arXiv.2412.01769

[33] [34]

Zhou, Qixing, Jiacheng Zhang, Haiyang Wang, Rui Hao, Jiahe Wang, Minghao Han, Yuxue Yang, Shuzhe Wu, Feiyang Pan, Lue Fan, Dandan Tu, and Zhaoxiang Zhang. 2026. “FeatureBench: Benchmarking Agentic Coding for Complex Feature Development.” arXiv:2602.10975 [cs.SE]. https://doi.org/10.48550/arXiv.2602.10975

[34] [35]

Ni, Ziyi, Huacan Wang, Shuo Zhang, Shuo Lu, Ziyang He, Wang You, Zhenheng Tang, Yuntao Du, Bill Sun, Hongzhang Liu, Sen Hu, Ronghao Chen, Bo Li, Xin Li, Chen Hu, Binxing Jiao, Daxin Jiang, and Pin Lyu. 2025. “GitTaskBench: A Benchmark for Code Agents Solving Real-World Tasks Through Code Repository Leveraging.” arXiv:2508.18993 [cs.SE]. https://doi.org/1...

[35] [36]

Yang, John, Carlos E. Jimenez, Alexander Wettig, Kilian Lieret, Shunyu Yao, Karthik Narasimhan, and Ofir Press. 2024. “SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering.” In *Advances in Neural Information Processing Systems 37*. arXiv:2405.15793. https://doi.org/10.48550/arXiv.2405.15793

[36] [37]

Wang, Haoyu, Christopher M. Poskitt, and Jun Sun. 2025. “AgentSpec: Customizable Runtime Enforcement for Safe and Reliable LLM Agents.” arXiv:2503.18666 [cs.AI]. https://doi.org/10.48550/arXiv.2503.18666

[37] [38]

Zhao, Wei, Zhe Li, Peixin Zhang, and Jun Sun. 2026. “ClawGuard: A Runtime Security Framework for Tool-Augmented LLM Agents Against Indirect Prompt Injection.” arXiv:2604.11790 [cs.CR]. https://doi.org/10.48550/arXiv.2604.11790

[38] [39]

Winston, Cailin, Claris Winston, and René Just. 2026. “Solver-Aided Verification of Policy Compliance in Tool-Augmented LLM Agents.” arXiv:2603.20449 [cs.SE]. https://doi.org/10.48550/arXiv.2603.20449

[39] [40]

Sousa, Marcelo, Isil Dillig, and Shuvendu K. Lahiri. 2018. “Verifying Semantic Conflict-Freedom in Three-Way Program Merges.” arXiv:1802.06551 [cs.PL]. https://doi.org/10.48550/arXiv.1802.06551

[40] [41]

Cavalcanti, Guilherme, Paulo Borba, Leonardo dos Anjos, and Jonatas Clementino. 2024. “Semistructured Merge with Language-Specific Syntactic Separators.” arXiv:2407.18888 [cs.SE]. https://doi.org/10.48550/arXiv.2407.18888

[41] [42]

Mohammadi, Bardia, Nearchos Potamitis, Lars Klein, Akhil Arora, and Laurent Bindschaedler. 2026. “Atomix: Timely, Transactional Tool Use for Reliable Agentic Workflows.” arXiv:2602.14849 [cs.LG]. https://doi.org/10.48550/arXiv.2602.14849

[42] [43]

Chen, Zheng, Hanqing Liu, Duling Xu, Dong Dong, Jialin Li, Bangzheng Pu, and Jidong Zhai. 2026. “Cordon: Semantic Transactions for Tool-Using LLM Agents.” arXiv:2606.17573 [cs.OS]. https://doi.org/10.48550/arXiv.2606.17573

[43] [45]

Mao, Zhenyu, Jacky Keung, Fengji Zhang, Shuo Liu, Yifei Wang, and Jialong Li. 2025. “Towards Engineering Multi-Agent LLMs: A Protocol-Driven Approach.” arXiv:2510.12120 [cs.SE]. https://doi.org/10.48550/arXiv.2510.12120

[44] [46]

Hou, Bo, Xin Tan, Kai Zheng, Fang Liu, Yinghao Zhu, and Li Zhang. 2025. “LLM-Driven Collaborative Model for Untangling Commits via Explicit and Implicit Dependency Reasoning.” arXiv:2507.16395 [cs.AI]. https://doi.org/10.48550/arXiv.2507.16395