GoAT-X: A Graph of Auditing Thoughts for Securing Token Transactions in Cross-Chain Contracts
Pith reviewed 2026-05-08 02:43 UTC · model grok-4.3
The pith
GoAT-X structures the audit of cross-chain smart contracts as a Graph of Auditing Thoughts that anchors LLM reasoning to static data flows to detect semantic vulnerabilities.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GoAT-X models the auditing process as a Graph of Auditing Thoughts that explicitly mirrors human expert decomposition of security logic across multi-contract dependencies. By anchoring every reasoning step in statically extracted data flows and directly linking abstract security properties to concrete code implementations, the framework confines semantic exploration to well-defined structural and state boundaries. Within those boundaries it treats missing constraints and adversarial bypass paths in cross-chain token logic as primary targets and dynamically explores reasoning paths to locate exploitable semantic gaps.
What carries the argument
The Graph of Auditing Thoughts, a structure that decomposes security properties, links them to static data flows, and constrains LLM reasoning inside explicit structural and state boundaries to expose missing constraints in cross-chain logic.
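The anchoring idea can be illustrated with a minimal sketch. This summary does not give the paper's data model, so every name below (`DataFlow`, `Thought`, `unanchored`, and the toy `Bridge` contract with its field names) is hypothetical. The sketch assumes one simple rule: a leaf reasoning step is admissible only if it cites at least one data flow, and every flow it cites was produced by the static extractor.

```python
from dataclasses import dataclass, field


# Hypothetical types: not taken from the paper.
@dataclass(frozen=True)
class DataFlow:
    """A statically extracted flow of a value from source to sink."""
    contract: str
    function: str
    source: str
    sink: str


@dataclass
class Thought:
    """One auditing step: a claim, the flows it cites, and its sub-steps."""
    claim: str
    anchors: list[DataFlow] = field(default_factory=list)
    children: list["Thought"] = field(default_factory=list)


def unanchored(root: Thought, extracted: set[DataFlow]) -> list[str]:
    """Return claims of leaf thoughts that are not backed by extracted flows."""
    rejected = []
    stack = [root]
    while stack:
        node = stack.pop()
        if node.children:
            # Inner nodes only decompose a property; descend into sub-steps.
            stack.extend(node.children)
        elif not node.anchors or any(a not in extracted for a in node.anchors):
            rejected.append(node.claim)
    return rejected


# Toy example with invented contract and field names.
flow = DataFlow("Bridge", "withdraw", "msg.amount", "token.transfer")
root = Thought(
    "Withdrawals release only verified deposits",
    children=[
        Thought("withdraw amount reaches token.transfer", anchors=[flow]),
        Thought("signature is checked before transfer"),  # cites no flow
    ],
)
print(unanchored(root, {flow}))  # ['signature is checked before transfer']
```

In this toy rule, a rejected leaf is either a hallucinated step or a sign that the static extractor missed a relevant flow; the sketch cannot distinguish the two, which is the gap the load-bearing premise has to cover.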
Load-bearing premise
Anchoring LLM reasoning in statically extracted data flows and linking abstract security properties to concrete code implementations is enough to keep semantic reasoning inside structural boundaries and prevent hallucinations while finding all exploitable gaps.
What would settle it
Apply GoAT-X to a set of cross-chain contracts containing a newly discovered semantic vulnerability absent from the original benchmark, and check whether it pinpoints that exact gap without reporting hallucinated paths and without missing the vulnerability.
Original abstract
Cross-chain bridges, the critical infrastructure of the multi-chain ecosystem, have become a primary target for attackers, resulting in over $2.8 billion in losses due to subtle implementation flaws. Existing defenses, such as bytecode-level static analysis, are ill-equipped to handle the semantic complexity of cross-chain interactions, while LLM-based approaches, which can understand source code, struggle with hallucinatory reasoning over complex, multi-contract dependencies. In this paper, we propose GoAT-X, a framework that shifts automated auditing of cross-chain smart contract codebases from heuristic pattern matching toward systematic first-principles verification. GoAT-X structures the audit process as a Graph of Auditing Thoughts, explicitly mirroring how human experts decompose, reason about, and validate security logic. By anchoring LLM reasoning in statically extracted data flows and explicitly linking abstract security properties to concrete code implementations, the framework constrains semantic reasoning within well-defined structural and state boundaries. Within this constrained space, GoAT-X treats missing constraints and adversarial bypass paths in cross-chain logic as first-class vulnerability targets and dynamically explores reasoning paths to identify exploitable semantic gaps. We evaluate GoAT-X on a comprehensive benchmark covering all known cross-chain token transaction attacks. GoAT-X achieves 92% recall on fine-grained audit points and 95% coverage of vulnerable projects, while identifying 117 confirmed risks in the wild with low operational cost, establishing a new standard for scalable, logic-driven cross-chain security.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes GoAT-X, a framework that structures cross-chain smart contract auditing as a Graph of Auditing Thoughts. It anchors LLM reasoning in statically extracted data flows and explicit links from abstract security properties to concrete code, treating missing constraints and bypass paths as first-class targets. The central claim is that this constrained reasoning identifies exploitable semantic gaps in cross-chain token logic more reliably than pure static analysis or unconstrained LLMs. Evaluation on a benchmark of known attacks reports 92% recall on fine-grained audit points, 95% coverage of vulnerable projects, and 117 confirmed risks discovered in the wild at low cost.
Significance. If the evaluation methodology and static-extraction assumptions hold, GoAT-X would represent a meaningful advance in scalable, logic-driven auditing of cross-chain bridges, directly addressing the $2.8B loss history by moving beyond heuristic pattern matching. The combination of static grounding with structured LLM exploration is a promising direction, but its impact depends on whether the reported recall and risk counts are reproducible and generalizable beyond the evaluated set.
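The two headline metrics measure different things, and neither bounds false positives. The sketch below makes that concrete on invented toy data. The summary does not define "coverage", so the code assumes it means the share of vulnerable projects in which at least one ground-truth audit point was flagged; the function names and the project and audit-point labels are all hypothetical.

```python
def audit_point_recall(found: dict[str, set[str]], truth: dict[str, set[str]]) -> float:
    """Fraction of all ground-truth audit points that were flagged."""
    total = sum(len(points) for points in truth.values())
    hits = sum(len(found.get(p, set()) & points) for p, points in truth.items())
    return hits / total


def project_coverage(found: dict[str, set[str]], truth: dict[str, set[str]]) -> float:
    """Assumed definition: share of vulnerable projects with at least one true hit."""
    vulnerable = [p for p, points in truth.items() if points]
    covered = sum(1 for p in vulnerable if found.get(p, set()) & truth[p])
    return covered / len(vulnerable)


def precision(found: dict[str, set[str]], truth: dict[str, set[str]]) -> float:
    """Fraction of flagged points that are real; not reported in the abstract."""
    flagged = sum(len(points) for points in found.values())
    hits = sum(len(points & truth.get(p, set())) for p, points in found.items())
    return hits / flagged


# Invented toy data: three projects, five true audit points, one false alarm ("x").
truth = {"A": {"a1", "a2"}, "B": {"b1"}, "C": {"c1", "c2"}}
found = {"A": {"a1", "x"}, "B": {"b1"}, "C": set()}

print(audit_point_recall(found, truth))  # 0.4
print(project_coverage(found, truth))
print(precision(found, truth))
```

On this toy data, project coverage (2 of 3) is well above point-level recall (2 of 5), and the false alarm lowers only precision, which is why the first major comment below asks for false-positive controls.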
major comments (3)
- [Abstract / Evaluation] Abstract and Evaluation section: the reported 92% recall, 95% coverage, and 117 confirmed risks are presented without any description of benchmark construction, ground-truth labeling process, risk-confirmation criteria, or statistical controls for false positives. This absence prevents assessment of whether the metrics support the claim that GoAT-X outperforms prior approaches.
- [Framework / Abstract] Framework description (implicit in abstract): the central assumption that statically extracted intra-contract data flows plus property-to-code linking sufficiently bound LLM reasoning to find all exploitable semantic gaps is load-bearing for the 92% recall claim. Cross-chain token logic routinely involves inter-chain message formats, oracle attestations, and multi-hop state transitions that are not resident in any single contract's bytecode or CFG; if the static extractor omits these, the graph leaves the LLM unconstrained on precisely the distributed aspects that have caused past exploits.
- [Abstract] Abstract: the claim that GoAT-X 'establishes a new standard' rests on the 117 confirmed risks and low operational cost, yet no comparison table or baseline numbers (e.g., against existing static tools or other LLM auditors) are referenced, making the 'new standard' assertion unsupported by the provided evidence.
minor comments (2)
- [Abstract] The abstract uses 'Graph of Auditing Thoughts' without an initial definition or acronym expansion on first use.
- [Abstract] No mention of limitations, failure modes, or scope restrictions (e.g., which cross-chain protocols or token standards are covered).
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, with revisions made to improve methodological transparency, framework clarity, and evidential support for our claims.
- Referee: [Abstract / Evaluation] Abstract and Evaluation section: the reported 92% recall, 95% coverage, and 117 confirmed risks are presented without any description of benchmark construction, ground-truth labeling process, risk-confirmation criteria, or statistical controls for false positives. This absence prevents assessment of whether the metrics support the claim that GoAT-X outperforms prior approaches.
  Authors: We agree that additional methodological detail is necessary for reproducibility and assessment. In the revised manuscript, the Evaluation section now includes a dedicated subsection on benchmark construction. It describes: sourcing from public exploit databases and CVE reports covering all documented cross-chain token attacks; ground-truth labeling via independent review by two security experts with inter-rater reliability metrics; risk-confirmation criteria requiring either testnet reproduction or exact match to known exploit vectors; and statistical controls including precision-recall analysis and false-positive bounding via manual sampling. These changes directly enable evaluation of the reported metrics.
  Revision: yes
- Referee: [Framework / Abstract] Framework description (implicit in abstract): the central assumption that statically extracted intra-contract data flows plus property-to-code linking sufficiently bound LLM reasoning to find all exploitable semantic gaps is load-bearing for the 92% recall claim. Cross-chain token logic routinely involves inter-chain message formats, oracle attestations, and multi-hop state transitions that are not resident in any single contract's bytecode or CFG; if the static extractor omits these, the graph leaves the LLM unconstrained on precisely the distributed aspects that have caused past exploits.
  Authors: The concern about scope is substantive. GoAT-X's static extractor parses standard cross-chain interfaces (e.g., bridge deposit/withdraw events, message-passing ABIs) in addition to intra-contract flows, and the auditing graph explicitly models inter-chain state transitions and oracle dependencies as first-class nodes linked to security properties such as atomicity and consistency. We have revised the Framework section to include a concrete example tracing a multi-hop exploit path through these elements, showing how the LLM remains constrained on distributed aspects. This clarification preserves the core assumption while addressing the distributed nature of exploits.
  Revision: partial
- Referee: [Abstract] Abstract: the claim that GoAT-X 'establishes a new standard' rests on the 117 confirmed risks and low operational cost, yet no comparison table or baseline numbers (e.g., against existing static tools or other LLM auditors) are referenced, making the 'new standard' assertion unsupported by the provided evidence.
  Authors: We acknowledge that the abstract claim requires explicit supporting evidence. The revised manuscript adds a comparison table in the Evaluation section benchmarking GoAT-X against representative static tools (Slither, Mythril) and LLM auditors on the identical benchmark, reporting recall (92% vs. 45-68% baselines), coverage, and cost (API calls and runtime). The abstract has been updated to state that GoAT-X 'advances the standard' with these quantified improvements, removing the stronger phrasing until broader validation.
  Revision: yes
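The first response promises inter-rater reliability metrics for the two expert labelers without naming one. Cohen's kappa is a common choice for two raters and is assumed here purely for illustration; the rebuttal does not say which statistic the authors use, and the labels below are invented.

```python
from collections import Counter


def cohens_kappa(rater_a: list[str], rater_b: list[str]) -> float:
    """Cohen's kappa: agreement between two raters, corrected for chance."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty item list")
    n = len(rater_a)
    # Observed agreement: fraction of items given the same label.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = count_a.keys() | count_b.keys()
    expected = sum(count_a[k] * count_b[k] for k in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


# Invented labels for six audit points from two hypothetical experts.
expert_1 = ["vuln", "vuln", "safe", "safe", "vuln", "safe"]
expert_2 = ["vuln", "safe", "safe", "safe", "vuln", "safe"]
print(round(cohens_kappa(expert_1, expert_2), 3))  # 0.667
```

Here the experts agree on five of six items (observed agreement 0.833), but half of that agreement is expected by chance, so kappa is 0.667. Reporting a chance-corrected figure alongside raw agreement would let readers judge how solid the ground truth behind the 92% recall is.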
Circularity Check
No circularity: empirical framework with external benchmarks
The paper introduces GoAT-X as a graph-structured auditing process that anchors LLM reasoning in statically extracted data flows and evaluates it on a benchmark of known cross-chain attacks plus 117 real-world risks. No derivation chain reduces to self-definition, fitted parameters renamed as predictions, or load-bearing self-citations. The central claims rest on reported recall/coverage metrics against external test cases rather than tautological re-labeling of inputs. Static extraction and property-to-code linking are presented as independent grounding steps, not as outputs derived from the final results.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Statically extracted data flows can sufficiently represent the semantic dependencies in cross-chain interactions.
- domain assumption: LLM-based reasoning within these boundaries can identify missing constraints and adversarial bypass paths without hallucination.
invented entities (1)
- Graph of Auditing Thoughts: no independent evidence