GoCoMA: Hyperbolic Multimodal Representation Fusion for Large Language Model-Generated Code Attribution
Recognition: 2 Lean theorem links
Pith reviewed 2026-05-15 00:36 UTC · model grok-4.3
The pith
GoCoMA fuses stylometric code features and binary artifact images in hyperbolic space to attribute LLM sources more accurately.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
GoCoMA projects modality embeddings into a hyperbolic Poincaré ball, fuses them via a geodesic-cosine similarity-based cross-modal attention (GCSA) fusion mechanism, and back-projects the fused representation to Euclidean space for final LLM-source attribution, consistently outperforming unimodal and Euclidean multimodal baselines on the CoDET-M4 and LLMAuthorBench benchmarks.
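The review describes the projection and back-projection steps only at a high level. As a minimal sketch (not the authors' implementation), the standard exponential and logarithmic maps at the origin of a curvature-c Poincaré ball, as in hyperbolic neural networks [38], would carry embeddings into and out of hyperbolic space:

```python
import numpy as np

def exp_map_origin(v, c=1.0, eps=1e-9):
    """Project a Euclidean embedding v into the Poincare ball of
    curvature -c via the exponential map at the origin."""
    sqrt_c = np.sqrt(c)
    norm = np.linalg.norm(v) + eps
    return np.tanh(sqrt_c * norm) * v / (sqrt_c * norm)

def log_map_origin(y, c=1.0, eps=1e-9):
    """Back-project a point y of the Poincare ball to the Euclidean
    tangent space at the origin via the logarithmic map."""
    sqrt_c = np.sqrt(c)
    norm = np.linalg.norm(y) + eps
    return np.arctanh(np.clip(sqrt_c * norm, 0.0, 1.0 - eps)) * y / (sqrt_c * norm)

v = np.array([0.3, -1.2, 0.8])           # a toy modality embedding
h = exp_map_origin(v)                    # lies strictly inside the unit ball
assert np.linalg.norm(h) < 1.0
assert np.allclose(log_map_origin(h), v, atol=1e-5)  # round trip recovers v
```

The round-trip property (log after exp at the origin is the identity) is what makes "back-projects the fused representation to Euclidean space" well defined.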
What carries the argument
Geodesic-cosine similarity-based cross-modal attention (GCSA) for fusing hyperbolic embeddings of code stylometry and binary pre-executable artifacts.
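The GCSA mechanism is not specified beyond its name and the geodesic distance it relies on. A hedged sketch, assuming the standard Möbius addition, the Poincaré distance d_c(x, y) = (2/√c) tanh⁻¹(√c ‖(−x) ⊕_c y‖), and softmax attention over negative distances (the temperature `tau` and the softmax form are assumptions, not the paper's definition):

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition x (+)_c y on the Poincare ball of curvature -c."""
    xy = np.dot(x, y)
    x2, y2 = np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

def geodesic_dist(x, y, c=1.0):
    """d_c(x, y) = (2 / sqrt(c)) * artanh(sqrt(c) * ||(-x) (+)_c y||)."""
    sqrt_c = np.sqrt(c)
    return (2 / sqrt_c) * np.arctanh(sqrt_c * np.linalg.norm(mobius_add(-x, y, c)))

def geodesic_attention(query, keys, tau=1.0):
    """Hypothetical attention weights: softmax over negative geodesic
    distances, so nearer keys in hyperbolic space get more weight."""
    scores = np.array([-geodesic_dist(query, k) / tau for k in keys])
    w = np.exp(scores - scores.max())
    return w / w.sum()

x, y = np.array([0.1, 0.2]), np.array([-0.3, 0.4])
assert geodesic_dist(x, x) < 1e-9                              # d(x, x) = 0
assert abs(geodesic_dist(x, y) - geodesic_dist(y, x)) < 1e-9   # symmetric
```

This is only one plausible instantiation; the paper's fusion may combine the geodesic term with a cosine term in a way the review text does not reveal.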
Load-bearing premise
Code stylometry occupies a higher extrinsic level than binary pre-executable artifact (BPEA) images in a hierarchy that hyperbolic geometry can exploit to improve attribution.
What would settle it
Showing, on the same benchmarks, that performance fails to improve (or degrades) when the hyperbolic projection and GCSA fusion replace Euclidean alternatives would disprove the claimed central benefit.
Original abstract
Large Language Models (LLMs) trained on massive code corpora are now increasingly capable of generating code that is hard to distinguish from human-written code. This raises practical concerns, including security vulnerabilities and licensing ambiguity, and also motivates a forensic question: 'Who (or which LLM) wrote this piece of code?' We present GoCoMA, a multimodal framework that models an extrinsic hierarchy between (i) code stylometry, capturing higher-level structural and stylistic signatures, and (ii) image representations of binary pre-executable artifacts (BPEA), capturing lower-level, execution-oriented byte semantics shaped by compilation and toolchains. GoCoMA projects modality embeddings into a hyperbolic Poincaré ball, fuses them via a geodesic-cosine similarity-based cross-modal attention (GCSA) fusion mechanism, and back-projects the fused representation to Euclidean space for final LLM-source attribution. Experiments on two open-source benchmarks (CoDET-M4 and LLMAuthorBench) show that GoCoMA consistently outperforms unimodal and Euclidean multimodal baselines under identical evaluation protocols.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces GoCoMA, a multimodal framework for attributing code to its LLM source (or human). It explicitly models an extrinsic hierarchy with code stylometry as higher-level and binary pre-executable artifact (BPEA) images as lower-level, projects both modality embeddings into a hyperbolic Poincaré ball, fuses them via geodesic-cosine similarity-based cross-modal attention (GCSA), and back-projects the result to Euclidean space for final classification. Experiments on the CoDET-M4 and LLMAuthorBench benchmarks are reported to show consistent gains over unimodal and Euclidean multimodal baselines under identical protocols.
Significance. If the reported gains are reproducible, the work would demonstrate a concrete use of hyperbolic geometry to capture hierarchical structure in code stylometry and execution artifacts, offering a new direction for forensic attribution of LLM-generated code. The GCSA fusion mechanism is a specific technical contribution that could be tested in related multimodal settings.
major comments (2)
- [§4] §4 (Experiments): the abstract states that GoCoMA 'consistently outperforms' baselines on CoDET-M4 and LLMAuthorBench, yet no data splits, exact metrics, statistical significance tests, or ablation results are referenced. Without these, the central empirical claim cannot be verified and remains load-bearing for the paper's contribution.
- [§3.2] §3.2 (Hyperbolic Projection and Hierarchy): the extrinsic hierarchy between stylometry (higher) and BPEA images (lower) is asserted without supporting evidence, prior literature, or sensitivity analysis. Because the choice of Poincaré ball and geodesic-cosine fusion rests directly on this hierarchy, its justification is required for the modeling approach to be defensible.
minor comments (2)
- [§3.3] Notation for the fused representation after back-projection should be defined once and used consistently; currently the transition from hyperbolic to Euclidean space is described only at a high level.
- [§2] Add a short related-work paragraph contrasting GCSA with existing hyperbolic attention mechanisms (e.g., those based on Möbius operations) to clarify novelty.
Simulated Author's Rebuttal
We thank the referee for the constructive comments on our manuscript. We address each major point below and will incorporate revisions to improve clarity and verifiability of the results and modeling choices.
Point-by-point responses
-
Referee: [§4] §4 (Experiments): the abstract states that GoCoMA 'consistently outperforms' baselines on CoDET-M4 and LLMAuthorBench, yet no data splits, exact metrics, statistical significance tests, or ablation results are referenced. Without these, the central empirical claim cannot be verified and remains load-bearing for the paper's contribution.
Authors: We agree that the current presentation of experimental details is insufficient for full verification. In the revised manuscript we will expand §4 with explicit descriptions of the train/validation/test splits for both CoDET-M4 and LLMAuthorBench, report exact metric values (accuracy, macro-F1, etc.) together with standard deviations over multiple runs, include results of statistical significance tests (paired t-tests with p-values) against all baselines, and provide comprehensive ablation tables isolating the contributions of hyperbolic projection, GCSA, and the hierarchy. These additions will directly support the reproducibility of the reported gains. revision: yes
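The promised paired t-tests are easy to run once per-seed scores exist on both sides. A minimal sketch with hypothetical macro-F1 values (the numbers below are illustrative placeholders, not the paper's results):

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(scores_a, scores_b):
    """Paired t-statistic over matched per-seed metric scores."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

# Hypothetical macro-F1 over five seeds: GoCoMA vs. a Euclidean baseline.
gocoma   = [0.912, 0.905, 0.918, 0.909, 0.914]
baseline = [0.884, 0.879, 0.890, 0.881, 0.886]
t = paired_t(gocoma, baseline)
# Compare t against the t-distribution with n-1 degrees of freedom;
# scipy.stats.ttest_rel returns the p-value directly.
```

With n = 5 paired runs, the two-sided 5% critical value at 4 degrees of freedom is 2.776, so any |t| above that would support the significance claim.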
-
Referee: [§3.2] §3.2 (Hyperbolic Projection and Hierarchy): the extrinsic hierarchy between stylometry (higher) and BPEA images (lower) is asserted without supporting evidence, prior literature, or sensitivity analysis. Because the choice of Poincaré ball and geodesic-cosine fusion rests directly on this hierarchy, its justification is required for the modeling approach to be defensible.
Authors: We acknowledge that the hierarchy motivation requires stronger grounding. In the revision we will add citations to prior literature on hierarchical multimodal representations in code analysis and vision-language models that similarly assign higher-level semantic features to one modality and lower-level execution or pixel features to another. We will also insert a sensitivity analysis (new subsection or appendix) that swaps the modality levels and reports the resulting performance drop, thereby demonstrating that the chosen hierarchy is empirically beneficial rather than arbitrary. revision: yes
Circularity Check
No significant circularity detected
Full rationale
The paper describes a multimodal fusion approach using hyperbolic projections and geodesic-cosine attention for code attribution, with claims of outperformance on CoDET-M4 and LLMAuthorBench benchmarks under identical protocols. No equations, derivations, or modeling steps are provided in the available text that reduce any prediction or result to a fitted parameter, self-definition, or self-citation chain by construction. The hierarchy between stylometry and BPEA images is presented as an explicit modeling assumption rather than derived from prior results, and the central performance claims rest on empirical comparisons without load-bearing internal reductions. The derivation chain is therefore self-contained against external benchmarks.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
projects modality embeddings into a hyperbolic Poincaré ball, fuses them via a geodesic-cosine similarity-based cross-modal attention (GCSA) fusion mechanism... curvature-consistent Möbius linear maps... geodesic distance d_c(x, y) = (2/√c) tanh⁻¹(√c ‖(−x) ⊕_c y‖)
-
IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean · costAlphaLog_high_calibrated_iff · echoes
ECHOES: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
models an extrinsic hierarchy between (i) code stylometry, capturing higher-level structural and stylistic signatures, and (ii) image representations of binary pre-executable artifacts (BPEA), capturing lower-level, execution-oriented byte semantics
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Joseph Spracklen et al., “We have a package for you! A comprehensive analysis of package hallucinations by code generating LLMs,” in 34th USENIX Security Symposium (USENIX Security 25), 2025, pp. 3687–3706.
- [2] Qiuhai Zeng et al., “An analyst-inspector framework for evaluating reproducibility of LLMs in data science,” arXiv preprint arXiv:2502.16395, 2025.
- [3] Le An, Ons Mlouki, Foutse Khomh, and Giuliano Antoniol, “Stack Overflow: A code laundering platform?,” in 24th International Conference on Software Analysis, Evolution, and Reengineering (SANER), Klagenfurt, Austria, 2017, pp. 282–292, IEEE.
- [4] Andy Liu et al., “Reassessing code authorship attribution in the era of language models,” in 32nd USENIX Security Symposium (USENIX Security 23), 2023, pp. 257–274, USENIX Association.
- [5] Tianyi Zhang et al., “A first look at license compliance capability of LLMs in code generation,” arXiv preprint arXiv:2408.02487, 2024.
- [6] Andrea Gurioli, Maurizio Gabbrielli, and Stefano Zacchiroli, “Is this you, LLM? Recognizing AI-written programs with multilingual code stylometry,” in IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER 2025), Montréal, Canada, Mar. 2025.
- [7] Aylin Caliskan, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt, and Arvind Narayanan, “When coding style survives compilation: De-anonymizing programmers from executable binaries,” in Proceedings of the Network and Distributed System Security Symposium (NDSS), 2018.
- [8] Vaibhav Kalgutkar, Rupinder Kaur, Hernán Gonzalez, Natalia Stakhanova, and Anita Matyukhina, “Code authorship attribution: Methods and challenges,” ACM Computing Surveys, vol. 52, no. 1, pp. 1–36, 2020.
- [9] Evgeny Bogomolov, Vladyslav Kovalenko, Yaroslav Rebryk, Alberto Bacchelli, and Timofey Bryksin, “Authorship attribution of source code: A language-agnostic approach and applicability in software engineering,” in Proceedings of the 29th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (..., 2021.
- [10] José Cambronero, Francisco J. Rodríguez, Laura Moreno, Daniel M. German, and Premkumar Devanbu, “Reducing the impact of time evolution on source code authorship attribution via unsupervised data augmentation,” Proceedings of the ACM on Software Engineering, vol. 1, no. FSE, pp. 1–25, 2024.
- [11] Alberto Ferrante et al., “Spotting the malicious moment: Characterizing malware behavior using dynamic features,” in 2016 11th International Conference on Availability, Reliability and Security (ARES), 2016, pp. 372–381.
- [12] Nathan Rosenblum, Xiaojin Zhu, and Barton P. Miller, “Who wrote this code? Identifying the authors of program binaries,” in European Symposium on Research in Computer Security, Springer, 2011, pp. 172–189.
- [13] Qige Song, Yongzheng Zhang, Linshu Ouyang, and Yige Chen, “BinMLM: Binary authorship verification with flow-aware mixture-of-shared language model,” in 2022 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), IEEE, 2022, pp. 1023–1033.
- [14] Timothy Paek and Chilukuri Mohan, “Detection of LLM-generated Java code using discretized nested bigrams,” in International Conference on Computational Science and Computational Intelligence, Springer, 2024, pp. 118–132.
- [15] Caliskan-Islam et al., “De-anonymizing programmers via code stylometry,” in 24th USENIX Security Symposium (USENIX Security 15), 2015, pp. 255–270.
- [16] Aylin Caliskan, Fabian Yamaguchi, Edwin Dauber, Richard Harang, Konrad Rieck, Rachel Greenstadt, and Arvind Narayanan, “When coding style survives compilation: De-anonymizing programmers from executable binaries,” arXiv preprint arXiv:1512.08546, 2015.
- [17] Uri Alon, Meital Zilberstein, Omer Levy, and Eran Yahav, “code2vec: Learning distributed representations of code,” Proceedings of the ACM on Programming Languages, vol. 3, no. POPL, pp. 1–29, 2019.
- [18] Ningfei Wang et al., “Integration of static and dynamic code stylometry analysis for programmer de-anonymization,” in Proceedings of the 11th ACM Workshop on Artificial Intelligence and Security, 2018, pp. 74–84.
- [19] David Álvarez-Fidalgo and Francisco Ortin, “CLAVE: A deep learning model for source code authorship verification with contrastive learning and transformer encoders,” Information Processing & Management, vol. 62, no. 3, pp. 104005, 2025.
- [20] Bander Alsulami, Edwin Dauber, Richard Harang, Spiros Mancoridis, and Rachel Greenstadt, “Source code authorship attribution using long short-term memory based networks,” in European Symposium on Research in Computer Security, Springer, 2017, pp. 65–82.
- [21] Mohammed Abuhamad, Tamer Abuhmed, David Mohaisen, and Daehun Nyang, “Large-scale and robust code authorship identification with deep feature learning,” ACM Transactions on Privacy and Security (TOPS), vol. 24, no. 4, pp. 1–35, 2021.
- [22] Yehonatan Bitton et al., “Detecting stylistic fingerprints of large language models,” arXiv preprint arXiv:2503.01659, 2025.
- [23] Tamas Bisztray et al., “I know which LLM wrote your code last summer: LLM generated code stylometry for authorship attribution,” arXiv preprint arXiv:2506.17323, 2025.
- [24] Jungin Kim, Shinwoo Park, and Yo-Sub Han, “Marking code without breaking it: Code watermarking for detecting LLM-generated code,” arXiv preprint arXiv:2502.18851, 2025.
- [25] Wei Li, Borui Yang, Yujie Sun, Suyu Chen, Yuting Chen, and Liyao Xiang, “CodeMark: Contextual and natural watermarking for tracing code snippet provenance,” IEEE Transactions on Dependable and Secure Computing, 2025.
- [26] John Kirchenbauer, Jonas Geiping, Yuxin Wen, Jonathan Katz, Ian Miers, and Tom Goldstein, “A watermark for large language models,” in International Conference on Machine Learning, PMLR, 2023, pp. 17061–17084.
- [27] Saksham Rastogi and Danish Pruthi, “Revisiting the robustness of watermarking to paraphrasing attacks,” arXiv preprint arXiv:2411.05277, 2024.
- [28] Ruibo Chen, Yihan Wu, Junfeng Guo, and Heng Huang, “De-mark: Watermark removal in large language models,” arXiv preprint arXiv:2410.13808, 2024.
- [29] Yue Wang, Hung Le, Akhilesh Deepak Gotmare, Nghi D. Q. Bui, Junnan Li, and Steven C. H. Hoi, “CodeT5+: Open code large language models for code understanding and generation,” arXiv preprint arXiv:2305.07922, 2023.
- [30] Binyuan Hui, Jian Yang, Zeyu Cui, Jiaxi Yang, Dayiheng Liu, Lei Zhang, Tianyu Liu, Jiajun Zhang, Bowen Yu, Keming Lu, et al., “Qwen2.5-Coder technical report,” arXiv preprint arXiv:2409.12186, 2024.
- [31] Daya Guo, Shuai Lu, Nan Duan, Yanlin Wang, Ming Zhou, and Jian Yin, “UniXcoder: Unified cross-modal pre-training for code representation,” arXiv preprint arXiv:2203.03850, 2022.
- [32] Zhangyin Feng, Daya Guo, Duyu Tang, Nan Duan, Xiaocheng Feng, Ming Gong, Linjun Shou, Bing Qin, Ting Liu, Daxin Jiang, et al., “CodeBERT: A pre-trained model for programming and natural languages,” arXiv preprint arXiv:2002.08155, 2020.
- [33] Zhuang Liu, Hanzi Mao, Chao-Yuan Wu, Christoph Feichtenhofer, Trevor Darrell, and Saining Xie, “A ConvNet for the 2020s,” Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 11976–11986, 2022.
- [34] Mingxing Tan and Quoc V. Le, “EfficientNetV2: Smaller models and faster training,” in Proceedings of the 38th International Conference on Machine Learning (ICML), 2021, pp. 10096–10106.
- [35] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby, “An image is worth 16x16 words: Transformers for image recognition at scale,” in International Conference on Learning Representations (ICLR), 2021.
- [36] Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan C. Bovik, and Anish Mittal, “MaxViT: Multi-axis vision transformer,” in Proceedings of the European Conference on Computer Vision (ECCV), 2022, pp. 459–479.
- [37] Maximillian Nickel and Douwe Kiela, “Poincaré embeddings for learning hierarchical representations,” Advances in Neural Information Processing Systems, vol. 30, 2017.
- [38] Octavian Ganea, Gary Bécigneul, and Thomas Hofmann, “Hyperbolic neural networks,” Advances in Neural Information Processing Systems, vol. 31, 2018.
- [39] Orchid Chetia Phukan, Mohd Mujtaba Akhtar, Swarup Ranjan Behera, Pailla Balakrishna Reddy, Arun Balaji Buduru, Rajesh Sharma, et al., “HYFuse: Aligning heterogeneous speech pre-trained representations in hyperbolic space for speech emotion recognition,” arXiv preprint arXiv:2506.03403, 2025.
- [40] Daniil Orel et al., “CoDet-M4: Detecting machine-generated code in multi-lingual, multi-generator and multi-domain settings,” in Findings of the Association for Computational Linguistics: ACL 2025, Wanxiang Che, Joyce Nabende, Ekaterina Shutova, and Mohammad Taher Pilehvar, Eds., Vienna, Austria, July 2025, pp. 10570–10593, Association for Computational Li...
- [41] Scott Freitas, Rahul Duggal, and Duen Horng Chau, “MalNet: A large-scale image database of malicious software,” in Proceedings of the 31st ACM International Conference on Information & Knowledge Management, 2022, pp. 3948–3952.