LLMSniffer: Detecting LLM-Generated Code via GraphCodeBERT and Supervised Contrastive Learning
Pith reviewed 2026-05-10 08:24 UTC · model grok-4.3
The pith
LLMSniffer detects LLM-generated code more accurately by fine-tuning GraphCodeBERT with a two-stage supervised contrastive learning pipeline.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
LLMSniffer improves detection of LLM-generated code by fine-tuning GraphCodeBERT with a two-stage supervised contrastive learning pipeline, applied after comment-removal preprocessing and followed by an MLP classifier. This yields 78 percent accuracy on GPTSniffer (F1 78 percent) and 94.65 percent on Whodunit (F1 94.64 percent), exceeding prior baselines. t-SNE visualizations show that the contrastive stage creates compact, well-separated clusters for human-written and LLM-generated code.
What carries the argument
The two-stage supervised contrastive learning pipeline on GraphCodeBERT, which pulls embeddings of code snippets from the same class closer together while pushing embeddings from different classes farther apart.
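This pull/push objective is, in spirit, the supervised contrastive (SupCon) loss of Khosla et al. The paper does not publish its exact formulation or hyperparameters, so the following NumPy sketch uses the standard SupCon form with an assumed temperature:

```python
import numpy as np

def supcon_loss(embeddings, labels, temperature=0.1):
    """Supervised contrastive (SupCon) loss, NumPy sketch.

    Same-class samples are treated as positives (pulled together);
    all other samples act as negatives (pushed apart). The temperature
    value here is an assumption -- the paper does not report the
    hyperparameters it used.
    """
    # L2-normalize so the dot product is cosine similarity
    z = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    n = len(labels)
    logits[np.eye(n, dtype=bool)] = -np.inf        # exclude self-pairs
    # row-wise log-softmax, shifted by the max for numerical stability
    logits -= logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    positives = (labels[:, None] == labels[None, :]) & ~np.eye(n, dtype=bool)
    pos_counts = positives.sum(axis=1)
    # average log-probability over each anchor's positives, negated
    per_anchor = -np.where(positives, log_prob, 0.0).sum(axis=1) \
        / np.maximum(pos_counts, 1)
    return float(per_anchor[pos_counts > 0].mean())
```

Minimizing this loss drives embeddings of the same class toward each other relative to the rest of the batch, which is exactly the clustering effect the t-SNE plots are cited as showing.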
If this is right
- Accuracy on the GPTSniffer dataset rises by 8 percentage points over earlier detectors.
- Accuracy on the Whodunit dataset rises by 3.65 percentage points over earlier detectors.
- t-SNE plots confirm tighter intra-class clusters and clearer separation between human and LLM code embeddings.
- Public release of the trained model and demo allows direct inspection of the learned representations.
Where Pith is reading between the lines
- Detection performance is likely to decline on outputs from newer LLMs unless the model is periodically retrained on fresh examples.
- Comment removal succeeds because comments often carry distinctive human phrasing or LLM-specific artifacts that the contrastive stage can exploit.
- The same contrastive pipeline could be applied to detect AI assistance in non-code artifacts such as documentation or test cases.
- Embedding-based detection might be inserted into continuous-integration pipelines to flag potential AI contributions during code review.
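The comment-removal step these points lean on is easy to picture. The paper does not publish its implementation; a naive regex pass for C-style languages (which deliberately ignores the corner case of comment markers inside string literals) might look like:

```python
import re

def strip_comments(code: str) -> str:
    """Naively remove // line comments and /* */ block comments.

    A minimal sketch, not the authors' preprocessing: a production
    version would tokenize first so that comment markers inside
    string literals (e.g. "http://...") are left untouched.
    """
    code = re.sub(r"/\*.*?\*/", "", code, flags=re.DOTALL)  # block comments
    code = re.sub(r"//[^\n]*", "", code)                    # line comments
    return code
```

Stripping comments before encoding forces the detector to rely on structural properties of the code itself rather than on natural-language phrasing, which is why robustness to comment removal is a meaningful signal.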
Load-bearing premise
That the GPTSniffer and Whodunit datasets sufficiently represent real-world distributions of LLM-generated code and that the reported gains will hold for new LLMs or coding styles without retraining.
What would settle it
Running LLMSniffer on code samples produced by an LLM released after the training data or written in a programming style absent from the benchmarks and finding that accuracy drops to or below the prior baseline levels.
Original abstract
The rapid proliferation of Large Language Models (LLMs) in software development has made distinguishing AI-generated code from human-written code a critical challenge with implications for academic integrity, code quality assurance, and software security. We present LLMSniffer, a detection framework that fine-tunes GraphCodeBERT using a two-stage supervised contrastive learning pipeline augmented with comment removal preprocessing and an MLP classifier. Evaluated on two benchmark datasets, GPTSniffer and Whodunit, LLMSniffer achieves substantial improvements over prior baselines: accuracy increases from 70% to 78% on GPTSniffer (F1: 68% to 78%) and from 91% to 94.65% on Whodunit (F1: 91% to 94.64%). t-SNE visualizations confirm that contrastive fine-tuning yields well-separated, compact embeddings. We release our model checkpoints, datasets, code, and a live interactive demo to facilitate further research.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents LLMSniffer, a detection framework that fine-tunes GraphCodeBERT via a two-stage supervised contrastive learning pipeline, augmented by comment removal preprocessing and an MLP classifier. It reports concrete accuracy and F1 improvements over baselines on the GPTSniffer (70% to 78% accuracy, 68% to 78% F1) and Whodunit (91% to 94.65% accuracy, 91% to 94.64% F1) benchmarks, supported by t-SNE evidence of improved embedding separation, and releases model checkpoints, datasets, code, and a live demo.
Significance. If the reported gains prove robust, the work advances practical methods for distinguishing LLM-generated code, relevant to academic integrity, code quality, and security in software engineering. The explicit release of model checkpoints, datasets, code, and an interactive demo is a clear strength that supports reproducibility and follow-on research.
major comments (2)
- [Evaluation] Evaluation section: The accuracy and F1 lifts are reported as single point estimates (e.g., 70% to 78% on GPTSniffer) with no error bars, standard deviations across runs, ablation studies isolating the contrastive loss or comment removal, or statistical significance tests; this directly undermines confidence that the gains are reliable rather than dataset-specific.
- [Methods] Methods section: The two-stage supervised contrastive learning pipeline is described at a high level but lacks concrete details on the contrastive loss formulation, temperature and margin hyperparameters, the exact integration of the MLP classifier, and training schedules; without these, the central claim of improvement via this specific architecture cannot be fully assessed or replicated.
minor comments (1)
- [Results] The t-SNE visualizations are presented as qualitative evidence of separation but would benefit from quantitative metrics (e.g., silhouette score or inter-cluster distance) to strengthen the supporting claim.
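The quantitative check this comment asks for is cheap to run on the same embeddings fed to t-SNE. A minimal NumPy silhouette-score implementation (illustrative, equivalent in intent to `sklearn.metrics.silhouette_score`) is:

```python
import numpy as np

def silhouette_score(X, labels):
    """Mean silhouette coefficient, computed directly with NumPy.

    Quantifies what the t-SNE plots show qualitatively: values near
    +1 mean tight, well-separated clusters; values near 0 or below
    mean overlapping or interleaved ones.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    # full pairwise Euclidean distance matrix
    dists = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    scores = []
    for i in range(len(X)):
        same = labels == labels[i]
        same[i] = False
        if not same.any():              # singleton cluster: define s = 0
            scores.append(0.0)
            continue
        a = dists[i, same].mean()       # mean intra-cluster distance
        b = min(dists[i, labels == c].mean()
                for c in np.unique(labels) if c != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))
```

Reporting this score before and after contrastive fine-tuning would turn the paper's qualitative separation claim into a measurable one.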
Axiom & Free-Parameter Ledger
free parameters (2)
- contrastive loss temperature and margin
- MLP hidden sizes and learning rate schedule
axioms (2)
- domain assumption: Pre-trained GraphCodeBERT embeddings contain transferable features for distinguishing human- from LLM-written code after contrastive adaptation.
- domain assumption: Comment removal preprocessing does not discard task-relevant signal.
Reference graph
Works this paper leans on
- [1] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning (ICML), pages 1597--1607, 2020. URL https://arxiv.org/abs/2002.05709
- [2] Z. Ding et al. Towards understanding the capability of large language models on code clone detection: A survey. In Proceedings of the International Conference on Software Engineering (ICSE), 2022.
- [3] Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, and M. Zhou. CodeBERT: A pre-trained model for programming and natural languages. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 1536--1547, 2020. doi:10.18653/v1/2020.findings-emnlp.139
- [4] S. Gehrmann, H. Strobelt, and A. M. Rush. GLTR: Statistical detection and visualization of generated text. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics: System Demonstrations, pages 111--116, 2019. doi:10.18653/v1/P19-3019
- [5] GitHub. GitHub Copilot: Your AI pair programmer, 2021. URL https://github.com/features/copilot
- [6] B. Gunel, J. Du, A. Conneau, and V. Stoyanov. Supervised contrastive learning for pre-trained language model fine-tuning. In International Conference on Learning Representations (ICLR), 2021. URL https://arxiv.org/abs/2011.01403
- [7] D. Guo, S. Ren, S. Lu, Z. Feng, D. Tang, S. Liu, L. Zhou, N. Duan, A. Svyatkovskiy, S. Fu, M. Tufano, S. K. Deng, C. Clement, D. Drain, N. Sundaresan, J. Yin, D. Jiang, and M. Zhou. GraphCodeBERT: Pre-training code representations with data flow. In International Conference on Learning Representations (ICLR), 2021. URL https://openreview.net/forum?id=jLoC4ez43PZ
- [8] C.-Y. Hung et al. Exploring the feasibility of automated detection of LLM-generated code through explainable AI. arXiv preprint, 2024.
- [9] H. Husain, H.-H. Wu, T. Gazit, M. Allamanis, and M. Brockschmidt. CodeSearchNet challenge: Evaluating the state of semantic code search. arXiv preprint arXiv:1909.09436, 2019. URL https://arxiv.org/abs/1909.09436
- [10] J. Idialu et al. Our students are using ChatGPT: Detecting AI-generated code in programming assignments. arXiv preprint, 2024.
- [11] P. Khosla, P. Teterwak, C. Wang, A. Sarna, Y. Tian, P. Isola, A. Maschinot, C. Liu, and D. Krishnan. Supervised contrastive learning. In Advances in Neural Information Processing Systems (NeurIPS), volume 33, pages 18661--18673, 2020. URL https://arxiv.org/abs/2004.11362
- [12] J. Kirchenbauer, J. Geiping, Y. Wen, J. Katz, I. Miers, and T. Goldstein. A watermark for large language models. In International Conference on Machine Learning (ICML), 2023. URL https://arxiv.org/abs/2301.10226
- [13] I. Loshchilov and F. Hutter. Decoupled weight decay regularization. In International Conference on Learning Representations (ICLR), 2019. URL https://arxiv.org/abs/1711.05101
- [14] E. Mitchell, Y. Lee, A. Khazatsky, C. D. Manning, and C. Finn. DetectGPT: Zero-shot machine-generated text detection using probability curvature. arXiv preprint arXiv:2301.11305, 2023. URL https://arxiv.org/abs/2301.11305
- [15]
- [16] P. T. Nguyen et al. GPTSniffer: Detecting ChatGPT-generated code via fine-tuned language models. Journal of Systems and Software, 2024b. doi:10.1016/j.jss.2024.112031. URL https://www.sciencedirect.com/science/article/pii/S0164121224001043
- [17] OpenAI. GPT-4 technical report. arXiv preprint arXiv:2303.08774, 2023. URL https://arxiv.org/abs/2303.08774
- [18]
- [19] A. Radford, J. Wu, R. Child, D. Luan, D. Amodei, and I. Sutskever. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019. URL https://openai.com/blog/better-language-models
- [20] B. Rozière, J. Gehring, F. Gloeckle, S. Sootla, I. Gat, X. E. Tan, Y. Adi, J. Liu, T. Remez, J. Rapin, et al. Code Llama: Open foundation models for code. arXiv preprint arXiv:2308.12950, 2023. URL https://arxiv.org/abs/2308.12950
- [21] H. Shi et al. Zero-shot detection of LLM-generated code. arXiv preprint, 2024.
- [22] I. Solaiman, M. Brundage, J. Clark, A. Askell, A. Herbert-Voss, J. Wu, A. Radford, and J. Wang. Release strategies and the social impacts of language models. arXiv preprint arXiv:1908.09203, 2019. URL https://arxiv.org/abs/1908.09203
- [23] L. van der Maaten and G. Hinton. Visualizing data using t-SNE. Journal of Machine Learning Research, 9:2579--2605, 2008. URL https://www.jmlr.org/papers/v9/vandermaaten08a.html
- [24] Y. Wang, W. Wang, S. Joty, and S. C. Hoi. CodeT5: Identifier-aware unified pre-trained encoder-decoder models for code understanding and generation. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 8696--8708, 2021. doi:10.18653/v1/2021.emnlp-main.685
- [25] J. Ye et al. Detecting LLM-generated code: A zero-shot approach. arXiv preprint, 2024.