Identifying and Characterizing Semantic Clones of Solidity Functions
Pith reviewed 2026-05-07 11:12 UTC · model grok-4.3
The pith
A code-and-comment analysis method detects semantically equivalent Solidity functions with 97% recall and 59% overall precision (84% for homonymous functions).
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors demonstrate that semantic equivalence between Solidity functions can be identified by analyzing their code implementations together with their comments, achieving high recall in a large-scale validation. This method also characterizes the differences between equivalent functions as design choices around security, modularity, and optimization. When comments are absent, LLM-generated summaries combined with sentence transformers enable detection at 75% precision.
What carries the argument
The hybrid analysis of code structure and comment semantics to determine functional equivalence between functions.
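As a rough illustration of what such a hybrid analysis involves, the sketch below scores a function pair on both code-token overlap and comment similarity. The bag-of-words cosine stands in for the paper's sentence-transformer embeddings, and the thresholds are hypothetical choices, not values from the paper.

```python
from collections import Counter
from math import sqrt

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def normalize_code(src):
    """Crude token normalization: strip punctuation, lowercase."""
    for ch in "(){};,.":
        src = src.replace(ch, " ")
    return Counter(src.lower().split())

def is_semantic_clone(code1, comment1, code2, comment2,
                      code_thr=0.5, comment_thr=0.7):
    """Flag a pair as a candidate semantic clone when the comments agree
    strongly even if the code tokens differ (thresholds are hypothetical)."""
    code_sim = cosine(normalize_code(code1), normalize_code(code2))
    comment_sim = cosine(Counter(comment1.lower().split()),
                         Counter(comment2.lower().split()))
    return comment_sim >= comment_thr or (code_sim >= code_thr and comment_sim > 0)
```

The key design point this mirrors is that comment semantics, not syntactic identity, drive the clone decision, which is why structurally different implementations can still be matched.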
If this is right
- The method provides a benchmark for future clone detection research in smart contract languages.
- Developers can use it to discover alternative implementations that improve security or reduce costs.
- It helps in tracing how vulnerabilities spread through code reuse in Ethereum.
- Automated tools can suggest better design alternatives during contract development.
- The technique extends to poorly documented code via LLM assistance.
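The last point can be made concrete with a small sketch of the fallback logic: when a function's comment is missing or trivially short, an LLM-generated summary takes its place as the text to embed. The `summarizer` callable and the three-word cutoff are hypothetical stand-ins, not details from the paper.

```python
def comment_or_summary(fn_source, fn_comment, summarizer):
    """Return the function's own comment when usable, otherwise fall back
    to a caller-supplied summarizer (e.g. an LLM documentation engine)."""
    text = (fn_comment or "").strip()
    if len(text.split()) >= 3:  # hypothetical "usable comment" cutoff
        return text
    return summarizer(fn_source)
```

In a full pipeline, the returned text would be embedded with a sentence transformer and compared against other functions' descriptions exactly as for commented code.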
Where Pith is reading between the lines
- Extending this to dynamic analysis or execution traces could improve precision by verifying actual behavior.
- The identified design alternatives might be used to build a catalog of best practices for Solidity.
- Integration into development environments could reduce accidental vulnerability introduction through cloning.
- Similar methods could apply to other blockchain languages like Vyper if comment practices are comparable.
Load-bearing premise
Human judges can accurately and consistently assess whether two Solidity functions are semantically equivalent.
What would settle it
A follow-up study on a new random sample from the dataset finds that more than 40% of the pairs flagged as clones do not actually perform the same operations when executed with matching inputs.
Original abstract
Smart Contracts are essential blockchain components, mainly written in Solidity. The high availability of public Solidity code leads to frequent reuse and high clone ratios. Since cloning can propagate vulnerabilities and flaws, effective detection is crucial. Although existing techniques work well in detecting syntactic clones, the identification of semantic clones is an open problem. To address this challenge, in this paper, we present and empirically assess a scalable methodology, based on analyzing code and comments, to spot semantically equivalent Solidity functions. We first collected an up-to-date dataset of about 300,000 Ethereum smart contracts, 82.07% of which are compliant with modern Solidity version 0.8. Manual validation of a statistically significant sample comprising 1,155 function pairs confirms the effectiveness of our solution, achieving an overall precision of 59% (rising to 84% for homonymous functions) and a recall of 97%. Besides, we explore the structural differences occurring on semantically equivalent Solidity functions, demonstrating that they often represent design alternatives focused on security choices, modularization, and gas optimization. Finally, we investigate the use of Large Language Models (LLMs) as documentation engines in scenarios where code comments are poor or absent. Our results show that LLM-generated summaries, combined with sentence transformers like BERT, can bridge the documentation gap, enabling the identification of semantic clones in uncommented code with 75% precision. This work establishes a modern benchmark for Solidity clone detection and provides a foundation for the automated discovery of secure and efficient code alternatives.
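For readers less familiar with the metrics quoted above, precision and recall are computed from manually labeled pairs as follows; the sample data here is invented for illustration, not drawn from the paper's validation set.

```python
def precision_recall(labels):
    """labels: list of (flagged_by_tool, is_true_clone) boolean pairs.
    Precision = fraction of flagged pairs that are true clones;
    recall = fraction of true clones that were flagged."""
    tp = sum(1 for flagged, true in labels if flagged and true)
    fp = sum(1 for flagged, true in labels if flagged and not true)
    fn = sum(1 for flagged, true in labels if not flagged and true)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

Note that a tool can combine high recall with modest precision, as reported here (97% / 59%): it misses few true clones but flags many spurious pairs.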
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents a methodology for detecting semantic clones in Solidity smart contract functions by analyzing both source code and comments. It describes collection of a ~300k-contract Ethereum dataset (82% v0.8 compliant), manual validation of 1,155 function pairs yielding 59% overall precision (84% for homonyms) and 97% recall, characterization of structural differences in equivalent functions (security, modularization, gas), and an LLM-based approach for generating summaries to detect clones in uncommented code at 75% precision.
Significance. If the validation methodology is sound and reproducible, the work supplies a timely empirical benchmark for semantic clone detection in Solidity, an area where syntactic techniques dominate but semantic equivalence remains open. The structural characterization and LLM documentation results offer concrete starting points for tools that could reduce vulnerability propagation in blockchain code.
Major comments (2)
- [Abstract and Evaluation] The headline effectiveness claim rests on manual validation of 1,155 pairs reporting 59% precision / 97% recall, yet no inter-rater agreement statistic, judge instructions, operational definition of semantic equivalence (distinct from name or syntactic similarity), exclusion criteria, or sampling frame (random vs. stratified vs. homonym-only) is supplied. This directly affects whether the numbers can be read as evidence that the code+comment method finds semantic clones.
- [Abstract and LLM experiment] The 75% precision result for LLM-generated summaries is presented without describing the judgment protocol, the size or selection of the uncommented-code test set, or any comparison baseline, making it impossible to assess whether the approach genuinely bridges the documentation gap.
Minor comments (1)
- [Abstract] The phrase 'statistically significant sample' appears without reporting the target confidence level, effect size, or power calculation used to determine the 1,155-pair size.
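For context on what such a calculation might look like, Cochran's formula for estimating a proportion (with an optional finite-population correction) is sketched below. The 95% confidence / ±3% margin parameters are conventional defaults, not values reported by the paper, and they yield a sample size near, but not equal to, the paper's 1,155.

```python
from math import ceil

def sample_size(z=1.96, p=0.5, e=0.03, population=None):
    """Cochran's sample-size formula for estimating a proportion.
    z: normal quantile for the confidence level (1.96 for 95%);
    p: assumed proportion (0.5 is the conservative worst case);
    e: margin of error; population: optional finite-population size."""
    n0 = z * z * p * (1 - p) / (e * e)
    if population is not None:
        n0 = n0 / (1 + (n0 - 1) / population)  # finite-population correction
    return ceil(n0)
```

With these defaults the required sample is 1,068; correcting for a finite population of clone pairs barely changes the figure, which is why the paper's 1,155 is plausible but needs its actual confidence level and margin stated.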
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback, which highlights important aspects of reproducibility in our evaluation. We address each major comment below with clarifications and specific plans for revision to strengthen the manuscript.
Point-by-point responses
- Referee: [Abstract and Evaluation] The headline effectiveness claim rests on manual validation of 1,155 pairs reporting 59% precision / 97% recall, yet no inter-rater agreement statistic, judge instructions, operational definition of semantic equivalence (distinct from name or syntactic similarity), exclusion criteria, or sampling frame (random vs. stratified vs. homonym-only) is supplied. This directly affects whether the numbers can be read as evidence that the code+comment method finds semantic clones.
  Authors: We agree that the current description of the manual validation lacks sufficient methodological detail for full reproducibility and assessment of the reported precision and recall. In the revised manuscript, we will expand the relevant section to explicitly provide: the operational definition of semantic equivalence (functions that produce equivalent outputs and side effects for identical inputs, independent of syntactic structure or function names); the sampling frame (a stratified random sample drawn from all detected clone pairs, with stratification by name similarity to balance homonyms and non-homonyms); exclusion criteria (pairs lacking sufficient context or representing trivial syntactic matches); the judge instructions given to the two independent annotators; and the inter-rater agreement (Cohen's kappa). These additions will directly address the concern and allow readers to evaluate the validity of the 59%/97% figures. Revision: yes
- Referee: [Abstract and LLM experiment] The 75% precision result for LLM-generated summaries is presented without describing the judgment protocol, the size or selection of the uncommented-code test set, or any comparison baseline, making it impossible to assess whether the approach genuinely bridges the documentation gap.
  Authors: We concur that the LLM experiment requires expanded methodological transparency to substantiate the 75% precision result. In the revised version, we will add: a description of the judgment protocol (using the same operational definition of semantic equivalence as the primary validation, performed by the same annotators); the size and selection details of the test set (a random sample of 300 pairs involving uncommented functions paired with LLM-generated summaries); and a comparison baseline (direct embedding similarity without summaries, which yielded lower precision). These revisions will clarify the contribution of the LLM summaries in addressing the documentation gap. Revision: yes
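Since the rebuttal commits to reporting Cohen's kappa for the two annotators, here is a minimal reference implementation of that statistic (the standard formula, not code from the paper): observed agreement minus chance agreement, normalized by the maximum possible agreement beyond chance.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two annotators' labels over the same items."""
    assert len(rater_a) == len(rater_b) and rater_a
    n = len(rater_a)
    # Observed agreement: fraction of items on which the raters agree.
    observed = sum(1 for x, y in zip(rater_a, rater_b) if x == y) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    ca, cb = Counter(rater_a), Counter(rater_b)
    expected = sum(ca[k] * cb[k] for k in set(ca) | set(cb)) / (n * n)
    return (observed - expected) / (1 - expected)
```

Values above roughly 0.6 are conventionally read as substantial agreement; reporting the figure would let readers judge whether the semantic-equivalence labels behind the 59%/97% numbers are reliable.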
Circularity Check
No significant circularity in the derivation chain
Full rationale
The paper describes an empirical methodology for identifying semantic clones in Solidity smart contracts via code and comment analysis, followed by manual validation on a sampled set of 1,155 function pairs drawn from an external corpus of ~300k contracts. No equations, fitted parameters, or self-referential definitions appear in the provided text; the reported precision (59%, 84% for homonyms) and recall (97%) are computed directly from human judgments on the sampled pairs rather than being forced by any internal construction or self-citation chain. The central claims rest on independent data collection and external validation steps that do not reduce to the methodology's own inputs by definition.
Axiom & Free-Parameter Ledger
Axioms (2)
- Domain assumption: Human annotators can consistently determine whether two Solidity functions are semantically equivalent.
- Domain assumption: The collected set of ~300k Ethereum contracts is representative of publicly deployed Solidity code.
Reference graph
Works this paper leans on
- [1] W. Zou, D. Lo, P. S. Kochhar, X.-B. D. Le, X. Xia, Y. Feng, Z. Chen, B. Xu, Smart contract development: Challenges and opportunities, IEEE Transactions on Software Engineering 47 (10) (2019) 2084–2106.
- [2] Z. Wan, X. Xia, A. E. Hassan, What do programmers discuss about blockchain? A case study on the use of balanced LDA and the reference architecture of a domain to capture online discussions about blockchain platforms across Stack Exchange communities, IEEE Transactions on Software Engineering 47 (7) (2019) 1331–1349.
- [3] M. Alharby, A. Van Moorsel, Blockchain-based smart contracts: A systematic mapping study, arXiv preprint arXiv:1710.06372 (2017).
- [4] K. Sun, Z. Xu, C. Liu, K. Li, Y. Liu, Demystifying the composition and code reuse in Solidity smart contracts, in: Proceedings of the 31st ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2023, pp. 796–807.
- [5] S. Hwang, S. Ryu, Gap between theory and practice: An empirical study of security patches in Solidity, in: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020, pp. 542–553.
- [6] M. Wöhrer, U. Zdun, Domain specific language for smart contract development, in: 2020 IEEE International Conference on Blockchain and Cryptocurrency (ICBC), IEEE, 2020, pp. 1–9.
- [7] S. Dhaiouir, S. Assar, A systematic literature review of blockchain-enabled smart contracts: Platforms, languages, consensus, applications and choice criteria, in: Research Challenges in Information Science: 14th International Conference, RCIS 2020, Limassol, Cyprus, September 23–25, 2020, Proceedings 14, Springer, 2020, pp. 249–266.
- [8] Z. Wan, X. Xia, D. Lo, J. Chen, X. Luo, X. Yang, Smart contract security: A practitioners' perspective, in: 2021 IEEE/ACM 43rd International Conference on Software Engineering (ICSE), IEEE, 2021, pp. 1410–1422.
- [9] X. Chen, P. Liao, Y. Zhang, Y. Huang, Z. Zheng, Understanding code reuse in smart contracts, in: 2021 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER), IEEE, 2021, pp. 470–479.
- [10] M. Kondo, G. A. Oliva, Z. M. Jiang, A. E. Hassan, O. Mizuno, Code cloning in smart contracts: A case study on verified contracts from the Ethereum blockchain platform, Empirical Software Engineering 25 (2020) 4617–4675.
- [11] C. K. Roy, J. R. Cordy, R. Koschke, Comparison and evaluation of code clone detection techniques and tools: A qualitative approach, Science of Computer Programming 74 (7) (2009) 470–495.
- [12] F. Khan, I. David, D. Varro, S. McIntosh, Code cloning in smart contracts on the Ethereum platform: An extended replication study, IEEE Transactions on Software Engineering 49 (4) (2022) 2006–2019.
- [13] Z. Gao, V. Jayasundara, L. Jiang, X. Xia, D. Lo, J. Grundy, SmartEmbed: A tool for clone and bug detection in smart contracts through structural code embedding, in: 2019 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, 2019, pp. 394–397.
- [14] N. He, L. Wu, H. Wang, Y. Guo, X. Jiang, Characterizing code clones in the Ethereum smart contract ecosystem, in: Financial Cryptography and Data Security: 24th International Conference, FC 2020, Kota Kinabalu, Malaysia, February 10–14, 2020, Revised Selected Papers 24, Springer, 2020, pp. 654–675.
- [15] M. Zakeri-Nasrabadi, S. Parsa, M. Ramezani, C. Roy, M. Ekhtiarzadeh, A systematic literature review on source code similarity measurement and clone detection: Techniques, applications, and challenges, Journal of Systems and Software 204 (2023) 111796.
- [16] H. Liu, Z. Yang, C. Liu, Y. Jiang, W. Zhao, J. Sun, EClone: Detect semantic clones in Ethereum via symbolic transaction sketch, in: Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2018, pp. 900–903.
- [17] Z. Wang, Z. Wan, Y. Chen, Y. Zhang, D. Lo, D. Xie, X. Yang, Clone detection for smart contracts: How far are we?, Proceedings of the ACM on Software Engineering 2 (FSE) (2025) 1249–1269.
- [18] P. Rani, S. Panichella, M. Leuenberger, A. Di Sorbo, O. Nierstrasz, How to identify class comment types? A multi-language approach for class comment classification, Journal of Systems and Software 181 (2021) 111047.
- [19]
- [20] T. Durieux, J. F. Ferreira, R. Abreu, P. Cruz, Empirical review of automated analysis tools on 47,587 Ethereum smart contracts, in: Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering, 2020, pp. 530–541.
- [21] J. F. Ferreira, P. Cruz, T. Durieux, R. Abreu, SmartBugs: A framework to analyze Solidity smart contracts, in: Proceedings of the 35th IEEE/ACM International Conference on Automated Software Engineering, 2020, pp. 1349–1352.
- [22] G. Ibba, S. Aufiero, R. Neykova, S. Bartolucci, M. Ortu, R. Tonelli, G. Destefanis, A curated Solidity smart contracts repository of metrics and vulnerability, in: Proceedings of the 20th International Conference on Predictive Models and Data Analytics in Software Engineering, 2024, pp. 32–41.
- [23] Z. Zheng, J. Su, J. Chen, D. Lo, Z. Zhong, M. Ye, DAppSCAN: Building large-scale datasets for smart contract weaknesses in DApp projects, IEEE Transactions on Software Engineering (2024).
- [24] G. Iuliano, D. Corradini, M. Pasqua, M. Ceccato, D. Di Nucci, How do Solidity versions affect vulnerability detection tools? An empirical study, arXiv preprint arXiv:2504.05515 (2025).
- [25] C. Shi, B. Cai, Y. Zhao, L. Gao, K. Sood, Y. Xiang, COSS: Leveraging statement semantics for code summarization, IEEE Transactions on Software Engineering 49 (6) (2023) 3472–3486.
- [26]
- [27] Z. Zhang, S. Chen, G. Fan, G. Yang, Z. Feng, CCGRA: Smart contract code comment generation with retrieval-enhanced approach, in: SEKE, 2023, pp. 212–217.
- [28] R. Mo, H. Song, W. Ding, C. Wu, Code cloning in Solidity smart contracts: Prevalence, evolution, and impact on development, in: 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE), IEEE Computer Society, 2025, pp. 660–660.
- [29] S. N. Pinku, D. Mondal, C. K. Roy, On the use of deep learning models for semantic clone detection, in: 2024 IEEE International Conference on Software Maintenance and Evolution (ICSME), IEEE, 2024, pp. 512–524.
- [30] S. Arshad, S. Abid, S. Shamail, CodeBERT for code clone detection: A replication study, in: 2022 IEEE 16th International Workshop on Software Clones (IWSC), IEEE, 2022, pp. 39–45.
- [31] L. Luu, D.-H. Chu, H. Olickel, P. Saxena, A. Hobor, Making smart contracts smarter, in: Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, 2016, pp. 254–269.
- [32] S. et al., Identifying and characterizing semantic clones of Solidity functions dataset (2026). doi:10.6084/m9.figshare.29195039. URL https://figshare.com/s/c5ed9d941623cb7d239d
- [33] E. Medvedev, Ethereum in BigQuery: A public dataset for smart contract analytics (2018), accessed 2025-05-07. URL https://cloud.google.com/blog/products/data-analytics/ethereum-bigquery-public-dataset-smart-contract-analytics
- [34] N. Grech, M. Kong, A. Jurisevic, L. Brent, B. Scholz, Y. Smaragdakis, MadMax: Surviving out-of-gas conditions in Ethereum smart contracts, Proceedings of the ACM on Programming Languages 2 (OOPSLA) (2018) 1–27.
- [35] Z. Gao, L. Jiang, X. Xia, D. Lo, J. Grundy, Checking smart contracts with structural code embedding, IEEE Transactions on Software Engineering 47 (12) (2020) 2874–2891.
- [36] P. Bojanowski, E. Grave, A. Joulin, T. Mikolov, Enriching word vectors with subword information, Transactions of the Association for Computational Linguistics 5 (2017) 135–146.
- [37] N. Reimers, I. Gurevych, Making monolingual sentence embeddings multilingual using knowledge distillation, arXiv preprint arXiv:2004.09813 (2020).
- [38] T. Mikolov, K. Chen, G. Corrado, J. Dean, Efficient estimation of word representations in vector space, arXiv preprint arXiv:1301.3781 (2013).
- [39] T. Zhang, V. Kishore, F. Wu, K. Q. Weinberger, Y. Artzi, BERTScore: Evaluating text generation with BERT, arXiv preprint arXiv:1904.09675 (2019).
- [40] S. E. Maxwell, K. Kelley, J. R. Rausch, Sample size planning for statistical power and accuracy in parameter estimation, Annual Review of Psychology 59 (1) (2008) 537–563.
- [41] J. Cohen, A coefficient of agreement for nominal scales, Educational and Psychological Measurement 20 (1) (1960) 37–46.
- [42]
- [43] J. Devlin, M. Chang, K. Lee, K. Toutanova, BERT: Pre-training of deep bidirectional transformers for language understanding, CoRR abs/1810.04805 (2018). arXiv:1810.04805. URL http://arxiv.org/abs/1810.04805
- [44] Z. Feng, D. Guo, D. Tang, N. Duan, X. Feng, M. Gong, L. Shou, B. Qin, T. Liu, D. Jiang, M. Zhou, CodeBERT: A pre-trained model for programming and natural languages (2020). arXiv:2002.08155.
- [45] A. Ghaleb, J. Rubin, K. Pattabiraman, AChecker: Statically detecting smart contract access control vulnerabilities, in: 2023 IEEE/ACM 45th International Conference on Software Engineering (ICSE), IEEE, 2023, pp. 945–956.
- [46] P. Rani, A. Blasi, N. Stulova, S. Panichella, A. Gorla, O. Nierstrasz, A decade of code comment quality assessment: A systematic literature review, Journal of Systems and Software 195 (2023) 111515.
- [47] S. Haque, Z. Eberhart, A. Bansal, C. McMillan, Semantic similarity metrics for evaluating source code summarization, in: Proceedings of the 30th IEEE/ACM International Conference on Program Comprehension, 2022, pp. 36–47.
- [48] Y. Hong, C. Tantithamthavorn, P. Thongtanunam, A. Aleti, CommentFinder: A simpler, faster, more accurate code review comments recommendation, in: Proceedings of the 30th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, 2022, pp. 507–519.
- [49] X. Hu, Z. Gao, X. Xia, D. Lo, X. Yang, Automating user notice generation for smart contract functions, in: 2021 36th IEEE/ACM International Conference on Automated Software Engineering (ASE), IEEE, 2021, pp. 5–17.