ContractShield: Bridging Semantic-Structural Gaps via Hierarchical Cross-Modal Fusion for Multi-Label Vulnerability Detection in Obfuscated Smart Contracts
Pith reviewed 2026-05-13 20:05 UTC · model grok-4.3
The pith
ContractShield detects smart contract vulnerabilities under obfuscation by fusing semantic, temporal and structural features with hierarchical cross-modal attention.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ContractShield establishes that hierarchical cross-modal fusion—starting with self-attention per modality, followed by cross-modal attention to connect complementary signals, and ending with adaptive weighting based on feature reliability—bridges semantic-structural gaps and delivers robust multi-label vulnerability detection in obfuscated smart contracts, with only minor performance degradation compared to clean data.
What carries the argument
The three-level fusion mechanism of self-attention, cross-modal attention, and adaptive weighting that integrates outputs from CodeBERT, xLSTM, and GATv2 to correlate features and calibrate contributions under obfuscation.
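The three levels can be pictured with a toy sketch in plain Python. This is an illustration under stated assumptions, not the authors' implementation: attention uses no learned projections, every modality is assumed to share one feature dimension, and the names `attend`, `hierarchical_fusion` and `gate_logits` are hypothetical. In particular, the reliability logits are passed in by hand here, whereas the paper says they are calibrated adaptively.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def attend(queries, keys, values):
    """Scaled dot-product attention with identity projections (toy assumption)."""
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    out = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in keys]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)])
    return out


def mean_pool(rows):
    """Average a list of equal-length vectors into one vector."""
    return [sum(r[i] for r in rows) / len(rows) for i in range(len(rows[0]))]


def hierarchical_fusion(modalities, gate_logits):
    """modalities: name -> list of token vectors; gate_logits: name -> float."""
    # Level 1: self-attention inside each modality.
    refined = {name: attend(x, x, x) for name, x in modalities.items()}

    # Level 2: each modality queries the tokens of all other modalities,
    # then the context is added back as a residual and pooled to one vector.
    crossed = {}
    for name, x in refined.items():
        others = [row for other, rows in refined.items() if other != name for row in rows]
        context = attend(x, others, others)
        summed = [[a + b for a, b in zip(r, c)] for r, c in zip(x, context)]
        crossed[name] = mean_pool(summed)

    # Level 3: softmax-normalised weights decide each modality's contribution.
    names = list(crossed)
    weights = softmax([gate_logits[n] for n in names])
    dim = len(crossed[names[0]])
    fused = [sum(w * crossed[n][i] for w, n in zip(weights, names)) for i in range(dim)]
    return fused, dict(zip(names, weights))
```

Setting one modality's logit very low drives its weight towards zero, which is the behaviour the adaptive step would need to produce on its own for an obfuscated input.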
If this is right
- The approach achieves an 89% Hamming score on obfuscated data, dropping only 1-3% from non-obfuscated performance.
- It detects five major vulnerability types simultaneously at 91% F1-score.
- Performance exceeds state-of-the-art methods by 6-15% in adversarial obfuscated conditions.
- Structural invariants captured by graph attention remain useful despite control flow manipulation.
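The text shown here does not define the Hamming score. A minimal sketch, assuming the common multi-label definition (per-contract intersection over union of true and predicted label sets, averaged over contracts), looks like this; the function name and the toy labels are hypothetical, and some authors instead use one minus the Hamming loss.

```python
def hamming_score(y_true, y_pred):
    """Mean over samples of |true AND pred| / |true OR pred| for binary label vectors."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    scores = []
    for truth, pred in zip(y_true, y_pred):
        true_set = {i for i, v in enumerate(truth) if v}
        pred_set = {i for i, v in enumerate(pred) if v}
        union = true_set | pred_set
        # A sample with no true and no predicted labels counts as a perfect match.
        scores.append(1.0 if not union else len(true_set & pred_set) / len(union))
    return sum(scores) / len(scores)
```

With five vulnerability labels per contract, a prediction that finds one of two true labels and adds no false ones scores 0.5 for that contract.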
Where Pith is reading between the lines
- Similar hierarchical fusion could improve detection in other domains with obfuscated or noisy code, such as malware analysis.
- Deploying this in smart contract auditing tools might lower the success rate of hidden exploits in decentralized applications.
- Testing the adaptive weighting on other multimodal datasets could reveal if it generalizes beyond smart contracts without labeled reliability data.
- Extending the model to include additional modalities like data flow graphs might further strengthen resilience to advanced obfuscation.
Load-bearing premise
The semantic, temporal, and structural modalities provide complementary information even after obfuscation, allowing the adaptive weighting to down-weight unreliable features correctly without ground-truth reliability information at inference time.
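One way such inference-time weighting could work without ground-truth reliability labels is input-dependent gating, where each modality's weight is computed from its own features. The formulation below is an illustrative assumption, not the paper's equations (the referee report notes that none are given); the symbols $\mathbf{h}_m$, $\mathbf{w}_m$, $b_m$, $\alpha_m$ and $\mathbf{z}$ are hypothetical.

```latex
\begin{align}
s_m &= \mathbf{w}_m^{\top}\mathbf{h}_m + b_m,
      \qquad m \in \{\text{semantic}, \text{temporal}, \text{structural}\} \\
\alpha_m &= \frac{\exp(s_m)}{\sum_{m'} \exp(s_{m'})} \\
\mathbf{z} &= \sum_{m} \alpha_m \, \mathbf{h}_m
\end{align}
```

Under this sketch the gate can only down-weight a modality correctly if obfuscation leaves a detectable trace in $\mathbf{h}_m$, which is exactly the premise in question.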
What would settle it
Evaluating ContractShield on a fresh collection of smart contracts obfuscated with techniques not seen during training, where the Hamming score drops below 80 percent or no longer outperforms baselines, would disprove the claimed resilience.
Original abstract
Smart contracts are increasingly targeted by adversaries employing obfuscation techniques such as bogus code injection and control flow manipulation to evade vulnerability detection. Existing multimodal methods often process semantic, temporal, and structural features in isolation and fuse them using simple strategies such as concatenation, which neglects cross-modal interactions and weakens robustness, as obfuscation of a single modality can sharply degrade detection accuracy. To address these challenges, we propose ContractShield, a robust multimodal framework with a novel fusion mechanism that effectively correlates multiple complementary features through a three-level fusion. Self-attention first identifies patterns that indicate vulnerability within each feature space. Cross-modal attention then establishes meaningful connections between complementary signals across modalities. Then, adaptive weighting dynamically calibrates feature contributions based on their reliability under obfuscation. For feature extraction, ContractShield integrates (1) CodeBERT with a sliding window mechanism to capture semantic dependencies in source code, (2) Extended long short-term memory (xLSTM) to model temporal dynamics in opcode sequences, and (3) GATv2 to identify structural invariants in control flow graphs (CFGs) that remain stable across obfuscation. Empirical evaluation demonstrates resilience of ContractShield, achieving a 89 percentage Hamming Score with only a 1-3 percentage drop compared to non-obfuscated data. The framework simultaneously detects five major vulnerability types with 91 percentage F1-score, outperforming state-of-the-art approaches by 6-15 percentage under adversarial conditions.
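The sliding window mentioned in the abstract is presumably needed because encoders such as CodeBERT accept a bounded number of tokens per input (typically 512), while contract source files can be much longer. A minimal sketch of overlapping windows follows; the function name, window size and stride are assumptions, since the abstract gives none of them.

```python
def sliding_windows(tokens, window=512, stride=256):
    """Split a token sequence into overlapping chunks of at most `window` tokens."""
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    if len(tokens) <= window:
        return [list(tokens)]
    chunks = []
    start = 0
    while True:
        chunks.append(list(tokens[start:start + window]))
        # Stop once the current chunk reaches the end of the sequence.
        if start + window >= len(tokens):
            break
        start += stride
    return chunks
```

Each chunk would then be encoded separately and the per-chunk embeddings aggregated, for example by mean pooling; a stride no larger than the window guarantees that every token lands in at least one chunk.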
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes ContractShield, a multimodal framework for multi-label vulnerability detection in obfuscated smart contracts. It extracts semantic features via CodeBERT with sliding windows, temporal dynamics via xLSTM on opcode sequences, and structural invariants via GATv2 on CFGs. These are fused hierarchically through per-modality self-attention, cross-modal attention, and an adaptive weighting step that purportedly calibrates contributions based on reliability under obfuscation. The central empirical claim is that the model achieves 89% Hamming score (1-3% drop from clean data) and 91% F1 across five vulnerability types while outperforming prior work by 6-15% under adversarial conditions.
Significance. If the reported robustness results hold under proper validation, the work would address a practically important gap in smart-contract security by demonstrating that hierarchical cross-modal fusion can maintain detection performance when individual modalities are degraded by common obfuscation techniques. The choice of complementary extractors (CodeBERT, xLSTM, GATv2) is well-motivated for the domain.
major comments (2)
- [Methodology] Methodology section (adaptive weighting paragraph): the claim that the weighting step 'dynamically calibrates feature contributions based on their reliability under obfuscation' is unsupported because no equation, auxiliary loss, or inference-time procedure is given for estimating per-modality reliability when all three inputs may be simultaneously altered by the same obfuscator. Without such a mechanism the 1-3% drop result cannot be explained.
- [Experimental evaluation] Experimental evaluation section: the abstract states concrete metrics (89% Hamming score, 91% F1, 6-15% improvement) yet supplies no dataset description, obfuscation generation procedure, baseline implementations, ablation tables, or statistical significance tests. These omissions make the central robustness claim impossible to evaluate against the paper's own evidence.
minor comments (1)
- [Abstract] Abstract: replace '89 percentage' and '91 percentage' with standard '89%' and '91%' notation.
Simulated Author's Rebuttal
We are grateful to the referee for their thorough review and constructive suggestions. We address each major comment in detail below, committing to revisions that will enhance the clarity and completeness of our work.
- Referee: [Methodology] Methodology section (adaptive weighting paragraph): the claim that the weighting step 'dynamically calibrates feature contributions based on their reliability under obfuscation' is unsupported because no equation, auxiliary loss, or inference-time procedure is given for estimating per-modality reliability when all three inputs may be simultaneously altered by the same obfuscator. Without such a mechanism the 1-3% drop result cannot be explained.
  Authors: We agree that the description of the adaptive weighting is insufficiently detailed. In the revised manuscript, we will provide the mathematical formulation of the adaptive weighting module, including the equations for computing modality-specific reliability scores (based on a learned reliability estimator trained with an auxiliary loss that penalizes over-reliance on degraded modalities) and the inference-time procedure for dynamic calibration. This will rigorously support the robustness claims.
  Revision: yes
- Referee: [Experimental evaluation] Experimental evaluation section: the abstract states concrete metrics (89% Hamming score, 91% F1, 6-15% improvement) yet supplies no dataset description, obfuscation generation procedure, baseline implementations, ablation tables, or statistical significance tests. These omissions make the central robustness claim impossible to evaluate against the paper's own evidence.
  Authors: We acknowledge the lack of detailed experimental information in the current submission. We will revise the Experimental Evaluation section to include a complete dataset description (including collection methodology, statistics on obfuscated vs. clean contracts, and vulnerability label distribution), the obfuscation generation procedure (detailing the specific techniques and parameters used to create adversarial samples), re-implementations of baselines with exact hyperparameters, full ablation studies with tables, and statistical significance tests (e.g., paired t-tests with p-values < 0.05 for the reported improvements). These additions will substantiate the empirical claims.
  Revision: yes
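The paired t-test the authors promise compares two systems on the same evaluation splits. A minimal sketch of the test statistic, using only the standard library, is shown below. The function name is hypothetical, and any scores passed to it in an example would be made-up placeholders, not results from the paper.

```python
import math
import statistics


def paired_t_statistic(scores_a, scores_b):
    """Return (t, degrees_of_freedom) for paired samples, e.g. per-fold F1 scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd_diff == 0:
        raise ValueError("all paired differences are identical; t is undefined")
    n = len(diffs)
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1
```

The p-value follows from a Student t distribution with n - 1 degrees of freedom; where SciPy is available, `scipy.stats.ttest_rel` returns the statistic and the p-value in one call.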
Circularity Check
No circularity; purely empirical framework with no derivation chain or self-referential definitions.
The paper presents ContractShield as an empirical multimodal architecture (CodeBERT + xLSTM + GATv2 with self-attention, cross-modal attention, and adaptive weighting) evaluated on obfuscated smart-contract datasets. No equations, first-principles derivations, or mathematical claims appear in the provided text. Performance numbers (89% Hamming score, 91% F1, 1-3% drop) are reported as direct measurements rather than predictions derived from fitted parameters or prior self-citations. The adaptive-weighting description is high-level and lacks an explicit loss term or reliability estimator, but because the paper advances no derivation that reduces to its inputs by construction, this does not trigger any of the enumerated circularity patterns. The work is self-contained against external benchmarks and receives the default non-circularity finding.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean: reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: ContractShield integrates (1) CodeBERT with a sliding window mechanism... (2) Extended long short-term memory (xLSTM)... (3) GATv2 to identify structural invariants... hierarchical fusion mechanism combining self-attention, cross-modal attention, and adaptive weighting
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: Empirical evaluation demonstrates resilience... 89% Hamming Score... 91% F1-score
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.