Rethinking Molecular Graph Backdoors under Chemistry-aware Admission
Pith reviewed 2026-06-26 08:54 UTC · model grok-4.3
The pith
Admission checks in molecular pipelines invalidate many graph backdoors, yet ChemBack shows chemically valid ones still succeed.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under ChemGuard, which admits a record only when its molecular string is sanitizable and the reconstructed graph matches the submitted graph, many existing graph-based backdoors lose efficacy because their poisons are chemically invalid or representation-inconsistent. ChemBack constructs chemically feasible motif-anchor attachments, ranks admitted candidates by Tanimoto similarity to clean target-class molecules using fingerprints, and remains model-free, relying only on structures, target labels, fingerprints, and public validity checks. Across benchmarks, validators, architectures, and defenses, it delivers high attack success with admitted poisons while preserving clean accuracy.
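The similarity ranking described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: fingerprints are represented as sets of on-bit indices, the candidate names and bit patterns are invented, and averaging over the target-class set is an assumed aggregation rule that the summary does not specify.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Fingerprint = FrozenSet[int]  # indices of the "on" bits (assumed encoding)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity of two bit sets: |a & b| / |a | b|."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_candidates(
    candidates: Dict[str, Fingerprint],
    target_class: Iterable[Fingerprint],
) -> List[Tuple[str, float]]:
    """Sort candidates by mean similarity to the target-class fingerprints.

    Using the mean is an assumption made for this sketch.
    """
    refs = list(target_class)
    if not refs:
        raise ValueError("need at least one target-class fingerprint")
    scored = [
        (name, sum(tanimoto(fp, r) for r in refs) / len(refs))
        for name, fp in candidates.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy fingerprints, invented purely for illustration.
target = [frozenset({1, 2, 3, 4}), frozenset({2, 3, 4, 5})]
pool = {
    "cand_a": frozenset({2, 3, 4}),
    "cand_b": frozenset({7, 8, 9}),
    "cand_c": frozenset({1, 2, 9}),
}
ranking = rank_candidates(pool, target)
```

Nothing in this ranking touches a model, gradient, or logit, which is the sense in which the construction is described as model-free.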
What carries the argument
ChemGuard, the admission protocol requiring a sanitizable molecular string and exact graph-string consistency before a record enters the pipeline.
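The admission rule reduces to a two-step predicate, sketched below in Python. The sanitizer and graph builder are injected as callables because the summary does not fix a toolkit; `toy_sanitize` and `toy_to_graph` are hypothetical stand-ins that operate on a made-up linear-chain notation. They are not SMILES parsers and do not reproduce any real validator.

```python
from typing import Callable, FrozenSet, Optional, Tuple

Graph = Tuple[Tuple[str, ...], FrozenSet[Tuple[int, int]]]  # (atoms, bonds)


def admit(
    mol_string: str,
    submitted_graph: Graph,
    sanitize: Callable[[str], Optional[str]],
    to_graph: Callable[[str], Graph],
) -> bool:
    """Admit a record only if the string sanitizes and its graph round-trips."""
    clean = sanitize(mol_string)
    if clean is None:  # step 1: the string is not sanitizable
        return False
    return to_graph(clean) == submitted_graph  # step 2: graph-string consistency


# --- Hypothetical toy chemistry: atoms joined in a chain, e.g. "C-C-O" ---
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def toy_sanitize(s: str) -> Optional[str]:
    """Return the string if every atom is known and within valence, else None."""
    atoms = s.split("-")
    if any(a not in MAX_VALENCE for a in atoms):
        return None
    for i, atom in enumerate(atoms):
        degree = (i > 0) + (i < len(atoms) - 1)  # neighbours in the chain
        if degree > MAX_VALENCE[atom]:
            return None
    return "-".join(atoms)


def toy_to_graph(s: str) -> Graph:
    """Build (atoms, bonds) from the toy chain notation."""
    atoms = tuple(s.split("-"))
    bonds = frozenset((i, i + 1) for i in range(len(atoms) - 1))
    return atoms, bonds


valid = admit("C-C-O", toy_to_graph("C-C-O"), toy_sanitize, toy_to_graph)
# Fluorine with two neighbours exceeds its valence: chemically invalid.
invalid = admit("C-F-C", toy_to_graph("C-F-C"), toy_sanitize, toy_to_graph)
# The submitted graph carries an extra bond that the string does not encode.
tampered_graph = (("C", "C", "O"), frozenset({(0, 1), (1, 2), (0, 2)}))
inconsistent = admit("C-C-O", tampered_graph, toy_sanitize, toy_to_graph)
```

The two rejected records correspond to the two failure modes named in the summary: a chemically invalid poison and a representation-inconsistent one.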
If this is right
- Chemically invalid or inconsistent poisons are filtered before training and therefore do not trigger the backdoor.
- Model-free construction using molecular structures and fingerprint similarity can still produce admitted poisons that achieve high attack success.
- Admission checks alone leave a remaining threat that requires additional defenses beyond sanitization.
- Clean accuracy can be preserved while attack success remains high when poisons respect chemical validity.
Where Pith is reading between the lines
- Molecular pipelines may benefit from additional chemical property checks beyond string sanitization and graph consistency.
- The motif-anchor approach could be adapted to other structured data domains that impose domain-specific validity filters.
- Attackers with access to public chemical databases could further refine similarity-based ranking without model access.
Load-bearing premise
That ChemGuard accurately captures the admission stage present in realistic molecular learning pipelines and that the reported benchmarks reflect typical validator and architecture combinations used in practice.
What would settle it
A test in which ChemBack poisons are submitted to an actual deployed molecular GNN pipeline using a validator or sanitization routine different from those evaluated and the attack success rate drops below the levels reported.
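Such a test needs an attack success rate (ASR) measured on the new pipeline. The Python sketch below assumes the common definition, the fraction of triggered non-target inputs that the model assigns to the target class; the summary does not state which definition the paper uses, and the labels and predictions here are invented.

```python
from typing import Hashable, Sequence


def attack_success_rate(
    preds_on_triggered: Sequence[Hashable],
    true_labels: Sequence[Hashable],
    target_label: Hashable,
) -> float:
    """Share of triggered, originally non-target inputs predicted as target."""
    if len(preds_on_triggered) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    hits = 0
    total = 0
    for pred, truth in zip(preds_on_triggered, true_labels):
        if truth == target_label:
            continue  # inputs already in the target class say nothing about the trigger
        total += 1
        hits += pred == target_label
    if total == 0:
        raise ValueError("no non-target inputs to evaluate")
    return hits / total


# Invented example: five triggered inputs, target class 1.
preds = [1, 1, 0, 1, 1]
truths = [0, 0, 0, 1, 0]
asr = attack_success_rate(preds, truths, target_label=1)
```

Comparing this value against the paper's reported figures on a validator outside the evaluated set is what would confirm or undercut the claim.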
Original abstract
Backdoor attacks on molecular graph neural networks (GNNs) are typically evaluated as abstract graph edits, but real molecular learning pipelines do not train on arbitrary graphs. Molecular records must first survive parsing, sanitization, canonicalization, and graph-string consistency checks. We formalize this overlooked admission stage as ChemGuard, an operational protocol for testing whether a submitted molecular record can enter a realistic learning pipeline, while complementing existing defenses. ChemGuard admits a record only when its molecular string is sanitizable and the graph reconstructed from that string matches the submitted molecular graph. Under this operational view, many existing graph-based backdoors lose much of their apparent efficacy because their poisons are chemically invalid or representation-inconsistent. We then show that admission checks alone are insufficient to rule out molecular backdoors. We propose ChemBack, an admission-aware molecular backdoor attack that constructs chemically feasible motif-anchor attachments and ranks admitted candidates by fingerprint-based Tanimoto similarity to clean target-class molecules. ChemBack is model-free during trigger selection, using molecular structures, target labels, fingerprints, and public validity checks, but no victim model, surrogate GNN, learned embedding, gradient, logit, or training-code access. Across molecular benchmarks, validators, architectures, and defenses, ChemBack achieves high attack success with fully admitted poisons while preserving clean accuracy. Our results reveal a two-sided lesson: chemistry-aware admission suppresses many graph-only backdoors, yet chemically valid and target-aligned molecular backdoors remain a practical threat.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript claims that molecular graph backdoors must be evaluated under realistic pipeline admission constraints, formalized as ChemGuard (a record is admitted only if its string is sanitizable and the graph reconstructed from the string exactly matches the submitted graph). Under this view, many existing graph-only backdoors produce chemically invalid or representation-inconsistent poisons and therefore lose efficacy. The authors introduce ChemBack, a model-free attack that constructs chemically feasible motif-anchor attachments, ranks candidates by fingerprint Tanimoto similarity to target-class molecules, and achieves high attack success rates with fully admitted poisons while preserving clean accuracy across benchmarks, validators, architectures, and defenses.
Significance. If the central claims hold, the work is significant for shifting the evaluation of molecular backdoors from abstract graph edits to chemistry-aware admission, demonstrating that admission filters suppress some but not all threats. Credit is given for the model-free construction that relies only on molecular structures, target labels, fingerprints, and public validity checks without any victim-model, surrogate, gradient, or training-code access.
major comments (2)
- [Abstract] Abstract: the claim that existing graph-based backdoors 'lose much of their apparent efficacy because their poisons are chemically invalid or representation-inconsistent' is load-bearing and rests on ChemGuard accurately reproducing the admission logic of the validators actually used in the reported benchmarks. No side-by-side comparison of admission outcomes on identical poison sets is supplied, so the reported drop could be an artifact of the specific ChemGuard implementation rather than a general property of chemistry-aware admission.
- [Abstract] Abstract: the assertion that ChemBack 'achieves high attack success with fully admitted poisons while preserving clean accuracy' across 'molecular benchmarks, validators, architectures, and defenses' is presented without any quantitative metrics, error bars, dataset sizes, or exclusion criteria. This absence prevents verification that the central empirical claim is supported.
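The side-by-side comparison requested in the first comment could take the form of an admission-rate table computed over identical poison sets. The Python sketch below is one hedged way to build it: the validator names, the attack names, and the records are all hypothetical placeholders, and the two lambda validators are trivial string checks standing in for real sanitization routines.

```python
from typing import Callable, Dict, List

Validator = Callable[[str], bool]


def admission_table(
    poison_sets: Dict[str, List[str]],
    validators: Dict[str, Validator],
) -> Dict[str, Dict[str, float]]:
    """Admission rate of each attack's poison set under each validator."""
    table: Dict[str, Dict[str, float]] = {}
    for attack, records in poison_sets.items():
        if not records:
            raise ValueError(f"poison set for {attack!r} is empty")
        table[attack] = {
            name: sum(bool(check(r)) for r in records) / len(records)
            for name, check in validators.items()
        }
    return table


# Hypothetical stand-ins; a real study would plug in actual validators.
validators = {
    "strict": lambda record: record.isalpha(),
    "lenient": lambda record: len(record) > 0,
}
# Invented records; every validator sees exactly the same sets.
poison_sets = {
    "attack_x": ["CCO", "C1C", "CCN", ""],
    "attack_y": ["CCO", "CCC"],
}
rates = admission_table(poison_sets, validators)
```

Because each validator is scored on the same records, a gap between columns reflects the validators themselves rather than differences in the poisons.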
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive comments. We address each major point below and will incorporate revisions to strengthen the manuscript.
- Referee: [Abstract] Abstract: the claim that existing graph-based backdoors 'lose much of their apparent efficacy because their poisons are chemically invalid or representation-inconsistent' is load-bearing and rests on ChemGuard accurately reproducing the admission logic of the validators actually used in the reported benchmarks. No side-by-side comparison of admission outcomes on identical poison sets is supplied, so the reported drop could be an artifact of the specific ChemGuard implementation rather than a general property of chemistry-aware admission.
Authors: We agree that a direct side-by-side comparison on identical poison sets would make the claim more robust and rule out implementation-specific artifacts. The manuscript defines ChemGuard from standard RDKit sanitization and graph-string roundtrip checks that are common in molecular ML pipelines, but we will add an explicit table in the revised version comparing admission rates for poisons from prior graph backdoor works under both their original reported settings and under ChemGuard. revision: yes
- Referee: [Abstract] Abstract: the assertion that ChemBack 'achieves high attack success with fully admitted poisons while preserving clean accuracy' across 'molecular benchmarks, validators, architectures, and defenses' is presented without any quantitative metrics, error bars, dataset sizes, or exclusion criteria. This absence prevents verification that the central empirical claim is supported.
Authors: The abstract is intentionally concise and omits specific numbers. The full manuscript reports the quantitative results (attack success rates, clean accuracies, standard deviations, dataset sizes, and exclusion criteria) across all listed benchmarks, validators, architectures, and defenses. To improve verifiability from the abstract itself, we will revise it to include a small number of key quantitative highlights (e.g., average ASR ranges and dataset counts) while remaining within length limits. revision: partial
Circularity Check
No significant circularity; derivation is self-contained
The paper defines ChemGuard operationally from standard molecular parsing/sanitization steps and evaluates backdoors under it, then introduces ChemBack as a model-free construction using public fingerprints and validity checks. No equations, fitted parameters, or predictions are present. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims rest on empirical results across validators and architectures rather than reducing by construction to the authors' own inputs or definitions. This matches the default expectation of a non-circular paper.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Molecular records must survive parsing, sanitization, canonicalization, and graph-string consistency checks before entering a learning pipeline.
invented entities (2)
- ChemGuard: no independent evidence
- ChemBack: no independent evidence