Rethinking Molecular Graph Backdoors under Chemistry-aware Admission

Chee Seng Chan; Khoa D. Doan; Kok-Seng Wong; Sze Jue Yang; Thinh T. H. Nguyen

arxiv: 2606.23361 · v1 · pith:4V5FDIJ5new · submitted 2026-06-22 · 💻 cs.LG · cs.AI

Thinh T. H. Nguyen, Sze Jue Yang, Khoa D. Doan, Chee Seng Chan, Kok-Seng Wong

Pith reviewed 2026-06-26 08:54 UTC · model grok-4.3

classification 💻 cs.LG cs.AI

keywords backdoor attacks · molecular graphs · graph neural networks · admission checks · chemical validity · ChemGuard · ChemBack

The pith

Admission checks in molecular pipelines invalidate many graph backdoors, yet ChemBack shows chemically valid ones still succeed.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that real molecular learning pipelines require records to survive parsing, sanitization, canonicalization, and graph-string consistency before any training occurs. Existing backdoor methods often produce poisons that fail these steps and therefore lose efficacy under realistic conditions. By defining ChemGuard as the admission protocol, the work demonstrates that many prior attacks become ineffective because their triggers are chemically invalid or representation-inconsistent. ChemBack then constructs feasible motif-anchor attachments and ranks them by fingerprint similarity to clean target molecules, achieving high attack success with fully admitted poisons while keeping clean accuracy intact. The central lesson is that admission filters some threats but does not eliminate the possibility of practical molecular backdoors.

Core claim

Under ChemGuard, which admits a record only when its molecular string is sanitizable and the reconstructed graph matches the submitted graph, many existing graph-based backdoors lose efficacy because their poisons are chemically invalid or representation-inconsistent. ChemBack constructs chemically feasible motif-anchor attachments, ranks admitted candidates by Tanimoto similarity to clean target-class molecules using fingerprints, and remains model-free, relying only on structures, target labels, fingerprints, and public validity checks. Across benchmarks, validators, architectures, and defenses, it delivers high attack success with admitted poisons while preserving clean accuracy.

What carries the argument

ChemGuard, the admission protocol requiring a sanitizable molecular string and exact graph-string consistency before a record enters the pipeline.
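
The two-part admission rule can be sketched as follows. This is a minimal illustration, not the paper's implementation: `Record`, `admit`, and `toy_parse` are hypothetical names, and `toy_parse` is a toy stand-in (linear chains of single-bonded atoms with a small valence table) for a real cheminformatics toolkit. The graph comparison here is an index-wise equality check, which assumes both graphs use the same atom ordering; a real pipeline would presumably compare canonical forms.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Optional, Tuple

# A molecular graph as (atom symbols by index, set of undirected bonds).
Graph = Tuple[Tuple[str, ...], FrozenSet[Tuple[int, int]]]


@dataclass(frozen=True)
class Record:
    string: str   # submitted molecular string (SMILES in a real pipeline)
    graph: Graph  # submitted molecular graph


def admit(record: Record, parse: Callable[[str], Optional[Graph]]) -> bool:
    """Admit only if the string sanitizes and its rebuilt graph matches the submitted graph."""
    rebuilt = parse(record.string)
    if rebuilt is None:  # parsing or sanitization failed
        return False
    return rebuilt == record.graph  # graph-string consistency (same atom ordering assumed)


# Toy validator (assumption): each character is one atom, neighbours are single-bonded.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def toy_parse(s: str) -> Optional[Graph]:
    if not s or any(ch not in MAX_VALENCE for ch in s):
        return None  # unknown element or empty string
    bonds = frozenset((i, i + 1) for i in range(len(s) - 1))
    for i, ch in enumerate(s):
        degree = sum(i in bond for bond in bonds)
        if degree > MAX_VALENCE[ch]:
            return None  # valence violation, e.g. F inside a chain
    return (tuple(s), bonds)
```

Under this sketch a record can be rejected for two distinct reasons: the string fails sanitization, or the string is valid but describes a different graph than the one submitted.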

If this is right

Chemically invalid or inconsistent poisons are filtered before training and therefore do not trigger the backdoor.
Model-free construction using molecular structures and fingerprint similarity can still produce admitted poisons that achieve high attack success.
Admission checks alone leave a remaining threat that requires additional defenses beyond sanitization.
Clean accuracy can be preserved while attack success remains high when poisons respect chemical validity.
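
The fingerprint-based ranking step can be illustrated with a short sketch. It assumes fingerprints are represented as sets of on-bit indices and that candidates are scored by their mean Tanimoto similarity to the clean target-class molecules; the source does not say how similarities are aggregated, so the mean is an assumption, and `rank_candidates` is a hypothetical helper name.

```python
from typing import Dict, FrozenSet, List, Tuple

Fingerprint = FrozenSet[int]  # indices of the bits that are set


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto (Jaccard) similarity: |A ∩ B| / |A ∪ B|, defined as 0 for two empty sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_candidates(
    candidates: Dict[str, Fingerprint],
    target_fps: List[Fingerprint],
) -> List[Tuple[str, float]]:
    """Sort candidates by mean similarity to target-class fingerprints, best first."""
    if not target_fps:
        raise ValueError("need at least one target-class fingerprint")
    scored = []
    for name, fp in candidates.items():
        mean_sim = sum(tanimoto(fp, t) for t in target_fps) / len(target_fps)
        scored.append((name, mean_sim))
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

Nothing in this scoring touches a model, gradient, or embedding, which is the sense in which the selection is model-free.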

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Molecular pipelines may benefit from additional chemical property checks beyond string sanitization and graph consistency.
The motif-anchor approach could be adapted to other structured data domains that impose domain-specific validity filters.
Attackers with access to public chemical databases could further refine similarity-based ranking without model access.

Load-bearing premise

That ChemGuard accurately captures the admission stage present in realistic molecular learning pipelines and that the reported benchmarks reflect typical validator and architecture combinations used in practice.

What would settle it

A test in which ChemBack poisons are submitted to an actual deployed molecular GNN pipeline using a validator or sanitization routine different from those evaluated and the attack success rate drops below the levels reported.

Figures

Figures reproduced from arXiv: 2606.23361 by Chee Seng Chan, Khoa D. Doan, Kok-Seng Wong, Sze Jue Yang, Thinh T. H. Nguyen.

**Figure 2.** Overview of ChemBack under ChemGuard. ChemBack forms a trigger library from candidate motifs, attaches them to sampled non-target hosts, and filters feasible motif-anchor attachments with ChemGuard for sanitization and graph-string consistency. It then selects admitted triggers by fingerprint-based Tanimoto similarity to clean target-class molecules. The selected trigger produces ChemGuard-admissible trai…

**Figure 3.** Operational ASR before and after enforcing …

**Figure 4.** Relation between Tanimoto similarity to the clean target class and clean-model target-class …

**Figure 5.** Sensitivity to poison rate α ∈ {1, 5, 10}%. For each dataset, the left plot shows CA on clean test molecules, and the right plot shows ASR on triggered non-targets. Curves are mean±std over 5 seeds.

[Plot residue from an adjacent figure: per-dataset panels, starting with (a) BBBP, plotting ASR (%), EPR (%), and Tanimoto similarity against λTan over the range 0.0 to 2.0; the remaining panels are truncated.]

**Figure 6.** Sensitivity to the Tanimoto reward weight …

**Figure 7.** Post-hoc embedding diagnostics for representative graph backdoors on BBBP. Clean …

**Figure 8.** Post-hoc embedding diagnostics for ChemBack on BACE and Tox21. Left panels overlay clean target molecules and ChemGuard-admissible poisons. Right panels plot target probability against clean-model MD². The clean model is used only for analysis; ChemBack selects triggers using model-free Tanimoto similarity.

[Body-text fragment captured with the caption:] … improve representation-space alignment relative to simpler graph edits, but they still do not guara…

**Figure 9.** Task-wise ASR for ChemBack under ChemGuard on multi-task benchmarks. Each box summarizes the distribution of ASR across seeds for each task. While ASR varies across endpoints, ChemBack remains consistently effective across the evaluated task panel.

[Body-text fragment captured with the caption:] The first factor is q_test and the second factor is r_cond. Since r_cond ∈ [0, 1], we obtain ASR ≤ q_test. This proposition explains the main evaluation gap. Even i…
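
The fragment captured with the Figure 9 caption names two factors and a bound. The equation itself did not survive extraction, so the following is a reconstruction rather than the paper's statement. It assumes ASR is the product of the two named factors, and it reads q_test as the probability that a triggered test input is admitted and r_cond as the success rate conditional on admission; both readings are assumptions.

```latex
\mathrm{ASR} \;=\; q_{\text{test}} \cdot r_{\text{cond}},
\qquad
r_{\text{cond}} \in [0, 1]
\;\Longrightarrow\;
\mathrm{ASR} \;\le\; q_{\text{test}} .
```

On this reading, an attack whose triggered inputs are admitted only 30% of the time cannot exceed 30% operational ASR, however reliably the admitted ones flip the prediction.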

read the original abstract

Backdoor attacks on molecular graph neural networks (GNNs) are typically evaluated as abstract graph edits, but real molecular learning pipelines do not train on arbitrary graphs. Molecular records must first survive parsing, sanitization, canonicalization, and graph-string consistency checks. We formalize this overlooked admission stage as ChemGuard, an operational protocol for testing whether a submitted molecular record can enter a realistic learning pipeline, while complementing existing defenses. ChemGuard admits a record only when its molecular string is sanitizable and the graph reconstructed from that string matches the submitted molecular graph. Under this operational view, many existing graph-based backdoors lose much of their apparent efficacy because their poisons are chemically invalid or representation-inconsistent. We then show that admission checks alone are insufficient to rule out molecular backdoors. We propose ChemBack, an admission-aware molecular backdoor attack that constructs chemically feasible motif-anchor attachments and ranks admitted candidates by fingerprint-based Tanimoto similarity to clean target-class molecules. ChemBack is model-free during trigger selection, using molecular structures, target labels, fingerprints, and public validity checks, but no victim model, surrogate GNN, learned embedding, gradient, logit, or training-code access. Across molecular benchmarks, validators, architectures, and defenses, ChemBack achieves high attack success with fully admitted poisons while preserving clean accuracy. Our results reveal a two-sided lesson: chemistry-aware admission suppresses many graph-only backdoors, yet chemically valid and target-aligned molecular backdoors remain a practical threat.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper shows that many graph backdoor attacks on molecular GNNs fail basic chemical admission checks, while a new model-free construction succeeds with valid poisons.

The core contribution here is the operational split between abstract graph edits and the actual admission filters that molecular pipelines apply. ChemGuard formalizes the check as sanitizable string plus exact graph-string reconstruction match, and the authors use it to show that prior backdoor poisons often get rejected before they reach training. That observation is useful because it explains why some published attack numbers may not translate to deployed systems.

ChemBack then builds on this by attaching motifs to anchors in a way that passes the same checks, ranks candidates by Tanimoto similarity on fingerprints, and stays model-free. The approach avoids gradients or surrogate models, which keeps the threat model realistic for an attacker who only has access to public tools and the target label distribution. If the experiments hold up with the numbers and controls that are missing from the abstract, this is a clear step forward from treating molecules as arbitrary graphs.
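
To make the attachment step concrete, here is a rough sketch of one way to enumerate feasible anchors and attach a motif. The paper's actual feasibility rules are not given in this summary, so everything here is an assumption: an anchor is treated as feasible when it has enough spare valence for the new bond (implicit hydrogens are counted as replaceable), the valence table is a toy, and `free_valence`, `feasible_anchors`, and `attach` are hypothetical names.

```python
from typing import List, Sequence, Tuple

Bond = Tuple[int, int, int]  # (atom index, atom index, bond order)

# Toy valence table (assumption); real toolkits handle charges, aromaticity, etc.
MAX_VALENCE = {"C": 4, "N": 3, "O": 2, "F": 1}


def free_valence(atoms: Sequence[str], bonds: Sequence[Bond], i: int) -> int:
    """Valence left on atom i after subtracting the orders of its explicit bonds."""
    used = sum(order for a, b, order in bonds if i in (a, b))
    return MAX_VALENCE[atoms[i]] - used


def feasible_anchors(atoms: Sequence[str], bonds: Sequence[Bond], bond_order: int = 1) -> List[int]:
    """Host atoms that can accept one more bond of the given order."""
    return [i for i in range(len(atoms)) if free_valence(atoms, bonds, i) >= bond_order]


def attach(
    host_atoms: Sequence[str],
    host_bonds: Sequence[Bond],
    motif_atoms: Sequence[str],
    motif_bonds: Sequence[Bond],
    anchor: int,
    motif_attach: int = 0,
    bond_order: int = 1,
) -> Tuple[List[str], List[Bond]]:
    """Join motif to host with a new bond between `anchor` and the motif's attachment atom."""
    if free_valence(host_atoms, host_bonds, anchor) < bond_order:
        raise ValueError("anchor has no spare valence for this bond")
    if free_valence(motif_atoms, motif_bonds, motif_attach) < bond_order:
        raise ValueError("motif attachment atom has no spare valence for this bond")
    offset = len(host_atoms)  # motif atom indices are shifted past the host atoms
    atoms = list(host_atoms) + list(motif_atoms)
    bonds = list(host_bonds) + [(a + offset, b + offset, o) for a, b, o in motif_bonds]
    bonds.append((anchor, motif_attach + offset, bond_order))
    return atoms, bonds
```

A valence pre-filter like this only narrows the candidate set; in the pipeline described above, each attached molecule would still have to pass the admission check before it counts as a usable poison.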

The main soft spot is whether ChemGuard reproduces the exact sequence of canonicalization, valence, and aromaticity rules used in the benchmark validators. The stress-test note flags this, and the abstract does not include a direct side-by-side on the same poison sets, so the reported drop in existing attack success could partly reflect implementation differences rather than a general property. Dataset sizes, exact success rates, and exclusion criteria are also not visible here, which makes it hard to judge effect sizes or reproducibility.

This work is aimed at people who evaluate or defend GNNs in cheminformatics and drug discovery. It deserves a serious referee because the admission-stage point is concrete and the attack construction is technically straightforward to test. I would send it out for review with a request for the missing experimental details and a validation that ChemGuard matches the pipelines it claims to model.

Referee Report

2 major / 0 minor

Summary. The manuscript claims that molecular graph backdoors must be evaluated under realistic pipeline admission constraints, formalized as ChemGuard (a record is admitted only if its string is sanitizable and the graph reconstructed from the string exactly matches the submitted graph). Under this view, many existing graph-only backdoors produce chemically invalid or representation-inconsistent poisons and therefore lose efficacy. The authors introduce ChemBack, a model-free attack that constructs chemically feasible motif-anchor attachments, ranks candidates by fingerprint Tanimoto similarity to target-class molecules, and achieves high attack success rates with fully admitted poisons while preserving clean accuracy across benchmarks, validators, architectures, and defenses.

Significance. If the central claims hold, the work is significant for shifting the evaluation of molecular backdoors from abstract graph edits to chemistry-aware admission, demonstrating that admission filters suppress some but not all threats. Credit is given for the model-free construction that relies only on molecular structures, target labels, fingerprints, and public validity checks without any victim-model, surrogate, gradient, or training-code access.

major comments (2)

[Abstract] The claim that existing graph-based backdoors 'lose much of their apparent efficacy because their poisons are chemically invalid or representation-inconsistent' is load-bearing and rests on ChemGuard accurately reproducing the admission logic of the validators actually used in the reported benchmarks. No side-by-side comparison of admission outcomes on identical poison sets is supplied, so the reported drop could be an artifact of the specific ChemGuard implementation rather than a general property of chemistry-aware admission.
[Abstract] The assertion that ChemBack 'achieves high attack success with fully admitted poisons while preserving clean accuracy' across 'molecular benchmarks, validators, architectures, and defenses' is presented without any quantitative metrics, error bars, dataset sizes, or exclusion criteria. This absence prevents verification that the central empirical claim is supported.
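
The side-by-side comparison asked for in the first comment could take roughly the following shape. This is a sketch of the bookkeeping only: validators are modeled as plain callables from a molecular string to a boolean, and `admission_table` and `disagreement_rate` are hypothetical helper names, not part of the paper or of any toolkit.

```python
from typing import Callable, Dict, Iterable

Validator = Callable[[str], bool]  # True when the record would be admitted


def admission_table(poisons: Iterable[str], validators: Dict[str, Validator]) -> Dict[str, float]:
    """Admission rate of the same poison set under each named validator."""
    items = list(poisons)
    if not items:
        raise ValueError("poison set is empty")
    return {
        name: sum(1 for p in items if check(p)) / len(items)
        for name, check in validators.items()
    }


def disagreement_rate(poisons: Iterable[str], check_a: Validator, check_b: Validator) -> float:
    """Fraction of poisons on which two validators reach different admission decisions."""
    items = list(poisons)
    if not items:
        raise ValueError("poison set is empty")
    return sum(1 for p in items if check_a(p) != check_b(p)) / len(items)
```

Reporting the disagreement rate alongside the per-validator admission rates matters because two validators can admit the same fraction of poisons while rejecting different ones.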

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. We address each major point below and will incorporate revisions to strengthen the manuscript.

Referee: [Abstract] The claim that existing graph-based backdoors 'lose much of their apparent efficacy because their poisons are chemically invalid or representation-inconsistent' is load-bearing and rests on ChemGuard accurately reproducing the admission logic of the validators actually used in the reported benchmarks. No side-by-side comparison of admission outcomes on identical poison sets is supplied, so the reported drop could be an artifact of the specific ChemGuard implementation rather than a general property of chemistry-aware admission.

Authors: We agree that a direct side-by-side comparison on identical poison sets would make the claim more robust and rule out implementation-specific artifacts. The manuscript defines ChemGuard from standard RDKit sanitization and graph-string roundtrip checks that are common in molecular ML pipelines, but we will add an explicit table in the revised version comparing admission rates for poisons from prior graph backdoor works under both their original reported settings and under ChemGuard.

revision: yes

Referee: [Abstract] The assertion that ChemBack 'achieves high attack success with fully admitted poisons while preserving clean accuracy' across 'molecular benchmarks, validators, architectures, and defenses' is presented without any quantitative metrics, error bars, dataset sizes, or exclusion criteria. This absence prevents verification that the central empirical claim is supported.

Authors: The abstract is intentionally concise and omits specific numbers. The full manuscript reports the quantitative results (attack success rates, clean accuracies, standard deviations, dataset sizes, and exclusion criteria) across all listed benchmarks, validators, architectures, and defenses. To improve verifiability from the abstract itself, we will revise it to include a small number of key quantitative highlights (e.g., average ASR ranges and dataset counts) while remaining within length limits.

revision: partial

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained

The paper defines ChemGuard operationally from standard molecular parsing/sanitization steps and evaluates backdoors under it, then introduces ChemBack as a model-free construction using public fingerprints and validity checks. No equations, fitted parameters, or predictions are present. No self-citations are invoked as load-bearing uniqueness theorems or ansatzes. The central claims rest on empirical results across validators and architectures rather than reducing by construction to the authors' own inputs or definitions. This matches the default expectation of a non-circular paper.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 2 invented entities

The central claim rests on the domain assumption that real molecular pipelines enforce sanitization and graph-string consistency, and on the empirical claim that ChemBack poisons remain chemically valid under those checks. No free parameters or invented physical entities are described.

axioms (1)

domain assumption: Molecular records must survive parsing, sanitization, canonicalization, and graph-string consistency checks before entering a learning pipeline.
Stated in the abstract as the basis for ChemGuard; this premise defines which poisons are admitted.

invented entities (2)

ChemGuard (no independent evidence)
purpose: Operational protocol formalizing the admission stage for molecular records.
Newly defined filter that existing backdoors are tested against.

ChemBack (no independent evidence)
purpose: Admission-aware backdoor attack using motif-anchor attachments and Tanimoto ranking.
New attack method claimed to produce admitted, effective poisons.

pith-pipeline@v0.9.1-grok · 5815 in / 1495 out tokens · 25532 ms · 2026-06-26T08:54:25.202180+00:00 · methodology

Reference graph

Works this paper leans on

48 extracted references · 1 canonical work pages

[1]

Deeper insights into graph convolutional networks for semi-supervised learning

Qimai Li, Zhichao Han, and Xiao-Ming Wu. Deeper insights into graph convolutional networks for semi-supervised learning. InProceedings of the AAAI conference on artificial intelligence, volume 32, 2018

2018
[2]

Graph neural networks: A review of methods and applications

Jie Zhou, Ganqu Cui, Shengding Hu, Zhengyan Zhang, Cheng Yang, Zhiyuan Liu, Lifeng Wang, Changcheng Li, and Maosong Sun. Graph neural networks: A review of methods and applications. AI open, 1:57–81, 2020

2020
[3]

Moleculenet: a benchmark for molecular machine learning.Chemical science, 9(2):513–530, 2018

Zhenqin Wu, Bharath Ramsundar, Evan N Feinberg, Joseph Gomes, Caleb Geniesse, Aneesh S Pappu, Karl Leswing, and Vijay Pande. Moleculenet: a benchmark for molecular machine learning.Chemical science, 9(2):513–530, 2018

2018
[4]

Neural message passing for quantum chemistry

Justin Gilmer, Samuel S Schoenholz, Patrick F Riley, Oriol Vinyals, and George E Dahl. Neural message passing for quantum chemistry. InInternational conference on machine learning, pages 1263–1272. Pmlr, 2017

2017
[5]

Motif-backdoor: Rethinking the backdoor attack on graph neural networks via motifs.IEEE Transactions on Computational Social Systems, 11(2):2479–2493, 2023

Haibin Zheng, Haiyang Xiong, Jinyin Chen, Haonan Ma, and Guohan Huang. Motif-backdoor: Rethinking the backdoor attack on graph neural networks via motifs.IEEE Transactions on Computational Social Systems, 11(2):2479–2493, 2023

2023
[6]

Unnoticeable backdoor attacks on graph neural networks

Enyan Dai, Minhua Lin, Xiang Zhang, and Suhang Wang. Unnoticeable backdoor attacks on graph neural networks. InProceedings of the ACM Web Conference 2023, pages 2263–2273, 2023

2023
[7]

Rethinking graph backdoor attacks: A distribution-preserving perspective

Zhiwei Zhang, Minhua Lin, Enyan Dai, and Suhang Wang. Rethinking graph backdoor attacks: A distribution-preserving perspective. InProceedings of the 30th ACM SIGKDD conference on knowledge discovery and data mining, pages 4386–4397, 2024

2024
[8]

Lr-gnn: A graph neural network based on link representation for predicting molecular associations.Briefings in Bioinformatics, 23(1): bbab513, 2022

Chuanze Kang, Han Zhang, Zhuo Liu, Shenwei Huang, and Yanbin Yin. Lr-gnn: A graph neural network based on link representation for predicting molecular associations.Briefings in Bioinformatics, 23(1): bbab513, 2022

2022
[9]

Pre-training graph neural networks for link prediction in biomedical networks.Bioinformatics, 38(8): 2254–2262, 2022

Yahui Long, Min Wu, Yong Liu, Yuan Fang, Chee Keong Kwoh, Jinmiao Chen, Jiawei Luo, and Xiaoli Li. Pre-training graph neural networks for link prediction in biomedical networks.Bioinformatics, 38(8): 2254–2262, 2022

2022
[10]

A compact review of molecular property prediction with graph neural networks.Drug Discovery Today: Technologies, 37:1–12, 2020

Oliver Wieder, Stefan Kohlbacher, Mélaine Kuenemann, Arthur Garon, Pierre Ducrot, Thomas Seidel, and Thierry Langer. A compact review of molecular property prediction with graph neural networks.Drug Discovery Today: Technologies, 37:1–12, 2020

2020
[11]

Enhancing drug discovery with ai: Predictive modeling of pharmacokinetics using graph neural networks and ensemble learning.Intelligent Pharmacy, 3(2):127–140, 2025

R Satheeskumar. Enhancing drug discovery with ai: Predictive modeling of pharmacokinetics using graph neural networks and ensemble learning.Intelligent Pharmacy, 3(2):127–140, 2025

2025
[12]

Rdkit documentation.Release, 1(1-79):4, 2013

Greg Landrum. Rdkit documentation.Release, 1(1-79):4, 2013

2013
[13]

Open babel: An open chemical toolbox.Journal of cheminformatics, 3(1):33, 2011

Noel M O’Boyle, Michael Banck, Craig A James, Chris Morley, Tim Vandermeersch, and Geoffrey R Hutchison. Open babel: An open chemical toolbox.Journal of cheminformatics, 3(1):33, 2011

2011
[14]

Indigo: universal cheminformatics api.Journal of cheminformatics, 3(Suppl 1):P4, 2011

Dmitry Pavlov, Mikhail Rybalkin, Boris Karulin, Mikhail Kozhevnikov, Alexey Savelyev, and A Churinov. Indigo: universal cheminformatics api.Journal of cheminformatics, 3(Suppl 1):P4, 2011

2011
[15]

Extended-connectivity fingerprints.Journal of chemical information and modeling, 50(5):742–754, 2010

David Rogers and Mathew Hahn. Extended-connectivity fingerprints.Journal of chemical information and modeling, 50(5):742–754, 2010

2010
[16]

Why is tanimoto index an appropriate choice for fingerprint-based similarity calculations?Journal of cheminformatics, 7(1):20, 2015

Dávid Bajusz, Anita Rácz, and Károly Héberger. Why is tanimoto index an appropriate choice for fingerprint-based similarity calculations?Journal of cheminformatics, 7(1):20, 2015

2015
[17]

The graph neural network model.IEEE Transactions on Neural Networks, 20(1):61–80, 2009

Franco Scarselli, Marco Gori, Ah Chung Tsoi, Markus Hagenbuchner, and Gabriele Monfardini. The graph neural network model.IEEE Transactions on Neural Networks, 20(1):61–80, 2009. doi: 10.1109/ TNN.2008.2005605

arXiv 2009
[18]

David Duvenaud, Dougal Maclaurin, Jorge Aguilera-Iparraguirre, Rafael Gómez-Bombarelli, Timothy Hirzel, Alán Aspuru-Guzik, and Ryan P. Adams. Convolutional networks on graphs for learning molecular fingerprints. InProceedings of the 29th International Conference on Neural Information Processing Systems - Volume 2, NIPS’15, page 2224–2232, Cambridge, MA, U...

2015
[19]

Kipf and Max Welling

Thomas N. Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations, 2017. URL https://openreview.net/forum? id=SJU4ayYgl. 10

2017
[20]

Hamilton, Rex Ying, and Jure Leskovec

William L. Hamilton, Rex Ying, and Jure Leskovec. Inductive representation learning on large graphs. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, page 1025–1035, Red Hook, NY , USA, 2017. Curran Associates Inc. ISBN 9781510860964

2017
[21]

Self- supervised graph transformer on large-scale molecular data.Advances in neural information processing systems, 33:12559–12571, 2020

Yu Rong, Yatao Bian, Tingyang Xu, Weiyang Xie, Ying Wei, Wenbing Huang, and Junzhou Huang. Self- supervised graph transformer on large-scale molecular data.Advances in neural information processing systems, 33:12559–12571, 2020

2020
[22]

Badnets: Identifying vulnerabilities in the machine learning model supply chain.arXiv preprint arXiv:1708.06733, 2017

Tianyu Gu, Brendan Dolan-Gavitt, and Siddharth Garg. Badnets: Identifying vulnerabilities in the machine learning model supply chain.arXiv preprint arXiv:1708.06733, 2017

Pith/arXiv arXiv 2017
[23]

Input-aware dynamic backdoor attack.Advances in Neural Information Processing Systems, 33:3454–3464, 2020

Tuan Anh Nguyen and Anh Tran. Input-aware dynamic backdoor attack.Advances in Neural Information Processing Systems, 33:3454–3464, 2020

2020
[24]

Lira: Learnable, imperceptible and robust backdoor attacks

Khoa Doan, Yingjie Lao, Weijie Zhao, and Ping Li. Lira: Learnable, imperceptible and robust backdoor attacks. InProceedings of the IEEE/CVF international conference on computer vision, pages 11966–11976, 2021

2021
[25]

Backdoor attacks and defenses in federated learning: Survey, challenges and future research directions

Thuy Dung Nguyen, Tuan Nguyen, Phi Le Nguyen, Hieu H Pham, Khoa D Doan, and Kok-Seng Wong. Backdoor attacks and defenses in federated learning: Survey, challenges and future research directions. Engineering Applications of Artificial Intelligence, 127:107166, 2024

2024
[26]

Graph backdoor

Zhaohan Xi, Ren Pang, Shouling Ji, and Ting Wang. Graph backdoor. In30th USENIX security symposium (USENIX Security 21), pages 1523–1540, 2021

2021
[27]

Bolun Wang, Yuanshun Yao, Shawn Shan, Huiying Li, Bimal Viswanath, Haitao Zheng, and Ben Y . Zhao. Neural cleanse: Identifying and mitigating backdoor attacks in neural networks. In2019 IEEE Symposium on Security and Privacy (SP), pages 707–723, 2019. doi: 10.1109/SP.2019.00031

work page doi:10.1109/sp.2019.00031 2019
[28]

Spectral signatures in backdoor attacks.Advances in neural information processing systems, 31, 2018

Brandon Tran, Jerry Li, and Aleksander Madry. Spectral signatures in backdoor attacks.Advances in neural information processing systems, 31, 2018

2018
[29]

Dshield: Defending against backdoor attacks on graph neural networks via discrepancy learning

Hao Yu, Chuan Ma, Xinhang Wan, Jun Wang, Tao Xiang, Meng Shen, and Xinwang Liu. Dshield: Defending against backdoor attacks on graph neural networks via discrepancy learning. InNetwork and Distributed System Security Symposium, NDSS, 2025

2025
[30]

Robustness inspired graph backdoor defense.arXiv preprint arXiv:2406.09836, 2024

Zhiwei Zhang, Minhua Lin, Junjie Xu, Zongyu Wu, Enyan Dai, and Suhang Wang. Robustness inspired graph backdoor defense.arXiv preprint arXiv:2406.09836, 2024

arXiv 2024
[31]

Robust graph convolutional networks against adversarial attacks

Dingyuan Zhu, Ziwei Zhang, Peng Cui, and Wenwu Zhu. Robust graph convolutional networks against adversarial attacks. InProceedings of the 25th ACM SIGKDD international conference on knowledge discovery & data mining, pages 1399–1407, 2019

2019
[32]

Gnnguard: Defending graph neural networks against adversarial attacks

Xiang Zhang and Marinka Zitnik. Gnnguard: Defending graph neural networks against adversarial attacks. Advances in neural information processing systems, 33:9263–9275, 2020

2020
[33]

Graph structure learning for robust graph neural networks

Wei Jin, Yao Ma, Xiaorui Liu, Xianfeng Tang, Suhang Wang, and Jiliang Tang. Graph structure learning for robust graph neural networks. InProceedings of the 26th ACM SIGKDD international conference on knowledge discovery & data mining, pages 66–74, 2020

2020
[34]

Certified robustness of graph neural networks against adversarial structural perturbation

Binghui Wang, Jinyuan Jia, Xiaoyu Cao, and Neil Zhenqiang Gong. Certified robustness of graph neural networks against adversarial structural perturbation. InProceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, pages 1645–1653, 2021

2021
[35]

Distributed backdoor attacks on federated graph learning and certified defenses

Yuxin Yang, Qiang Li, Jinyuan Jia, Yuan Hong, and Binghui Wang. Distributed backdoor attacks on federated graph learning and certified defenses. InProceedings of the 2024 on ACM SIGSAC Conference on Computer and Communications Security, pages 2829–2843, 2024

2024
[36]

Deterministic certification of graph neural networks against graph poisoning attacks with arbitrary perturbations

Jiate Li, Meng Pang, Yun Dong, and Binghui Wang. Deterministic certification of graph neural networks against graph poisoning attacks with arbitrary perturbations. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 5020–5029, 2025

2025
[37]

A bayesian approach to in silico blood-brain barrier penetration modeling.Journal of chemical information and modeling, 52(6): 1686–1697, 2012

Ines Filipa Martins, Ana L Teixeira, Luis Pinheiro, and Andre O Falcao. A bayesian approach to in silico blood-brain barrier penetration modeling.Journal of chemical information and modeling, 52(6): 1686–1697, 2012

2012
[38]

Computational modeling of β-secretase 1 (bace-1) inhibitors using ligand based approaches.Journal of chemical information and modeling, 56(10):1936–1949, 2016

Govindan Subramanian, Bharath Ramsundar, Vijay Pande, and Rajiah Aldrin Denny. Computational modeling of β-secretase 1 (bace-1) inhibitors using ligand based approaches.Journal of chemical information and modeling, 56(10):1936–1949, 2016. 11

1936
[39]

The sider database of drugs and side effects

Michael Kuhn, Ivica Letunic, Lars Juhl Jensen, and Peer Bork. The sider database of drugs and side effects. Nucleic acids research, 44(D1):D1075–D1079, 2016

2016
[40]

Ruili Huang, Menghang Xia, Dac-Trung Nguyen, Tongan Zhao, Srilatha Sakamuru, Jinghua Zhao, Sampada A Shahane, Anna Rossoshek, and Anton Simeonov. Tox21challenge to build predictive models of nuclear receptor and stress response pathways as mediated by exposure to environmental chemicals and drugs.Frontiers in Environmental Science, 3:85, 2016

2016
[41]

Pubchem’s bioassay database.Nucleic acids research, 40(D1):D400–D412, 2012

Yanli Wang, Jewen Xiao, Tugba O Suzek, Jian Zhang, Jiyao Wang, Zhigang Zhou, Lianyi Han, Karen Karapetyan, Svetlana Dracheva, Benjamin A Shoemaker, et al. Pubchem’s bioassay database.Nucleic acids research, 40(D1):D400–D412, 2012

2012
[42]

Maximum unbiased validation (muv) data sets for virtual screening based on pubchem bioactivity data.Journal of chemical information and modeling, 49(2):169–184, 2009

Sebastian G Rohrer and Knut Baumann. Maximum unbiased validation (muv) data sets for virtual screening based on pubchem bioactivity data.Journal of chemical information and modeling, 49(2):169–184, 2009

2009
[43]

Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings.Advanced drug delivery reviews, 23(1-3):3–25, 1997

Christopher A Lipinski, Franco Lombardo, Beryl W Dominy, and Paul J Feeney. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings.Advanced drug delivery reviews, 23(1-3):3–25, 1997

1997
[44]

Molecular properties that influence the oral bioavailability of drug candidates.Journal of medicinal chemistry, 45(12):2615–2623, 2002

Daniel F Veber, Stephen R Johnson, Hung-Yuan Cheng, Brian R Smith, Keith W Ward, and Kenneth D Kopple. Molecular properties that influence the oral bioavailability of drug candidates.Journal of medicinal chemistry, 45(12):2615–2623, 2002

2002
[45]

Quantifying the chemical beauty of drugs.Nature chemistry, 4(2):90–98, 2012

G Richard Bickerton, Gaia V Paolini, Jérémy Besnard, Sorel Muresan, and Andrew L Hopkins. Quantifying the chemical beauty of drugs.Nature chemistry, 4(2):90–98, 2012

2012
[46]

Peter C Austin. Balance diagnostics for comparing the distribution of baseline covariates between treatment groups in propensity-score matched samples.Statistics in medicine, 28(25):3083–3107, 2009

2009
[47]

Donald J Schuirmann. A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability. Journal of pharmacokinetics and biopharmaceutics, 15(6):657–680, 1987
[48]

D Lakens. Equivalence tests: A practical primer for t tests, correlations, and meta-analyses. Social psychological and personality science, 8(4):355–362, 2017

A Real-world Motivating Scenario

To ground the threat model, consider an organization that uses a centralized molecular classifier to screen third-party chemical submissions before downstream ...
