GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?

Bolin Shen; Kaixiang Zhao; Shayok Chakraborty; Yushun Dong; Yuyang Dai

arXiv: 2605.12827 · v2 · pith:B6X7YNKJnew · submitted 2026-05-12 · 💻 cs.CR · cs.AI · cs.LG

Pith reviewed 2026-06-30 21:51 UTC · model grok-4.3

keywords: graph neural networks · model extraction attacks · watermarking · ownership verification · benchmark · adversarial machine learning

The pith

Stealing a graph neural network succeeds at medium query budgets while most defenses fail to stop or trace it.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper builds GraphIP-Bench to run a single consistent black-box protocol that measures how readily an adversary can extract a GNN surrogate from query responses and whether ownership defenses can block or identify the theft. It evaluates twelve attacks against twelve defenses across ten graphs that span homophilic, heterophilic, and large-scale regimes, three backbones, and three tasks. The results show that extraction reaches high fidelity at medium budgets, that watermarks remain detectable on the original model but largely disappear on the surrogate, and that heterophilic structure raises the cost of extraction while architecture mismatch lowers but does not eliminate success.

Core claim

Under the unified benchmark, model extraction attacks on GNNs achieve high fidelity at medium query budgets, most defenses do not change extraction success, several watermarks verify on the protected model yet lose most verification signal on the extracted surrogate, heterophilic graphs are systematically harder to steal, and cross-architecture mismatch reduces but does not prevent extraction.
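Since the core claim is stated in terms of fidelity, a small sketch may help pin the term down. It assumes the usual definition from the model-extraction literature (the share of evaluation nodes on which surrogate and target predict the same label); the paper's exact metric may differ, and the function names and toy predictions below are hypothetical.

```python
from typing import Sequence


def agreement(a: Sequence[int], b: Sequence[int]) -> float:
    """Fraction of positions where two label sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("need at least one prediction")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def fidelity(target_preds: Sequence[int], surrogate_preds: Sequence[int]) -> float:
    """Assumed definition: how often the surrogate copies the target's output."""
    return agreement(target_preds, surrogate_preds)


def accuracy(true_labels: Sequence[int], preds: Sequence[int]) -> float:
    """Task utility: how often predictions match the ground truth."""
    return agreement(true_labels, preds)


# Toy predictions on eight evaluation nodes (illustrative values only).
truth = [0, 1, 1, 2, 1, 2, 1, 0]
target = [0, 1, 1, 2, 0, 2, 1, 0]
surrogate = [0, 1, 2, 2, 0, 2, 1, 1]

print(fidelity(target, surrogate))  # 0.75
print(accuracy(truth, surrogate))   # 0.625
```

The two numbers can diverge: a surrogate that faithfully copies the target's mistakes scores high on fidelity without gaining accuracy, which is why a benchmark of this kind reports both.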

What carries the argument

GraphIP-Bench, the unified benchmark that standardizes twelve extraction attacks, twelve defenses, ten graphs, three backbones, three tasks, and joint attack-defense evaluation under one black-box protocol.

If this is right

GNN services become practically stealable once an adversary can issue a few thousand to tens of thousands of queries.
Ownership watermarks must be re-evaluated on extracted surrogates rather than only on the original model.
Heterophilic graph structure supplies measurable resistance to extraction that homophilic graphs lack.
Architecture mismatch between target and surrogate lowers fidelity but still permits usable extraction.
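The homophilic/heterophilic split above is usually quantified by edge homophily, the fraction of edges whose endpoints share a label; the paper quotes 0.699 for OGBN-Arxiv. The sketch below assumes that standard definition, and the toy graph and function name are hypothetical rather than taken from the benchmark code.

```python
from typing import List, Sequence, Tuple


def edge_homophily(edges: Sequence[Tuple[int, int]], labels: Sequence[int]) -> float:
    """Fraction of edges that connect two nodes with the same label."""
    if not edges:
        raise ValueError("graph has no edges")
    same = sum(labels[u] == labels[v] for u, v in edges)
    return same / len(edges)


# Toy graph: four nodes, five undirected edges, two classes.
toy_edges: List[Tuple[int, int]] = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
toy_labels = [0, 0, 1, 1]

print(edge_homophily(toy_edges, toy_labels))  # 0.4
```

Values near 1 indicate a homophilic graph and values near 0 a heterophilic one; the claim above is that graphs toward the low end are costlier to extract.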

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Similar joint attack-defense benchmarks could expose comparable gaps when applied to image or language models.
Future defenses may need to treat robustness against extraction as a primary requirement instead of an afterthought.
Large-scale or proprietary graphs could exhibit different extraction costs than the public graphs tested here.

Load-bearing premise

The chosen twelve attacks, twelve defenses, ten graphs spanning homophilic and heterophilic regimes, three backbones, and three tasks under a single black-box protocol are representative of real-world GNN deployments and threat models.

What would settle it

Re-running the full attack-defense matrix on a fresh collection of graphs or under different query budgets and finding that extraction fidelity stays low at medium budgets or that watermarks retain verification power on surrogates would falsify the central claims.

Figures

Figures reproduced from arXiv: 2605.12827 by Bolin Shen, Kaixiang Zhao, Shayok Chakraborty, Yushun Dong, Yuyang Dai.

**Figure 1.** Sample efficiency across ten datasets and four regimes. Extension to large-scale and heterophilic graphs. The seven graphs in the original protocol are small and homophilic, which limits the conclusions to one structural regime. To extend the protocol along three independent axes, we evaluate the same twelve attacks on three additional graphs: OGBN-Arxiv (169,343 nodes, 40 classes, edge homophily 0.699) …

**Figure 3.** Surrogate fidelity (%) at budget 0.25× on the three additional graphs for all twelve attacks (mean over three seeds, whiskers are ± one std). Among the watermarking and integrity defenses, BackdoorWM offers the best protection-utility balance: it reaches the highest median fidelity (80.07%), perfect verification, and only a 3.27 pp median utility drop. ImperceptibleWM also reaches perfect verification, b…

**Figure 4.** Protection-utility scatter across ten datasets and three seeds per defense (one point per dataset-seed run). Upper-left is best. To answer RQ3, we evaluate the defended model’s task utility and its alignment with the original target together with ownership verification on a fixed verification set under the unified protocol in Section 3. All defenses protect the same architecture and use identical splits; …

**Figure 5.** Defense effectiveness across all ten datasets: (a) utility drop and (b) ownership verification.

**Figure 6.** Radar profile of the five watermarking and integrity defenses across six axes. F1, fidelity, …

**Figure 7.** Heatmap view of the seven information-limiting and query-detection defenses across all …

**Figure 8.** Peak GPU memory (GB) on a symmetric-log scale, aggregated across all ten datasets. Bars …

**Figure 9.** Wall-clock cost summary on a log scale. (a) Total attack time (min) at …

**Figure 10.** Budget–metric curves on all ten datasets (columns) for accuracy, fidelity, and macro …

**Figure 11.** Regime sensitivity across budgets. Cells show the ratio between the average fidelity in a …

**Figure 12.** Per-attack surrogate-fidelity curves on all ten datasets in the …

**Figure 13.** Regime sensitivity from two complementary views: differential (left) and absolute (right).

**Figure 14.** Distribution of per-run attack wall-clock time (log scale, in minutes) across all (dataset, …

**Figure 15.** Pairwise Pearson correlation between the twelve attacks, where each attack is represented …

**Figure 16.** Per-dataset utility loss of the seven information-limiting and query-detection defenses …

**Figure 17.** RQ5 joint evaluation on Computers at 0.25× (mean over three seeds). (a) Surrogate fidelity (%) against the five watermarks. (b) Surrogate fidelity (%) against the seven information-limiting defenses; PRADA and AdaptMisinfo reduce the strongest attacks from 80–92% to 25–55%. (c) Watermark verification rate (%) on the extracted surrogate; Integrity survives at near 100% on most attacks, whereas SurviveWM, R…

**Figure 18.** Distribution and structural correlates of watermark survival.

**Figure 19.** Distributional views of surrogate fidelity. (a) shows that fidelity distributions are nearly …

**Figure 20.** Per-dataset heatmap grid of joint surrogate fidelity (%). Each panel is a …

**Figure 21.** Cross-task heatmap of surrogate fidelity (%) on three task settings: link prediction on …

Original abstract

Graph neural networks (GNNs) deployed as cloud services can be stolen through model-extraction attacks, which train a surrogate from query responses to reproduce the target's behavior, and a growing line of ownership defenses tries to prevent or trace such theft. This paper asks two questions: how hard is it to steal a GNN, and can we stop it? Prior work cannot answer either, because experiments use inconsistent datasets, threat models, and metrics. We introduce GraphIP-Bench, a unified benchmark that evaluates both sides under a single black-box protocol. GraphIP-Bench integrates twelve extraction attacks, twelve defenses spanning watermarking, output perturbation, and query-pattern detection, ten public graphs covering homophilic, heterophilic, and large-scale regimes, three GNN backbones, and three graph-learning tasks. It reports fidelity, task utility, ownership verification, and computational cost on shared splits, queries, and budgets. We further add a joint attack-and-defense track that runs every attack on every defended target and measures watermark verification on the resulting surrogate, exposing how much protection a defense retains after extraction. The empirical picture is clear: stealing a GNN is easy at medium query budgets and most defenses do not change this; several watermarks verify reliably on the protected model but lose most of their verification signal on the extracted surrogate, exposing a gap that single-model evaluations miss; and heterophilic graphs are systematically harder to steal, while a cross-architecture mismatch between target and surrogate reduces but does not prevent extraction. We release GraphIP-Bench with reproducible scripts and configurations, and integrate the attacks and defenses into the PyGIP library. Code: https://github.com/LabRAI/GraphIP-Bench. Library: https://labrai.github.io/PyGIP/index.html.
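To make the attack loop in the abstract concrete (query the target, train a surrogate on the responses, then measure fidelity), here is a deliberately tiny sketch. Everything in it is an assumption for illustration: the target is a hidden linear rule on mean-aggregated features rather than a real GNN, the surrogate is a nearest-centroid classifier, the graph is random, and the budget is taken as a fraction of the node count. None of this is the benchmark's code or one of its twelve attacks.

```python
import random


def mean_aggregate(features, neighbors):
    """One round of mean aggregation over each node and its listed neighbors."""
    out = []
    for v, feat in enumerate(features):
        group = [v] + neighbors[v]
        out.append([sum(features[u][d] for u in group) / len(group)
                    for d in range(len(feat))])
    return out


def make_target(hidden_weights):
    """Hypothetical black-box target: callers only ever see a hard label."""
    def predict(h):
        return int(sum(w * x for w, x in zip(hidden_weights, h)) > 0)
    return predict


def train_centroid_surrogate(inputs, responses):
    """Fit one centroid per label the target returned; predict the nearest one."""
    sums, counts = {}, {}
    for h, y in zip(inputs, responses):
        acc = sums.setdefault(y, [0.0] * len(h))
        for d, x in enumerate(h):
            acc[d] += x
        counts[y] = counts.get(y, 0) + 1
    centroids = {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

    def predict(h):
        return min(centroids,
                   key=lambda y: sum((a - b) ** 2 for a, b in zip(h, centroids[y])))
    return predict


rng = random.Random(0)
n = 200
features = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(n)]
# Random toy neighbor lists; may contain repeats or self-references.
neighbors = [[rng.randrange(n) for _ in range(3)] for _ in range(n)]
hidden = mean_aggregate(features, neighbors)

target = make_target([1.0, -0.5])
budget = int(0.25 * n)  # assumption: budget expressed as a fraction of nodes
query_nodes = rng.sample(range(n), budget)

# Attacker side: spend the budget, then train only on the returned labels.
responses = [target(hidden[v]) for v in query_nodes]
surrogate = train_centroid_surrogate([hidden[v] for v in query_nodes], responses)

# Evaluator side: compare surrogate and target on nodes that were never queried.
queried = set(query_nodes)
held_out = [v for v in range(n) if v not in queried]
agree = sum(surrogate(hidden[v]) == target(hidden[v]) for v in held_out)
extraction_fidelity = agree / len(held_out)
print(f"queries={budget} fidelity={extraction_fidelity:.3f}")
```

Sweeping `budget` and re-running gives a budget-fidelity curve in miniature, which is the kind of measurement the benchmark reports on shared splits and budgets.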

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: a private letter to a colleague

GraphIP-Bench gives the field a single-protocol benchmark for GNN extraction and defenses, but its headline claims about stealing ease rest on whether the fixed set of 12 attacks, 12 defenses, and 10 graphs is representative enough.

The paper's core contribution is GraphIP-Bench, which runs twelve extraction attacks against twelve defenses across ten graphs (homophilic, heterophilic, large-scale), three backbones, and three tasks under one black-box protocol. It also adds a joint track that measures how well watermarks survive on the extracted surrogate. This setup is new; earlier papers used mismatched datasets, budgets, and metrics, so comparisons were difficult. Releasing the code and folding the components into PyGIP is a practical step that makes the benchmark usable.

The reported picture (medium-budget extraction works reliably, most defenses add little, watermarks lose signal on surrogates, and heterophily raises the bar) follows directly from running every attack and defense under the same protocol. The joint track is the part that single-model papers miss, and its value depends on the full paper documenting the exact query budgets, fidelity metrics, and verification procedures clearly.

The main soft spot is selection. The stress-test concern is fair: the trends only generalize if these twelve attacks include the stronger or adaptive variants that would target heterophilic structure, and if the twelve defenses cover the main classes without obvious omissions. The abstract does not spell out the justification for the exact lists, so a reader has to check whether the choices were driven by coverage or by what was already implemented. Minor gaps in documentation of splits or cost measurement would be easy to fix.

This is a benchmark paper, so it is for researchers working on GNN security or cloud model protection who need a shared testbed. It is coherent on its own terms and deserves peer review to verify the methodology details and selection rationale.

Referee Report

2 major / 2 minor

Summary. The paper introduces GraphIP-Bench, a unified benchmark for evaluating model extraction attacks against GNNs and ownership defenses under a consistent black-box protocol. It combines 12 attacks, 12 defenses (watermarking, perturbation, detection), 10 graphs (homophilic, heterophilic, large-scale), 3 backbones, and 3 tasks, reporting fidelity, utility, ownership verification, and cost. Key findings include that GNN stealing is feasible at medium query budgets, most defenses fail to prevent it, watermarks verify on protected models but not on extracted surrogates, heterophilic graphs are harder to steal, and architecture mismatch reduces but does not eliminate extraction success. The benchmark and code are released for reproducibility.

Significance. If the benchmark components are representative, this work provides the first standardized evaluation framework for GNN intellectual property protection, revealing critical gaps in existing defenses that single-model tests miss. The release of reproducible scripts, configurations, and integration into the PyGIP library is a notable strength that enables future research.

major comments (2)

[Benchmark construction] Benchmark construction section: the selection of the twelve attacks, twelve defenses, and ten graphs is presented without a systematic justification or coverage analysis of the broader space of attacks (e.g., adaptive or query-efficient variants targeting heterophilic structure) and defenses; without evidence that these choices are representative rather than convenience-driven, the central claims that 'stealing a GNN is easy at medium query budgets and most defenses do not change this' and that 'heterophilic graphs are systematically harder' do not follow from the reported experiments.
[Joint attack-and-defense track] Joint attack-and-defense track (results section): the claim that watermarks 'verify reliably on the protected model but lose most of their verification signal on the extracted surrogate' is load-bearing for the gap identified in single-model evaluations, yet the paper reports this only for 'several' watermarks without quantifying the fraction of verification signal retained across all watermarking defenses or providing statistical tests for the loss.

minor comments (2)

[Abstract] Abstract: the phrase 'medium query budgets' is used without a concrete range or reference to the specific budgets in the experimental setup, which would aid immediate assessment of the findings.
[Experimental setup] Experimental setup: the single black-box protocol is clearly stated, but a brief discussion of why other threat models (e.g., gray-box) were excluded would strengthen the scope statement.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on GraphIP-Bench. The comments highlight important aspects of benchmark design and result presentation. We address each major comment below with clarifications and proposed revisions where appropriate.

Referee: [Benchmark construction] Benchmark construction section: the selection of the twelve attacks, twelve defenses, and ten graphs is presented without a systematic justification or coverage analysis of the broader space of attacks (e.g., adaptive or query-efficient variants targeting heterophilic structure) and defenses; without evidence that these choices are representative rather than convenience-driven, the central claims that 'stealing a GNN is easy at medium query budgets and most defenses do not change this' and that 'heterophilic graphs are systematically harder' do not follow from the reported experiments.

Authors: The twelve attacks were chosen to span the primary categories in the GNN extraction literature (gradient-based, query-based, and structure-aware methods), the twelve defenses cover the three main IP-protection paradigms (watermarking, output perturbation, and query detection), and the ten graphs were selected to include both homophilic and heterophilic regimes plus large-scale examples. These choices draw directly from the most-cited prior works to ensure the benchmark reflects current practice rather than convenience. We acknowledge that an exhaustive enumeration of every adaptive variant is outside the paper's scope. In revision we will add an explicit subsection in the benchmark construction section that (i) tabulates the category coverage, (ii) cites the representative papers for each selected method, and (iii) explains why the observed trends on medium budgets and heterophilic difficulty are expected to generalize within the covered categories.

Revision: partial

Referee: [Joint attack-and-defense track] Joint attack-and-defense track (results section): the claim that watermarks 'verify reliably on the protected model but lose most of their verification signal on the extracted surrogate' is load-bearing for the gap identified in single-model evaluations, yet the paper reports this only for 'several' watermarks without quantifying the fraction of verification signal retained across all watermarking defenses or providing statistical tests for the loss.

Authors: The joint-track experiments were in fact run on every watermarking defense; the word 'several' in the abstract and discussion was used to highlight the consistent pattern rather than to indicate a subset. To make the claim fully quantitative we will add a new table (and accompanying text) that reports, for each watermarking defense, the verification accuracy on the protected target versus the extracted surrogate, the retained signal fraction, and the results of paired statistical tests. This will replace the current qualitative statement with precise numbers and significance levels.

Revision: yes
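The retained signal fraction and paired test promised in this response could be computed roughly as follows. This is a sketch under stated assumptions: the retained fraction is taken to be surrogate verification rate divided by target verification rate, the paired test is an exact one-sided sign-flip permutation test (the rebuttal does not say which test will be used), and the rates are made-up numbers, not results from the paper.

```python
from itertools import product
from statistics import mean
from typing import Sequence


def retained_fraction(target_rate: float, surrogate_rate: float) -> float:
    """Share of the target's verification rate that survives on the surrogate."""
    if target_rate <= 0:
        raise ValueError("target verification rate must be positive")
    return surrogate_rate / target_rate


def paired_sign_flip_pvalue(target_rates: Sequence[float],
                            surrogate_rates: Sequence[float]) -> float:
    """Exact one-sided paired permutation test for a positive mean drop.

    Under the null hypothesis each paired difference is equally likely to
    have either sign, so every sign pattern is enumerated.
    """
    diffs = [t - s for t, s in zip(target_rates, surrogate_rates)]
    observed = mean(diffs)
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if mean(sign * d for sign, d in zip(signs, diffs)) >= observed - 1e-12:
            hits += 1
    return hits / total


# Hypothetical verification rates for one watermark over six attack runs.
target_rates = [1.0, 0.98, 1.0, 0.95, 0.99, 1.0]
surrogate_rates = [0.30, 0.25, 0.40, 0.20, 0.35, 0.28]

print(retained_fraction(target_rates[0], surrogate_rates[0]))  # 0.3
print(paired_sign_flip_pvalue(target_rates, surrogate_rates))  # 0.015625
```

With six pairs the smallest attainable p-value is 1/64, so a table built this way needs enough runs per defense for the test to have any resolution; exhaustive enumeration also becomes impractical beyond a few dozen pairs, where random sign sampling is the usual substitute.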

Circularity Check

0 steps flagged

No circularity: empirical benchmark with no derivations or fitted predictions

The paper is a benchmarking study that integrates existing attacks and defenses under a unified protocol and reports empirical metrics (fidelity, utility, verification) on fixed datasets and budgets. No equations, derivations, parameter fits, or self-citation chains are present that reduce any claimed result to its inputs by construction. The central claims rest on the representativeness of the chosen components rather than any self-referential reduction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an empirical benchmarking paper; the central claims rest on the experimental setup and chosen components rather than mathematical axioms, fitted parameters, or invented entities.

pith-pipeline@v0.9.1-grok · 5881 in / 1214 out tokens · 30260 ms · 2026-06-30T21:51:49.251171+00:00 · methodology

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

AGDN: Learning to Solve Traveling Salesman Problem with Anisotropic Graph Diffusion Network
cs.LG 2026-06 unverdicted novelty 7.0

AGDN is a new GNN framework using a MixScore matrix and anisotropic graph diffusion to outperform prior methods on TSP instances across sizes and distributions.

Reference graph

Works this paper leans on

43 extracted references · 14 canonical work pages · cited by 1 Pith paper · 4 internal anchors

[1]

Pregip: Watermarking the pretraining of graph neural networks for deep intellectual property protection.arXiv preprint arXiv:2402.04435, 2024

Enyan Dai, Minhua Lin, and Suhang Wang. Pregip: Watermarking the pretraining of graph neural networks for deep intellectual property protection.arXiv preprint arXiv:2402.04435, 2024

work page arXiv 2024
[2]

A comprehensive survey on trustworthy graph neural networks: Privacy, robustness, fairness, and explainability.Machine Intelligence Research, pages 1–51, 2024

Enyan Dai, Tianxiang Zhao, Huaisheng Zhu, Junjie Xu, Zhimeng Guo, Hui Liu, Jiliang Tang, and Suhang Wang. A comprehensive survey on trustworthy graph neural networks: Privacy, robustness, fairness, and explainability.Machine Intelligence Research, pages 1–51, 2024

2024
[3]

Adversarial model extraction on graph neural networks.arXiv preprint arXiv:1912.07721, 2019

David DeFazio and Arti Ramesh. Adversarial model extraction on graph neural networks.arXiv preprint arXiv:1912.07721, 2019

work page arXiv 1912
[4]

A realistic model extraction attack against graph neural networks.Knowledge-Based Systems, page 112144, 2024

Faqian Guan, Tianqing Zhu, Hanjin Tong, and Wanlei Zhou. A realistic model extraction attack against graph neural networks.Knowledge-Based Systems, page 112144, 2024

2024
[5]

Inductive representation learning on large graphs.Advances in neural information processing systems, 30, 2017

Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs.Advances in neural information processing systems, 30, 2017

2017
[6]

Prada: protecting against dnn model stealing attacks

Mika Juuti, Sebastian Szyller, Samuel Marchal, and N Asokan. Prada: protecting against dnn model stealing attacks. In2019 IEEE European Symposium on Security and Privacy (EuroS&P), pages 512–527, 2019

2019
[7]

Defending against model stealing attacks with adaptive misinformation

Sanjay Kariyappa and Moinuddin K Qureshi. Defending against model stealing attacks with adaptive misinformation. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 770–778, 2020

2020
[8]

Model extraction warning in mlaas paradigm

Manish Kesarwani, Bhaskar Mukhoty, Vijay Arya, and Sameep Mehta. Model extraction warning in mlaas paradigm. InProceedings of the 34th Annual Computer Security Applications Conference, pages 371–380, 2018

2018
[9]

Semi-Supervised Classification with Graph Convolutional Networks

Thomas N Kipf and Max Welling. Semi-supervised classification with graph convolutional networks.arXiv preprint arXiv:1609.02907, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[10]

Intellectual property in graph-based machine learning as a service: Attacks and defenses

Lincan Li, Bolin Shen, Chenxi Zhao, Yuxiang Sun, Kaixiang Zhao, Shirui Pan, and Yushun Dong. Intellectual property in graph-based machine learning as a service: Attacks and defenses. arXiv preprint arXiv:2508.19641, 2025

work page arXiv 2025
[11]

Drug repurposing based on the dtd-gnn graph neural network: revealing the relationships among drugs, targets and diseases

Wenjun Li, Wanjun Ma, Mengyun Yang, and Xiwei Tang. Drug repurposing based on the dtd-gnn graph neural network: revealing the relationships among drugs, targets and diseases. BMC genomics, 25, 2024

2024
[12]

Model extraction attacks revisited

Jiacheng Liang, Ren Pang, Changjiang Li, and Ting Wang. Model extraction attacks revisited. InProceedings of the 19th ACM Asia Conference on Computer and Communications Security, pages 1231–1245, 2024

2024
[13]

How to steer your adversary: Targeted and efficient model stealing defenses with gradient redirection

Mantas Mazeika, Bo Li, and David Forsyth. How to steer your adversary: Targeted and efficient model stealing defenses with gradient redirection. InInternational conference on machine learning, pages 15241–15254. PMLR, 2022

2022
[14]

Dag-net: Double attentive graph neural network for trajectory forecasting

Alessio Monti, Alessia Bertugli, Simone Calderara, and Rita Cucchiara. Dag-net: Double attentive graph neural network for trajectory forecasting. In2020 25th international conference on pattern recognition (ICPR), pages 2551–2558. IEEE, 2021. 10

2021
[15]

TUDataset: A collection of benchmark datasets for learning with graphs

Christopher Morris, Nils M Kriege, Franka Bause, Kristian Kersting, Petra Mutzel, and Marion Neumann. Tudataset: A collection of benchmark datasets for learning with graphs.arXiv preprint arXiv:2007.08663, 2020

work page internal anchor Pith review Pith/arXiv arXiv 2007
[16]

Knockoff nets: Stealing functionality of black-box models

Tribhuvanesh Orekondy, Bernt Schiele, and Mario Fritz. Knockoff nets: Stealing functionality of black-box models. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4954–4963, 2019

2019
[17]

Intellectual property protection of dnn models.World Wide Web, 26(4):1877–1911, 2023

Sen Peng, Yufei Chen, Jie Xu, Zizhuo Chen, Cong Wang, and Xiaohua Jia. Intellectual property protection of dnn models.World Wide Web, 26(4):1877–1911, 2023

1911
[18]

A critical look at the evaluation of gnns under heterophily: Are we really making progress?arXiv preprint arXiv:2302.11640,

Oleg Platonov, Denis Kuznedelev, Michael Diskin, Artem Babenko, and Liudmila Prokhorenkova. A critical look at the evaluation of gnns under heterophily: Are we really making progress?arXiv preprint arXiv:2302.11640, 2023

work page arXiv 2023
[19]

Benchmarking Knowledge-Extraction Attack and Defense on Retrieval-Augmented Generation

Zhisheng Qi, Utkarsh Sahu, Li Ma, Haoyu Han, Ryan Rossi, Franck Dernoncourt, Ma- hantesh Halappanavar, Nesreen Ahmed, Yushun Dong, Yue Zhao, et al. Benchmarking knowledge-extraction attack and defense on retrieval-augmented generation.arXiv preprint arXiv:2602.09319, 2026

work page internal anchor Pith review Pith/arXiv arXiv 2026
[20]

A review of adversarial attacks and defenses on graphs

Hanjin Sun, Wen Yang, and Yatie Xiao. A review of adversarial attacks and defenses on graphs. InProceedings of the 4th International Conference on Artificial Intelligence and Computer Engineering, page 416–421, 2024

2024
[21]

Yu, Lifang He, and Bo Li

Lichao Sun, Yingtong Dou, Carl Yang, Kai Zhang, Ji Wang, Philip S. Yu, Lifang He, and Bo Li. Adversarial attack and defense on graph data: A survey.IEEE Transactions on Knowledge and Data Engineering, 35(8):7693–7711, 2023

2023
[22]

Deep intellectual property protection: A survey.arXiv preprint arXiv:2304.14613, 2023

Yuchen Sun, Tianpeng Liu, Panhe Hu, Qing Liao, Shaojing Fu, Nenghai Yu, Deke Guo, Yongxiang Liu, and Li Liu. Deep intellectual property protection: A survey.arXiv preprint arXiv:2304.14613, 2023

work page arXiv 2023
[23]

Stealing machine learning models via prediction {APIs}

Florian Tramèr, Fan Zhang, Ari Juels, Michael K Reiter, and Thomas Ristenpart. Stealing machine learning models via prediction {APIs}. In25th USENIX security symposium (USENIX Security 16), pages 601–618, 2016

2016
[24]

Graph Attention Networks

Petar Veliˇckovi´c, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks.arXiv preprint arXiv:1710.10903, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[25]

Asim Waheed, Vasisht Duddu, and N. Asokan. Grove: Ownership verification of graph neural networks using embeddings. In2024 IEEE Symposium on Security and Privacy (SP), pages 2460–2477, 2024

2024
[26]

Making watermark survive model extraction attacks in graph neural networks

Haiming Wang, Zhikun Zhang, Min Chen, and Shibo He. Making watermark survive model extraction attacks in graph neural networks. InICC 2023-IEEE International Conference on Communications, pages 57–62, 2023

2023
[27]

Cega: A cost-effective approach for graph-based model extraction and acquisition.arXiv preprint arXiv:2506.17709, 2025

Zebin Wang, Menghan Lin, Bolin Shen, Ken Anderson, Molei Liu, Tianxi Cai, and Yushun Dong. Cega: A cost-effective approach for graph-based model extraction and acquisition.arXiv preprint arXiv:2506.17709, 2025

work page arXiv 2025
[28]

Model extraction attacks on graph neural networks: Taxonomy and realisation

Bang Wu, Xiangwen Yang, Shirui Pan, and Xingliang Yuan. Model extraction attacks on graph neural networks: Taxonomy and realisation. InProceedings of the 2022 ACM on Asia conference on computer and communications security, pages 337–350, 2022

2022
[29]

Securing graph neural networks in mlaas: A comprehensive realization of query-based integrity verification

Bang Wu, Xingliang Yuan, Shuo Wang, Qi Li, Minhui Xue, and Shirui Pan. Securing graph neural networks in mlaas: A comprehensive realization of query-based integrity verification. In 2024 IEEE Symposium on Security and Privacy (SP), pages 2534–2552. IEEE, 2024

2024
[30]

Trustworthy graph learning: Reliability, explainability, and privacy protection

Bingzhe Wu, Yatao Bian, Hengtong Zhang, Jintang Li, Junchi Yu, Liang Chen, Chaochao Chen, and Junzhou Huang. Trustworthy graph learning: Reliability, explainability, and privacy protection. InProceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pages 4838–4839, 2022. 11

2022
[31]

Watermarking graph neural networks based on backdoor attacks

Jing Xu, Stefanos Koffas, O ˘guzhan Ersoy, and Stjepan Picek. Watermarking graph neural networks based on backdoor attacks. In2023 IEEE 8th European Symposium on Security and Privacy (EuroS&P), pages 1179–1197, 2023

2023
[32]

Intellectual property protection for deep learning models: Taxonomy, methods, attacks, and evaluations.IEEE Transactions on Artificial Intelligence, 3(6):908–923, 2021

Mingfu Xue, Yushu Zhang, Jian Wang, and Weiqiang Liu. Intellectual property protection for deep learning models: Taxonomy, methods, attacks, and evaluations.IEEE Transactions on Artificial Intelligence, 3(6):908–923, 2021

2021
[33]

Dgrec: Graph neural network for recommendation with diversified embedding generation

Liangwei Yang, Shengjie Wang, Yunzhe Tao, Jiankai Sun, Xiaolong Liu, Philip S Yu, and Taiqing Wang. Dgrec: Graph neural network for recommendation with diversified embedding generation. InProceedings of the sixteenth ACM international conference on web search and data mining, pages 661–669, 2023

2023
[34]

Gnnfingers: A fingerprinting framework for verifying ownerships of graph neural networks

Xiaoyu You, Youhe Jiang, Jianwei Xu, Mi Zhang, and Min Yang. Gnnfingers: A fingerprinting framework for verifying ownerships of graph neural networks. InProceedings of the ACM on Web Conference 2024, pages 652–663, 2024

2024
[35]

Trustworthy graph neural networks: Aspects, methods and trends.arXiv preprint arXiv:2205.07424, 2022

He Zhang, Bang Wu, Xingliang Yuan, Shirui Pan, Hanghang Tong, and Jian Pei. Trustworthy graph neural networks: Aspects, methods and trends.arXiv preprint arXiv:2205.07424, 2022

work page arXiv 2022
[36]

An imperceptible and owner-unique watermarking method for graph neural networks

Linji Zhang, Mingfu Xue, Leo Yu Zhang, Yushu Zhang, and Weiqiang Liu. An imperceptible and owner-unique watermarking method for graph neural networks. InProceedings of the ACM Turing Award Celebration Conference-China 2024, pages 108–113, 2024

2024
[37]

arXiv preprint arXiv:2502.16065 , year=

Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, and Yushun Dong. A survey of model extraction attacks and defenses in distributed computing environments.arXiv preprint arXiv:2502.16065, 2025

work page arXiv 2025
[38]

arXiv preprint arXiv:2506.22521 , year =

Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, and Yushun Dong. A survey on model extraction attacks and defenses for large language models.arXiv preprint arXiv:2506.22521, 2025

work page arXiv 2025
[39]

Kaixiang Zhao, Lincan Li, Kaize Ding, Neil Zhenqiang Gong, Yue Zhao, and Yushun Dong. A systematic survey of model extraction attacks and defenses: State-of-the-art and perspectives. arXiv preprint arXiv:2508.15031, 2025
[40]

Xiangyu Zhao, Hanzhou Wu, and Xinpeng Zhang. Watermarking graph neural networks by random graphs. In 2021 9th International Symposium on Digital Forensics and Security (ISDFS), pages 1–6, 2021
[41]

Qinkai Zheng, Xu Zou, Yuxiao Dong, Yukuo Cen, Da Yin, Jiarong Xu, Yang Yang, and Jie Tang. Graph robustness benchmark: Benchmarking the adversarial robustness of graph machine learning. In Thirty-fifth Conference on Neural Information Processing Systems Datasets and Benchmarks Track, 2021
[42]

Yuanxin Zhuang, Chuan Shi, Mengmei Zhang, Jinghui Chen, Lingjuan Lyu, Pan Zhou, and Lichao Sun. Unveiling the secrets without data: Can graph neural networks be exploited through Data-Free model extraction attacks? In 33rd USENIX Security Symposium (USENIX Security 24), pages 5251–5268, 2024
[43]

the extraction can achieve up to ∼80% fidelity

with NVIDIA driver 570.195.03 and a CUDA 12.8 driver capability. Login nodes have no GPU access, so all timing and memory numbers are recorded from Slurm-allocated GPU jobs only. Software stack. The full pipeline runs in a dedicated Conda environment named graphip: Python 3.11.15, PyTorch 2.2.1 + cu121, DGL 2.1.0 + cu121, PyTorch Geometric 2.7.0, OGB 1.3.6,...
