Pitfalls of Graph Neural Network Evaluation
Abstract
Semi-supervised node classification in graphs is a fundamental problem in graph mining, and the recently proposed graph neural networks (GNNs) have achieved unparalleled results on this task. Due to their massive success, GNNs have attracted a lot of attention, and many novel architectures have been put forward. In this paper we show that existing evaluation strategies for GNN models have serious shortcomings. We show that using the same train/validation/test splits of the same datasets, as well as making significant changes to the training procedure (e.g. early stopping criteria) precludes a fair comparison of different architectures. We perform a thorough empirical evaluation of four prominent GNN models and show that considering different splits of the data leads to dramatically different rankings of models. Even more importantly, our findings suggest that simpler GNN architectures are able to outperform the more sophisticated ones if the hyperparameters and the training procedure are tuned fairly for all models.
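The protocol the abstract argues for is easy to operationalize: evaluate every architecture on many random train/validation/test splits and report the distribution of scores, rather than a single number from one fixed split. Below is a minimal sketch of that loop, assuming a synthetic dataset and two ordinary scikit-learn classifiers as stand-ins for GNN architectures; the paper itself evaluates real GNNs on citation graphs.

```python
# Minimal sketch of multi-split evaluation: rank models across many random
# splits instead of one fixed split. Dataset and classifiers are stand-ins.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=2000, n_features=32,
                           n_informative=16, random_state=0)
models = {
    "simple":        lambda: LogisticRegression(max_iter=1000),
    "sophisticated": lambda: MLPClassifier(hidden_layer_sizes=(64,), max_iter=300),
}
scores = {name: [] for name in models}
for seed in range(10):                      # ten random splits, not one
    # small training set mimics the sparse-label semi-supervised setting
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.8, stratify=y, random_state=seed)
    for name, build in models.items():
        scores[name].append(build().fit(X_tr, y_tr).score(X_te, y_te))

for name, accs in scores.items():           # report a distribution, not a point
    print(f"{name}: mean={np.mean(accs):.3f} std={np.std(accs):.3f}")
```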
Citing papers
- Neighbourhood Transformer: Switchable Attention for Monophily-Aware Graph Learning
  Neighbourhood Transformers apply local self-attention for monophily-aware graph learning, guarantee expressiveness at least as strong as message-passing GNNs, and outperform prior methods on node classification across ten datasets while cutting memory and time costs substantially (see the masked-attention sketch after this list).
- SAGE: A Self-Evolving Agentic Graph-Memory Engine for Structure-Aware Associative Memory
  SAGE is a self-evolving agentic graph-memory engine that dynamically constructs and refines structured memory graphs via writer-reader feedback, yielding performance gains on multi-hop QA, open-domain retrieval, and long-term agent benchmarks.
- Random-Set Graph Neural Networks
  RS-GNNs predict random sets over classes using belief functions to jointly produce class probabilities and epistemic uncertainty estimates for graph nodes (see the belief-mass sketch after this list).
- Learning Graph Foundation Models on Riemannian Graph-of-Graphs
  R-GFM constructs multi-scale Riemannian graph-of-graphs to learn geometry-adaptive representations, reducing structural domain generalization error and delivering up to 49% relative gains on downstream graph tasks.
- UFO: A Unified Flow-Oriented Framework for Robust Continual Graph Learning
  UFO combines flow-based generative replay with instance-level reliability scoring to handle both catastrophic forgetting and catastrophic remembering from noisy supervision in evolving graphs, outperforming baselines on four datasets.
- From Model to Data (M2D): Shifting Complexity from GNNs to Graphs for Transparent Graph Learning
  M2D distillation augments input graphs with model-derived features and structure, letting simple student GNNs match teacher performance while exposing mechanisms such as attention and fairness directly in the data (see the feature-augmentation sketch after this list).
- Adversarial Graph Neural Network Benchmarks: Towards Practical and Fair Evaluation
  A large-scale standardized benchmark of GNN attacks and defenses reveals that the choice of target nodes and the training process of the attacked model can completely distort measured attack effectiveness.
- Improving Graph Few-shot Learning with Hyperbolic Space and Denoising Diffusion
  IMPRESS improves graph few-shot learning by learning representations in hyperbolic space and using denoising diffusion to better approximate target distributions from few support samples.
- Toward a universal foundation model for graph-structured data
  A pretrained graph model using feature-agnostic structural prompts matches or exceeds supervised baselines and shows strong zero-shot and few-shot transfer on held-out biomedical graphs, with a 21.8% ROC-AUC gain on SagePPI.
- Analytic Drift Resister for Non-Exemplar Continual Graph Learning
  ADR achieves theoretically zero-forgetting class-incremental graph learning by combining backpropagation adaptation with ridge-regression-based layer-wise merging of GNN linear transformations (see the ridge-regression sketch after this list).
- Rethinking Generalization in Graph Neural Networks: A Structural Complexity Perspective
  GNN generalization depends explicitly on graph structural complexity measured by effective edges, with a new regularization method shown to balance underfitting and overfitting.
- Layer Embedding Deep Fusion Graph Neural Network
  LEDF-GNN fuses multi-layer embeddings nonlinearly and runs parallel processing on original and reconstructed topologies to capture long-range dependencies and mitigate heterophily-induced misaggregation in deep GNNs.
- Learning How Much to Think: Difficulty-Aware Dynamic MoEs for Graph Node Classification
  D2MoE dynamically allocates expert resources in graph MoEs via difficulty-driven top-p routing based on predictive entropy, yielding higher accuracy and lower memory/time costs on node classification benchmarks (see the top-p routing sketch after this list).
- Unified Graph Prompt Learning via Low-Rank Graph Message Prompting
  LR-GMP unifies graph prompting via a low-rank Graph Message Prompt paradigm to achieve better generalization than component-specific methods.
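For the Neighbourhood Transformer entry, a minimal sketch of neighbourhood-restricted self-attention, the mechanism its summary names: each node attends only to itself and its graph neighbours. The single-head formulation, dense adjacency matrix, and function names are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch (not the paper's code): single-head self-attention in
# which each node may attend only to itself and its graph neighbours.
import numpy as np

def neighbourhood_attention(X, A, Wq, Wk, Wv):
    """X: (n, d) node features; A: (n, n) binary adjacency matrix."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])
    allowed = (A + np.eye(A.shape[0])) > 0        # self-loops keep rows non-empty
    scores = np.where(allowed, scores, -np.inf)   # mask out non-neighbours
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ V                            # updated node states

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))
A = (rng.random((5, 5)) < 0.4).astype(float)
A = np.maximum(A, A.T)                            # symmetrise: undirected graph
W = [rng.normal(size=(8, 8)) * 0.1 for _ in range(3)]
print(neighbourhood_attention(X, A, *W).shape)    # (5, 8)
```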
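For Random-Set Graph Neural Networks, a heavily simplified sketch of a belief-function output head, assuming masses only on the individual classes plus one mass on the full label set; that ignorance mass doubles as an epistemic-uncertainty estimate. The paper's random sets are richer than this singletons-plus-ignorance reduction.

```python
# Simplified sketch: c singleton masses plus one 'ignorance' mass on the full
# label set; spreading the ignorance uniformly gives pignistic-style probs.
import numpy as np

def random_set_head(logits):
    """logits: (c + 1,) raw scores -- c classes plus one ignorance slot."""
    m = np.exp(logits - logits.max())
    m /= m.sum()                                  # masses sum to one
    class_mass, ignorance = m[:-1], m[-1]
    probs = class_mass + ignorance / class_mass.size
    return probs, ignorance                       # probabilities + uncertainty

probs, unc = random_set_head(np.array([2.0, 0.5, 0.1, 1.0]))
print(probs.round(3), f"epistemic uncertainty ~ {unc:.3f}")
```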
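For M2D, a sketch of the feature half of model-to-data distillation as the summary describes it: the teacher's soft predictions are appended to the raw node features, so the student learns from data that already encodes the teacher's behaviour. The structural half of the augmentation is omitted here.

```python
# Sketch: model-to-data distillation by feature augmentation. The student sees
# the teacher's soft predictions as ordinary input columns.
import numpy as np

def augment_with_teacher(X, teacher_probs):
    """X: (n, d) raw node features; teacher_probs: (n, c) teacher soft labels."""
    return np.concatenate([X, teacher_probs], axis=1)   # (n, d + c)

X = np.random.default_rng(0).normal(size=(4, 3))
teacher_probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.5, 0.5], [0.7, 0.3]])
print(augment_with_teacher(X, teacher_probs).shape)     # (4, 5)
```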
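For ADR, a sketch of the ridge-regression primitive that analytic, gradient-free updates of a linear transformation rest on: the weights are solved in closed form from embeddings and targets. How ADR merges such solutions layer-wise across increments is not reconstructed here.

```python
# Sketch: closed-form ridge solution W = (H^T H + lam*I)^{-1} H^T Y, the usual
# analytic-learning primitive; lam controls the ridge regularisation.
import numpy as np

def ridge_solve(H, Y, lam=1e-2):
    """H: (n, d) embeddings; Y: (n, c) one-hot targets; returns (d, c) weights."""
    d = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(d), H.T @ Y)

rng = np.random.default_rng(0)
H = rng.normal(size=(100, 16))
Y = np.eye(4)[rng.integers(0, 4, size=100)]       # one-hot labels
print(ridge_solve(H, Y).shape)                    # (16, 4) closed-form head
```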
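For D2MoE, a sketch of difficulty-driven top-p routing as the summary describes it: predictive entropy sets a per-node routing budget p, and the smallest set of experts whose gate mass reaches p is activated. The function names and the linear entropy-to-budget schedule are assumptions.

```python
# Sketch: entropy -> routing budget -> top-p expert selection.
import numpy as np

def routing_budget(pred_probs, p_min=0.3, p_max=0.9):
    """Harder nodes (higher predictive entropy) get a larger budget p."""
    ent = -(pred_probs * np.log(pred_probs + 1e-12)).sum()
    return p_min + (p_max - p_min) * ent / np.log(pred_probs.size)

def top_p_experts(gate_logits, p):
    """Pick the fewest experts whose gate probability mass reaches p."""
    g = np.exp(gate_logits - gate_logits.max())
    g /= g.sum()
    order = np.argsort(-g)
    k = int(np.searchsorted(np.cumsum(g[order]), p)) + 1
    chosen = order[:k]
    return chosen, g[chosen] / g[chosen].sum()    # experts + renormalised weights

p = routing_budget(np.array([0.4, 0.3, 0.2, 0.1]))      # fairly uncertain node
experts, weights = top_p_experts(np.array([1.2, 0.3, -0.5, 0.8]), p)
print(round(p, 3), experts, weights.round(3))
```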