Position: Graph Learning Will Lose Relevance Due To Poor Benchmarks

Antoine Siraudin; Ben Finkelshtein; Bryan Perozzi; Christopher Morris; Fabrizio Frasca; Jan Tönshoff; Luis Müller; Mathias Niepert; Maya Bechler-Speicher; Michael M. Bronstein

arxiv: 2502.14546 · v1 · pith:BGYPDLB7new · submitted 2025-02-20 · 💻 cs.LG · cs.AI · cs.NE

Maya Bechler-Speicher, Ben Finkelshtein, Fabrizio Frasca, Luis Müller, Jan Tönshoff, Antoine Siraudin, Viktor Zaverkin, Michael M. Bronstein, Mathias Niepert, Bryan Perozzi, Mikhail Galkin, Christopher Morris

classification 💻 cs.LG cs.AI cs.NE

keywords graph · learning · benchmarking · benchmarks · design · focus · further · graphs

While machine learning on graphs has demonstrated promise in drug design and molecular property prediction, significant benchmarking challenges hinder its further progress and relevance. Current benchmarking practices often lack focus on transformative, real-world applications, favoring narrow domains like two-dimensional molecular graphs over broader, impactful areas such as combinatorial optimization, relational databases, or chip design. Additionally, many benchmark datasets poorly represent the underlying data, leading to inadequate abstractions and misaligned use cases. Fragmented evaluations and an excessive focus on accuracy further exacerbate these issues, incentivizing overfitting rather than fostering generalizable insights. These limitations have prevented the development of truly useful graph foundation models. This position paper calls for a paradigm shift toward more meaningful benchmarks, rigorous evaluation protocols, and stronger collaboration with domain experts to drive impactful and reliable advances in graph learning research, unlocking the potential of graph learning.

Forward citations

Cited by 4 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Deep Neural Sheaf Diffusion
cs.LG 2026-05 unverdicted novelty 6.0

DNSD replaces the sheaf Laplacian with a sheaf adjacency operator to maintain informative signals in deep layers, outperforming GNN and NSD baselines on long-range synthetic and real graph tasks.

When Structure Doesn't Help: LLMs Do Not Read Text-Attributed Graphs as Effectively as We Expected
cs.LG 2025-11 conditional novelty 6.0

LLMs achieve strong results on text-attributed graphs using only node textual descriptions, while most methods for encoding graph structure deliver marginal or negative gains.

Deep Neural Sheaf Diffusion
cs.LG 2026-05 unverdicted novelty 5.0

DNSD replaces the sheaf Laplacian with a sheaf adjacency operator, adds normalization and gating, and empirically outperforms GNN and NSD baselines by up to 30 percentage points on synthetic long-range graph tasks whi...

Artificial Intelligence for Food Innovation
cs.CE 2025-09 unverdicted novelty 3.0

A review paper that surveys AI uses across the food innovation pipeline for sustainable proteins and identifies four strategic priorities for the emerging field.