Recognition: 1 theorem link
Have Graph -- Will Lift? The Case for Higher-Order Benchmarks
Pith reviewed 2026-05-11 02:00 UTC · model grok-4.3
The pith
Higher-order models in machine learning need native benchmark datasets rather than lifted graph data.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Message passing on graphs or higher-order complexes drives geometric deep learning and has incorporated abstract ideas like sheaves as inductive biases, but the diversity of models stands in contrast to the scarcity of benchmark datasets. Researchers therefore lift graph datasets to include higher-order information, and the paper calls for also sourcing new datasets to build firmer foundations for the research field.
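For context on the message-passing machinery the claim refers to, one round of neighborhood aggregation can be sketched in a few lines. This is a deliberately minimal toy (identity update plus sum aggregation over neighbors), not any particular published architecture:

```python
def message_passing_round(features, adjacency):
    """One round of sum-aggregation message passing.

    features: dict mapping node -> scalar feature
    adjacency: dict mapping node -> list of neighbor nodes
    Returns a new dict of updated features.
    """
    new_features = {}
    for node, own in features.items():
        # Aggregate messages from neighbors (here: plain sum).
        incoming = sum(features[nbr] for nbr in adjacency[node])
        # Update: keep own feature and add the aggregated messages.
        new_features[node] = own + incoming
    return new_features

# Path graph 0 - 1 - 2 with scalar features.
features = {0: 1.0, 1: 2.0, 2: 3.0}
adjacency = {0: [1], 1: [0, 2], 2: [1]}
updated = message_passing_round(features, adjacency)
# updated == {0: 3.0, 1: 6.0, 2: 5.0}
```

Higher-order variants replace the neighbor sets with incidences of simplices or cells, which is exactly where the choice of benchmark data starts to matter.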
What carries the argument
The practice of lifting existing graph datasets to incorporate higher-order structures such as simplicial complexes or hyperedges, which currently serves as a substitute for dedicated benchmarks.
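The most common such lifting is the clique complex: every (k+1)-clique of the graph is promoted to a k-simplex. A minimal self-contained sketch of this construction (hypothetical helper `clique_lift`, written here for illustration; libraries such as TopoX provide production implementations):

```python
from itertools import combinations

def clique_lift(nodes, edges, max_dim=2):
    """Lift a graph to the simplices of its clique complex up to max_dim.

    Returns a dict mapping dimension -> list of simplices (sorted tuples).
    A k-simplex is any (k+1)-clique of the graph.
    """
    edge_set = {frozenset(e) for e in edges}
    simplices = {
        0: [(v,) for v in sorted(nodes)],
        1: sorted(tuple(sorted(e)) for e in edge_set),
    }
    for dim in range(2, max_dim + 1):
        # A candidate (dim+1)-tuple is a simplex iff every pair is an edge.
        simplices[dim] = [
            cand for cand in combinations(sorted(nodes), dim + 1)
            if all(frozenset(pair) in edge_set
                   for pair in combinations(cand, 2))
        ]
    return simplices

# Toy graph: a triangle {0, 1, 2} plus a pendant edge (2, 3).
complex_ = clique_lift([0, 1, 2, 3], [(0, 1), (1, 2), (0, 2), (2, 3)])
# complex_[2] == [(0, 1, 2)]: the triangle is the only 2-simplex.
```

Note the paper's point in miniature: every 2-simplex here is fully determined by the pairwise edges, so the lifted data carries no genuinely new multi-way signal; a native higher-order dataset could contain a filled triangle without all of its edges being semantically meaningful pairs.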
If this is right
- Model comparisons would more accurately reflect the strengths of higher-order architectures without artifacts from data transformation.
- Inductive biases based on sheaves or other topological constructs could be tested in settings that match their intended structure.
- Evaluation protocols would shift away from graph-centric assumptions toward structures that capture multi-way relations directly.
- The field could develop datasets drawn from domains where higher-order interactions occur naturally, such as molecular or social systems.
Where Pith is reading between the lines
- Interdisciplinary efforts with domain experts in biology or chemistry might yield datasets that expose gaps in current lifting methods.
- Widespread adoption of native benchmarks could prompt re-examination of whether existing models truly leverage higher-order information or merely gain from extra features.
- This shift might parallel how vision benchmarks drove architecture innovation, potentially accelerating practical use of topological models.
Load-bearing premise
New datasets with native higher-order information can be feasibly sourced and will yield better model evaluations and inductive bias insights than lifted graph data.
What would settle it
A controlled comparison where models trained and evaluated on lifted graph datasets match or exceed the performance, generalization, and bias insights obtained from any new native higher-order datasets would undermine the call for sourcing new data.
Figures
Original abstract
After a somewhat rocky start, geometry and topology have established a foothold in machine learning. Message passing, either on graphs or higher-order complexes, is one of the main drivers of geometric deep learning, and paradigms that were once considered to be firmly in the realm of the abstract, like sheaves, have been "tamed" to serve as novel inductive biases for model architectures in topological deep learning. The veritable diversity of models, however, is in stark contrast to the scarcity of suitable benchmark datasets. As a result, researchers often resort to lifting existing graph datasets to include higher-order information. In this opinion paper, I want to encourage the community to also source new datasets, which may be used to prop up the foundations of our research field.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a short opinion paper observing that geometric and topological deep learning has produced a wide variety of message-passing models on graphs and higher-order complexes, yet suitable benchmark datasets remain scarce. It notes that researchers therefore frequently lift existing graph datasets to incorporate higher-order structure and argues that the community should also prioritize sourcing new, native higher-order datasets to strengthen the empirical foundations of the field.
Significance. The opinion identifies a genuine mismatch between model diversity and data resources in an emerging area. If the recommendation is followed, the field could obtain benchmarks that avoid potential lifting artifacts and support more reliable evaluation of inductive biases; the piece is therefore potentially useful as a community call to action even though it advances no quantitative claims.
Simulated Author's Rebuttal
We thank the referee for their positive review and for recommending acceptance. We appreciate the recognition that the opinion piece highlights a genuine mismatch between the diversity of higher-order models and the scarcity of native benchmark datasets, and we agree that following the recommendation could help avoid lifting artifacts in future evaluations.
Circularity Check
No significant circularity: opinion piece with no derivations or fitted quantities
Full rationale
The paper is a short normative opinion piece whose central claim is a recommendation to source native higher-order datasets instead of lifting graphs. It contains no equations, no parameters, no theorems, no experiments, and no load-bearing derivations of any kind. The text identifies a scarcity of benchmarks as motivation but advances no testable quantitative claim that could reduce to a fit or self-citation. No self-citations are used to justify uniqueness or ansatzes, and the argument is self-contained as discursive advocacy rather than a derivation chain.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AlexanderDuality.lean : alexander_duality_circle_linking (tag: unclear)
The relation between the paper passage and the cited Recognition theorem is unclear.
Paper passage: "researchers often resort to lifting existing graph datasets to include higher-order information... encourage the community to also source new datasets"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.