Relational inductive biases, deep learning, and graph networks
29 Pith papers cite this work.
abstract
Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experiences--a hallmark of human intelligence from infancy--remains a formidable challenge for modern AI. The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between "hand-engineering" and "end-to-end" learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias--the graph network--which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.
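To make the abstract's "building block" concrete: a GN block applies a per-edge update, aggregates the updated edges at their receiver nodes for a per-node update, and finishes with a global update. Below is a minimal NumPy sketch of that scheme, following the paper's formalism; the random linear update functions, dimensions, and toy graph are illustrative assumptions, not the companion library's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(in_dim, out_dim):
    # Placeholder update function (random linear map + tanh),
    # standing in for the learned functions phi_e, phi_v, phi_u.
    W = rng.standard_normal((in_dim, out_dim)) * 0.1
    return lambda x: np.tanh(x @ W)

def gn_block(V, E, u, senders, receivers, phi_e, phi_v, phi_u):
    """One graph network block, using summation for all three
    aggregation functions (rho)."""
    n_e, n_v = len(E), len(V)
    # 1. Edge update: e'_k = phi_e(e_k, v_receiver, v_sender, u)
    E2 = phi_e(np.concatenate(
        [E, V[receivers], V[senders], np.tile(u, (n_e, 1))], axis=1))
    # 2. Sum incoming updated edges per node, then
    #    node update: v'_i = phi_v(agg_i, v_i, u)
    agg = np.zeros((n_v, E2.shape[1]))
    np.add.at(agg, receivers, E2)
    V2 = phi_v(np.concatenate([agg, V, np.tile(u, (n_v, 1))], axis=1))
    # 3. Global update: u' = phi_u(sum of e', sum of v', u)
    u2 = phi_u(np.concatenate([E2.sum(0), V2.sum(0), u]))
    return V2, E2, u2

# Toy graph: 3 nodes (4-dim), 2 directed edges 0->1 and 1->2 (3-dim),
# and one 2-dim global attribute.
V = rng.standard_normal((3, 4))
E = rng.standard_normal((2, 3))
u = rng.standard_normal(2)
senders, receivers = np.array([0, 1]), np.array([1, 2])
V, E, u = gn_block(V, E, u, senders, receivers,
                   phi(13, 3), phi(9, 4), phi(9, 2))
```

Stacking such blocks, with learned update functions in place of the placeholders, gives the relational, message-passing models the abstract argues for.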
representative citing papers
- Can Graphs Help Vision SSMs See Better?
GraphScan replaces geometric or coordinate-based scanning in Vision SSMs with learned local semantic graph routing, yielding SOTA results among such models on classification and segmentation tasks.
- Accelerating 3D Non-LTE Synthesis with Graph Neural Networks
Graph neural networks can approximate full 3D non-LTE Ca II populations in solar models with correlations above 0.99 and extreme computational efficiency.
- Reentrant value fields as delayed coupled reaction-diffusion systems on finite graphs
Establishes well-posedness, compact global attractors, and delay-independent global stability for retarded functional differential equations modeling reentrant value fields as coupled reaction-diffusion systems on finite graphs.
- Graph World Models: Concepts, Taxonomy, and Future Directions
The paper unifies emerging graph-based world models under a new paradigm and proposes a taxonomy organized by spatial, physical, and logical relational inductive biases.
- PiGGO: Physics-Guided Learnable Graph Kalman Filters for Virtual Sensing of Nonlinear Dynamic Structures under Uncertainty
PiGGO integrates a learned graph neural ODE as the continuous-time dynamics model within an extended Kalman filter to enable online virtual sensing and uncertainty-aware state estimation for nonlinear dynamic systems with unknown model form and sparse sensing.
- One Scale at a Time: Scale-Autoregressive Modeling for Fluid Flow Distributions
Scale-autoregressive modeling (SAR) samples fluid flow distributions hierarchically from coarse to fine resolutions on meshes, achieving lower distributional error and 2-7x faster runtime than diffusion or flow-matching baselines.
- Equivariant Multi-agent Reinforcement Learning for Multimodal Vehicle-to-Infrastructure Systems
A self-supervised multimodal alignment step plus equivariant GNN-based MARL more than doubles sensing accuracy and yields 50% performance gains in decentralized V2I rate maximization.
- Fast Graph Representation Learning with PyTorch Geometric
PyTorch Geometric is a PyTorch library that delivers fast graph neural network training through sparse GPU kernels and variable-size mini-batching (a minimal usage sketch appears after this list).
- SACHI: Structured Agent Coordination via Holistic Information Integration in Multi-Agent Reinforcement Learning
SACHI uses graph transformer convolutions on inter-agent coordination graphs to enrich partial-observation agents with content-dependent teammate information, yielding statistically significant gains over baselines in five cooperative tasks.
- LINC: Decoupling Local Consequence Scoring from Hidden Matching in Constructive Neural Routing
LINC decouples local consequence scoring from hidden matching in constructive neural routing solvers, cutting CVRPTW gaps for PolyNet from 13.83%/38.15% to 7.26%/14.71% on Solomon/Homberger benchmarks.
- Sheet as Token: A Graph-Enhanced Representation for Multi-Sheet Spreadsheet Understanding
Sheet as Token represents each worksheet as a single dense token and uses a multi-channel graph retriever to improve retrieval of supporting sheets in multi-sheet workbooks.
- Deep Wave Network for Modeling Multi-Scale Physical Dynamics
DW-Net improves the accuracy versus computational cost Pareto front over standard U-Nets for 2D and 3D multi-scale flow benchmarks by stacking multiple waves while keeping training settings identical.
- Learning to Theorize the World from Observation
NEO induces compositional latent programs as world theories from observations and executes them to enable explanation-driven generalization.
- Mesh Field Theory: Port-Hamiltonian Formulation of Mesh-Based Physics
Mesh Field Theory reduces mesh-based physics to port-Hamiltonian form, with topology fixing the interconnections and metrics entering only through constitutive relations, enabling MeshFT-Net to achieve near-zero energy drift, correct dispersion, momentum conservation, and strong out-of-distribution fidelity.
- Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework
ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.
- Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs
A unified homogeneous graph framework with feature homogenization enables linear-complexity RL policies for job shop scheduling that generalize zero-shot via structural saturation at balanced job-machine ratios.
- TransXion: A High-Fidelity Graph Benchmark for Realistic Anti-Money Laundering
TransXion supplies a 3-million-transaction graph benchmark with profile-aware normal activity and stochastic illicit subgraphs that produces lower detection scores than prior AML datasets.
- Cluster Attention for Graph Machine Learning
Cluster attention uses off-the-shelf community detection to define attention scopes within graph clusters, augmenting MPNNs and Graph Transformers to achieve larger receptive fields with preserved structural inductive biases and improved performance on diverse graph datasets (a generic sketch of the masking idea appears after this list).
- The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
LLMs discover latent planning strategies up to five steps during training and execute them up to eight steps at test time, with larger models reaching seven under few-shot prompting, revealing a dissociation between discovery and execution.
- Validating Computational Markers of Depressive Behavior: Cross-Linguistic Speech-Based Depression Detection with Neurophysiological Validation
The CDMA speech depression model generalizes across languages, favors emotional speech, and aligns with EEG markers of emotional dysregulation.
- Metriplector: From Field Theory to Neural Architecture
Metriplector treats neural computation as coupled metriplectic field dynamics whose stress-energy tensor readout achieves competitive results on vision, control, Sudoku, language modeling, and pathfinding with small parameter counts.
- Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Geometric deep learning provides a unified mathematical framework based on grids, groups, graphs, geodesics, and gauges to explain and extend neural network architectures by incorporating physical regularities.
- Attention-based graph neural networks: a survey
The survey groups attention-based GNNs into three stages—graph recurrent attention networks, graph attention networks, and graph transformers—while reviewing architectures and future directions.
- Mesh Based Simulations with Spatial and Temporal awareness
A unified training framework for mesh-based ML surrogates in CFD improves accuracy and long-horizon stability by enforcing spatial derivative consistency via multi-node prediction, using temporal cross-attention correction, and adding 3D rotary positional embeddings.
- Inductive Subgraphs as Shortcuts: Causal Disentanglement for Heterophilic Graph Learning
Inductive subgraphs serve as shortcuts in heterophilic graphs, and CD-GNN disentangles spurious from causal subgraphs by blocking non-causal paths to improve robustness and accuracy.
- Extracting Money Laundering Transactions from Quasi-Temporal Graph Representation
ExSTraQt uses quasi-temporal graph representations and supervised learning to detect suspicious transactions, achieving F1 score uplifts of up to 1% on real data and over 8% on synthetic datasets compared to prior AML models.
- Spatiotemporal Convolutions on EEG signal -- A Representation Learning Perspective on Efficient and Explainable EEG Classification with Convolutional Neural Nets
2D spatiotemporal convolutions reduce training time on high-dimensional EEG data while maintaining performance and creating distinct representational geometries compared with concatenated 1D convolutions.
- Middle-mile logistics through the lens of goal-conditioned reinforcement learning
Middle-mile logistics is cast as a multi-object goal-conditioned MDP and solved by combining graph neural networks with model-free RL via extraction of small feature graphs.
- Toward Generalizable Graph Learning for 3D Engineering AI: Explainable Workflows for CAE Mode Shape Classification and CFD Field Prediction
A graph learning framework turns heterogeneous 3D engineering data into physics-aware graphs processed by GNNs for CAE mode classification and CFD field prediction in automotive applications.
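Since PyTorch Geometric appears in the list above, here is the minimal usage sketch flagged at that entry. Data, GCNConv, and the batching DataLoader are the library's real interface (the loader import path assumes a recent PyG release); the graph, feature sizes, and layer width are arbitrary illustrations.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.loader import DataLoader
from torch_geometric.nn import GCNConv

# A tiny 3-node graph; edge_index stacks [senders; receivers].
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]], dtype=torch.long)
data = Data(x=torch.randn(3, 8), edge_index=edge_index)

# One graph convolution over the sparse edge structure.
conv = GCNConv(in_channels=8, out_channels=16)
out = conv(data.x, data.edge_index)  # -> [3, 16] node embeddings

# Variable-size mini-batching: graphs are concatenated into one
# large disconnected graph, so no padding is required.
loader = DataLoader([data, data], batch_size=2)
batch = next(iter(loader))
print(out.shape, batch.num_graphs, batch.x.shape)  # [3, 16], 2, [6, 8]
```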
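And for the Cluster Attention entry, a generic sketch of the masking idea it describes, not the paper's implementation: assumptions here are networkx's greedy modularity routine as the off-the-shelf community detector and a boolean mask as the attention-scope mechanism.

```python
import numpy as np
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities

def cluster_attention_mask(G):
    """Boolean mask that permits attention only within communities."""
    n = G.number_of_nodes()
    mask = np.zeros((n, n), dtype=bool)
    for com in greedy_modularity_communities(G):
        idx = np.fromiter(com, dtype=int)
        mask[np.ix_(idx, idx)] = True  # allow every intra-community pair
    return mask

G = nx.karate_club_graph()  # nodes labeled 0..n-1
mask = cluster_attention_mask(G)
# In an attention layer, scores where mask is False would be set to
# -inf before the softmax, restricting each node's attention scope
# to its own cluster.
```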