Relational inductive biases, deep learning, and graph networks
Mixed citation behavior. Most common role is background (56%).
abstract
Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experiences--a hallmark of human intelligence from infancy--remains a formidable challenge for modern AI. The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between "hand-engineering" and "end-to-end" learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias--the graph network--which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.
citing papers explorer
-
Dark Matter in Draco and Boötes I: Hints of a Core in an Ultra-Faint Dwarf from Simulation-Based Inference
GraphNPE recovers a significantly lower central density for Boötes I consistent with a core while Draco remains marginally cuspy, and demonstrates that higher-order velocity moments reduce bias in dynamical modeling.
-
Toward Calibrated, Fair, and accurate Deepfake Detection
Face-Feature Tuning is a label-free logit remapping method that reduces FPR/TPR gaps across groups in deepfake detection while preserving overall accuracy.
-
Learning Dynamic Stability Landscapes in Synchronization Networks
Introduces graph-to-image prediction of per-node dynamic stability landscapes in oscillator networks from topology, releases two 10k-graph datasets, and shows GNN-CNN models achieve good accuracy with cross-size generalization.
-
A mathematical theory of balancing relational generalization and memorization
Introduces transitive inference with exceptions task and analytically shows kernel ridge regression balances relational generalization and memorization depending on representational geometry, with validation in finetuned language models.
-
Can Graphs Help Vision SSMs See Better?
GraphScan replaces geometric or coordinate-based scanning in Vision SSMs with learned local semantic graph routing, yielding SOTA results among such models on classification and segmentation tasks.
-
Accelerating 3D Non-LTE Synthesis with Graph Neural Networks
Graph neural networks can approximate full 3D non-LTE Ca II populations in solar models with correlations above 0.99 and extreme computational efficiency.
-
Graph World Models: Concepts, Taxonomy, and Future Directions
The paper unifies emerging graph-based world models under a new paradigm and proposes a taxonomy organized by spatial, physical, and logical relational inductive biases.
-
PiGGO: Physics-Guided Learnable Graph Kalman Filters for Virtual Sensing of Nonlinear Dynamic Structures under Uncertainty
PiGGO integrates a learned graph neural ODE as the continuous-time dynamics model within an extended Kalman filter to enable online virtual sensing and uncertainty-aware state estimation for nonlinear dynamic systems with unknown model form and sparse sensing.
-
One Scale at a Time: Scale-Autoregressive Modeling for Fluid Flow Distributions
Scale-autoregressive modeling (SAR) samples fluid flow distributions hierarchically from coarse to fine resolutions on meshes, achieving lower distributional error and 2-7x faster runtime than diffusion or flow-matching baselines.
-
Equivariant Multi-agent Reinforcement Learning for Multimodal Vehicle-to-Infrastructure Systems
A self-supervised multimodal alignment step plus equivariant GNN-based MARL yields over twofold sensing accuracy and 50% performance gains in decentralized V2I rate maximization.
-
Fast Wasserstein rates for estimating probability distributions of probabilistic graphical models
Smoothness assumptions on graphical model kernels produce Wasserstein estimation rates determined by local graph structure rather than ambient dimension.
-
Causal Process Models: Reframing Dynamic Causal Graph Discovery as a Reinforcement Learning Problem
Causal Process Models reframe dynamic causal graph discovery as multi-agent reinforcement learning to build sparse time-varying graphs only at active interactions, outperforming dense baselines on physical prediction.
-
Relational reasoning and inductive bias in transformers and large language models
In-weights learning induces linear embeddings enabling transitive inference in transformers, whereas in-context learning defaults to match-and-copy unless pre-trained on linear tasks or prompted with linear mental maps.
-
Unsupervised Learning of Local Updates for Maximum Independent Set in Dynamic Graphs
Unsupervised GNN model learns local updates for approximate MaxIS on dynamic graphs, achieving competitive ratios on 200-1000 node instances and 1.00-1.18x larger solutions than other unsupervised models when generalizing to 100x larger graphs.
-
Temporal Graph Networks for Deep Learning on Dynamic Graphs
Temporal Graph Networks combine memory modules and graph operators to learn on dynamic graphs as timed event sequences, outperforming prior methods on transductive and inductive tasks while unifying earlier models as special cases.
-
Neural Operator: Graph Kernel Network for Partial Differential Equations
Graph Kernel Networks learn PDE solution operators that generalize across discretization methods and grid resolutions using graph-based kernel integration.
-
Language Models as Knowledge Bases?
BERT stores relational knowledge extractable via cloze queries without fine-tuning and matches supervised baselines on open-domain QA tasks.
-
Placeto: Learning Generalizable Device Placement Algorithms for Distributed Machine Learning
Placeto learns generalizable RL policies for device placement via iterative improvements and graph embeddings, needing up to 6.1x fewer steps than prior methods and applying to unseen graphs without retraining.
-
Fast Graph Representation Learning with PyTorch Geometric
PyTorch Geometric is a PyTorch library that delivers fast graph neural network training through sparse GPU kernels and variable-size mini-batching.
-
Learning Altruistic Collaboration in Heterogeneous Multi-Team Systems
A graph neural network learns to approximate altruistic robot transfers across heterogeneous teams using Hamilton's rule, achieving near-optimal allocation in simulated firefighting scenarios.
-
GOAL: Graph-based Objective-Aligned Diffusion Solvers for Dynamic Multi-Objective Optimization
GOAL uses conditioned diffusion on relational graphs with typed edges to produce feasible multi-objective solutions for scheduling problems, reporting 100% feasibility and sub-0.2% MAPE on FSP, JSP, and FJSP up to 20 jobs.
-
Ada-Diffuser: Latent-Aware Adaptive Diffusion for Decision-Making
Ada-Diffuser is a causal diffusion model that jointly learns observed interaction structure and underlying latent dynamics from minimal observations for adaptive planning and policy learning.
-
Neural Point-Forms
Neural point-forms are introduced as permutation-invariant neural layers that output learned form-comparison matrices for point clouds, with a claimed consistency proof under sampling and manifold assumptions and competitive results on synthetic and biological data.
-
SACHI: Structured Agent Coordination via Holistic Information Integration in Multi-Agent Reinforcement Learning
SACHI enriches agent representations via graph transformer convolutions over inter-agent graphs to enable holistic information integration, outperforming baselines across five cooperative tasks with statistical significance.
-
LINC: Decoupling Local Consequence Scoring from Hidden Matching in Constructive Neural Routing
LINC decouples local consequence scoring from hidden matching in constructive neural routing solvers, cutting CVRPTW gaps for PolyNet from 13.83%/38.15% to 7.26%/14.71% on Solomon/Homberger benchmarks.
-
Sheet as Token: A Graph-Enhanced Representation for Multi-Sheet Spreadsheet Understanding
Sheet as Token represents each worksheet as a single dense token and uses a multi-channel graph retriever to improve retrieval of supporting sheets in multi-sheet workbooks.
-
Deep Wave Network for Modeling Multi-Scale Physical Dynamics
DW-Net improves the accuracy versus computational cost Pareto front over standard U-Nets for 2D and 3D multi-scale flow benchmarks by stacking multiple waves while keeping training settings identical.
-
Exploring the Potential of Probabilistic Transformer for Time Series Modeling: A Report on the ST-PT Framework
ST-PT turns transformers into explicit factor graphs for time series, enabling structural injection of symbolic priors, per-sample conditional generation, and principled latent autoregressive forecasting via MFVI iterations.
-
Scalable Production Scheduling: Linear Complexity via Unified Homogeneous Graphs
A unified homogeneous graph framework with feature homogenization enables linear-complexity RL policies for job shop scheduling that generalize zero-shot via structural saturation at balanced job-machine ratios.
-
Cluster Attention for Graph Machine Learning
Cluster attention uses off-the-shelf community detection to define attention scopes within graph clusters, augmenting MPNNs and Graph Transformers to achieve larger receptive fields with preserved structural inductive biases and improved performance on diverse graph datasets.
-
The Depth Ceiling: On the Limits of Large Language Models in Discovering Latent Planning
LLMs discover latent planning strategies up to five steps during training and execute them up to eight steps at test time, with larger models reaching seven under few-shot prompting, revealing a dissociation between discovery and execution.
-
Validating Computational Markers of Depressive Behavior: Cross-Linguistic Speech-Based Depression Detection with Neurophysiological Validation
The CDMA speech depression model generalizes across languages, favors emotional speech, and aligns with EEG markers of emotional dysregulation.
-
Metriplector: From Field Theory to Neural Architecture
Metriplector treats neural computation as coupled metriplectic field dynamics whose stress-energy tensor readout achieves competitive results on vision, control, Sudoku, language modeling, and pathfinding with small parameter counts.
-
Disentangled Latent Dynamics Manifold Fusion for Solving Parameterized PDEs
DLDMF disentangles latent dynamics for parameterized PDEs by feeding parameters into a latent embedding that initializes a parameter-conditioned Neural ODE, then uses dynamic manifold fusion with a shared decoder to reconstruct spatiotemporal fields for better generalization and extrapolation.
-
HiMAC: Hierarchical Macro-Micro Learning for Long-Horizon LLM Agents
HiMAC decomposes LLM agent tasks into macro planning and micro execution using critic-free hierarchical RL and iterative co-evolution, outperforming baselines on ALFWorld, WebShop, and Sokoban.
-
Overstating Attitudes, Ignoring Networks: LLM Biases in Simulating Misinformation Susceptibility
LLM simulations of misinformation susceptibility overstate attitudinal associations and largely ignore personal network characteristics compared to human survey data.
-
Holos: A Web-Scale LLM-Based Multi-Agent System for the Agentic Web
Holos is a five-layer LLM-based multi-agent system architecture using the Nuwa engine for agent generation, a market-driven Orchestrator for coordination, and an endogenous value cycle for incentive-compatible persistence in the Agentic Web.
-
Parameter-Efficient Conditioning for Material Generalization in Graph-Based Simulators
FiLM conditioning targeted at early message-passing layers lets pretrained GNS models generalize to new material properties using only 12 trajectories, a 5-fold data reduction versus multi-task baselines.
-
Graph-Based Alternatives to LLMs for Human Simulation
GEMS formulates close-ended human-behavior simulation as link prediction on a heterogeneous graph and matches or exceeds LLM performance with three orders of magnitude fewer parameters across three datasets and three evaluation settings.
-
Learning to accelerate distributed ADMM using graph neural networks
A GNN is trained to predict adaptive step sizes and weights for distributed ADMM by unrolling a fixed number of iterations and minimizing solution error on a problem class.
-
Pretrained Event Classification Model for High Energy Physics Analysis
A GNN pretrained on 120M simulated HEP events generalizes to unseen processes and ATLAS data; fine-tuning boosts accuracy especially with small datasets, with CKA showing preserved encoders but altered intermediate layers.
-
DragNUWA: Fine-grained Control in Video Generation by Integrating Text, Image, and Trajectory
DragNUWA integrates text, image, and trajectory controls into a diffusion video model using a Trajectory Sampler, Multiscale Fusion, and Adaptive Training to enable fine-grained open-domain video generation.
-
Geometric Deep Learning: Grids, Groups, Graphs, Geodesics, and Gauges
Geometric deep learning provides a unified mathematical framework based on grids, groups, graphs, geodesics, and gauges to explain and extend neural network architectures by incorporating physical regularities.
-
Deep Graph Library: A Graph-Centric, Highly-Performant Package for Graph Neural Networks
DGL is a graph-centric library that optimizes GNNs via generalized sparse tensor operations, transparent graph-based optimizations, and framework-neutral design, claiming superior speed and memory use over other GNN frameworks.
-
Representation Learning for Classical Planning from Partially Observed Traces
LP-GNN learns vectorized planning domain models via GNNs from partial traces and outperforms the ARMS learner on solving problems across five classical domains.
-
Graph Neural Based End-to-end Data Association Framework for Online Multiple-Object Tracking
A graph neural network framework learns affinities from appearance and motion then solves bipartite matching for online multiple-object tracking.
-
Graph-based Knowledge Distillation by Multi-head Attention Network
Multi-head attention constructs a graph of dataset relations from the teacher embedding procedure and transfers it to the student via multi-task learning, yielding 7.05% higher CIFAR-100 accuracy than the student alone and 2.46% above prior SOTA.
-
Representing Research Attention as Contextually Structured Flows
Attention flows are introduced as representations encoding the organisation and temporal evolution of research attention, outperforming signal and sequence representations on a benchmark of analogy-style structural comparison.
-
WaveGraphNet: Physics-Consistent Guided-Wave Damage Localization through Coupled Inverse-Forward Graph Learning
WaveGraphNet is a graph-based coupled inverse-forward model that localizes damage in CFRP plates from sparse guided-wave measurements with improved extrapolation to unseen locations.
-
Physics-Informed Graph Neural Network Surrogates for Turbulent Nanoparticle Dispersion in Dental Clinical Environments
ELGIN is a graph-based physics-informed surrogate model that predicts carrier flow and polydisperse particle motion in dental aerosol scenarios, achieving lower tracking errors and 37x speedup versus full OpenFOAM CFD in a preliminary single-case test.