Graph Attention Networks
Canonical reference. 70% of citing Pith papers cite this work as background.
abstract
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
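The masked self-attention described in the abstract can be illustrated with a small single-head layer in plain Python. This is a minimal sketch, not the authors' implementation: the function name `gat_layer`, the argument layout, the toy graph, and the LeakyReLU slope of 0.2 are all illustrative assumptions. Each node scores only its own neighbors (the "mask"), normalizes the scores with a softmax, and takes the weighted sum of the transformed neighbor features. The final nonlinearity is omitted for brevity.

```python
import math


def leaky_relu(x, slope=0.2):
    # Slope 0.2 is an assumption chosen for illustration.
    return x if x > 0 else slope * x


def gat_layer(h, neighbors, W, a):
    """Single-head graph attention layer (illustrative sketch).

    h         : list of N node feature vectors, each of length F
    neighbors : dict mapping node index -> list of neighbor indices
                (include the node itself to model a self-loop)
    W         : shared weight matrix, F_out rows of length F
    a         : attention vector of length 2 * F_out
    Returns (out, attn): new node features and per-node attention weights.
    """
    # Shared linear transform applied to every node: z_i = W h_i
    z = [[sum(w * x for w, x in zip(row, hi)) for row in W] for hi in h]

    out, attn = [], []
    for i in range(len(h)):
        nbrs = neighbors[i]
        # Masked attention: scores are computed only over node i's neighbors.
        scores = [
            leaky_relu(sum(ak * v for ak, v in zip(a, z[i] + z[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        m = max(scores)
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        alpha = [e / total for e in exps]
        attn.append(alpha)
        # Weighted sum of transformed neighbor features.
        out.append([
            sum(al * z[j][k] for al, j in zip(alpha, nbrs))
            for k in range(len(z[i]))
        ])
    return out, attn


# Hypothetical toy graph: three nodes on a path, with self-loops.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[1.0, 0.0], [0.0, 1.0]]      # identity transform, for readability
a = [0.5, -0.5, 1.0, 0.0]         # arbitrary attention parameters

out, attn = gat_layer(h, neighbors, W, a)
```

Because the layer needs only each node's neighbor list and shared parameters `W` and `a`, it involves no matrix inversion and can be applied to a graph that was not seen during training, which is the property the abstract highlights for inductive problems.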
citing papers explorer
-
A document is worth a structured record: Principled inductive bias design for document recognition
Introduces a method to design structure-specific relational inductive biases for a base transformer architecture, enabling end-to-end transcription of documents with intrinsic structures, demonstrated on sheet music, shape drawings, and mechanical engineering drawings.
-
PromptGNN-sim: Deep Fusion and Alignment of GNN and LLMs for Text-Attributed Graph Learning
PromptGNN-sim uses GAT-based semantically aware neighborhood selection and structure-aware LLM prompts with bi-directional contrastive alignment to outperform prior GNN, LLM, and fusion methods on text-attributed graph datasets.
-
Redefining Maritime Anomaly Detection via Equation-Grounded Synthetic Anomalies
Proposes equation-grounded taxonomy (unexpected AIS activity, route deviation, close approach) and LLM-guided synthesis pipeline to generate timestamp-labeled anomalies for evaluating maritime detection models.
-
Learning to Adaptively Allocate Gaussians for Arbitrary-Scale Image Super-Resolution
QuADA-GS learns to predict local complexity-driven Gaussian densification from low-resolution inputs and uses Hierarchical Pointer Convolution for efficient arbitrary-scale super-resolution.
-
Dark Matter in Draco and Boötes I: Hints of a Core in an Ultra-Faint Dwarf from Simulation-Based Inference
GraphNPE recovers a significantly lower central density for Boötes I consistent with a core while Draco remains marginally cuspy, and demonstrates that higher-order velocity moments reduce bias in dynamical modeling.
-
Beyond Convolution: Advancing Hypergraph Neural Networks with Hypergraph U-Nets
Introduces Hypergraph U-Nets with PHPool and PHUnpool operators derived from hierarchical clustering dendrograms for hypergraph reconstruction, classification, and anomaly detection.
-
Agentic multi-fidelity learning of quasiparticle and excitonic properties
An agentic multi-fidelity learning method corrects numerical artifacts in GW-BSE excited-state calculations for 2D bilayers and improves quasiparticle gaps and exciton binding energies.
-
TRAPS: Therapeutic Response Analysis via Pathway-informed Stratification
Benchmark of BINN, GraphPath, and PATH on 2622 TCGA patients shows PATH best for targeted therapy, BINN for survival, none useful for radiation, with GraphPath at 0.92 AUROC on prostate targeted therapy.
-
EpiFormer: Learning Antigen-Antibody Interactions for Epitope Prediction via Geometric Deep Learning
EpiFormer improves epitope prediction F1 score by over 40% via early-fusion cross-attention in GNN layers and sparsity-aware objectives, while recovering known biology as emergent behavior.
-
Forecasting Conceptual Diffusion in Science: The Case of Quantum Computing
LightGBM models on citation and diversity features predict exogenous diffusion of quantum computing concepts with R² up to 0.78 while endogenous reinforcement remains largely unpredictable after growth controls, with replications in other fields.
-
AbstainGNN: Teaching Graph Neural Networks to Abstain for Graph Classification
AbstainGNN is a framework that jointly models prediction and abstention in GNNs for graph classification, using a PAC-Bayesian-derived unified objective and two-stage training to achieve better accuracy at given rejection rates than prior abstention methods.
-
Contrast to Detect: Dynamic Graph Contrastive Regularization for Unsupervised Anomaly Detection in Multivariate Time Series
ContrastAD achieves highest mean F1 on all five MTS benchmarks and highest AUC on three by building DTW-based sparse graph snapshots and contrasting divergent pairs with a stable anchor instead of enforcing invariance.
-
Gaussian Sheaf Neural Networks
Gaussian Sheaf Neural Networks derive a sheaf Laplacian for Gaussian node features on graphs to preserve their geometric structure during message passing.
-
NeighborDiv: Training-free Zero-shot Generalist Graph Anomaly Detection via Neighbor Diversity
NeighborDiv detects graph anomalies via variance of inter-neighbor feature similarities under a new Neighbor-to-Neighbor Diversity Paradigm, achieving SOTA results with zero volatility in zero-shot cross-domain settings.
-
Learning over Positive and Negative Edges with Contrastive Message Passing
Contrastive Message Passing lets GNNs apply similarity-preserving transforms to positive edges and dissimilarity-inducing transforms to negative edges via soft positive semidefinite constraints on weights, yielding gains in low-label high-homophily regimes.
-
GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?
GraphIP-Bench is a new unified benchmark showing GNN model extraction succeeds at moderate query budgets while most defenses fail to prevent it or retain verification signals on surrogates.
-
TopoU-Net: a U-Net architecture for topological domains
TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.
-
CTQWformer: A CTQW-based Transformer for Graph Classification
CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.
-
Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning
SoftBlobGIN combines ESM-2 representations with protein contact graphs via a lightweight GNN and differentiable substructure pooling to achieve 92.8% accuracy on enzyme classification, raise binding-site AUROC to 0.983, and generate auditable structural explanations without retraining the language model.
-
SGC-RML: A reliable and interpretable longitudinal assessment for PD in real-world DNS
SGC-RML creates an 8D symptom atlas from multimodal PD data and integrates conformal calibration to deliver reliable, rejectable longitudinal assessments.
-
Graphlets as Building Blocks for Structural Vocabulary in Knowledge Graph Foundation Models
Graphlets mined as structural tokens improve zero-shot inductive and transductive link prediction in knowledge graph foundation models across 51 diverse graphs.
-
Robustness of Graph Self-Supervised Learning to Real-World Noise: A Case Study on Text-Driven Biomedical Graphs
Feature reconstruction in GSSL is robust to noise in text-driven biomedical graphs while relation reconstruction is sensitive, with bidirectional GNN architectures performing better on noisy data and yielding up to 7% gains over language model baselines.
-
LUMINA: A Grid Foundation Model for Benchmarking AC Optimal Power Flow Surrogate Learning
LUMINA-Bench is a standardized evaluation framework for ACOPF surrogate models that tests generalization across multiple grid topologies using accuracy and physics-constraint metrics.
-
Graph Transformers and Stabilized Reinforcement Learning for Large-Scale Dynamic Routing Modulation and Spectrum Allocation in Elastic Optical Networks
A graph transformer with RL stabilizations is the first to exceed benchmarks for dynamic RMSA, supporting up to 13% more traffic load on networks up to 143 nodes.
-
Empowering Heterogeneous Graph Foundation Models via Decoupled Relation Alignment
DRSA provides a plug-and-play alignment framework that decouples features and relations to prevent type collapse and relation confusion in heterogeneous graph foundation models.
-
Advancing Edge Classification through High-Dimensional Causal Modeling of Node-Edge Interplay
CECF is a new causal framework for edge classification that balances high-dimensional edge features against node influences via GNN embeddings and cross-attention to achieve better performance than standard methods.
-
PiGGO: Physics-Guided Learnable Graph Kalman Filters for Virtual Sensing of Nonlinear Dynamic Structures under Uncertainty
PiGGO integrates a learned graph neural ODE as the continuous-time dynamics model within an extended Kalman filter to enable online virtual sensing and uncertainty-aware state estimation for nonlinear dynamic systems with unknown model form and sparse sensing.
-
Hamiltonian Graph Inference Networks: Joint structure discovery and dynamics prediction for lattice Hamiltonian systems from trajectory data
HGIN jointly recovers interaction graphs and predicts trajectories for lattice Hamiltonian systems from data, achieving six to thirteen orders of magnitude lower long-time errors than baselines on Klein-Gordon and discrete nonlinear Schrödinger lattices.
-
Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay
A structure-aware VAE generates realistic FC matrices for replay, combined with multi-level knowledge distillation and hierarchical contextual bandit sampling, to enable continual fMRI-based brain disorder diagnosis across sequentially arriving multi-site data without catastrophic forgetting.
-
CapBench: A Multi-PDK Dataset for Machine-Learning-Based Post-Layout Capacitance Extraction
CapBench is a new multi-PDK dataset of post-layout 3D windows with high-fidelity capacitance labels and multiple ML-ready representations, plus baseline results showing CNN accuracy versus GNN speed trade-offs.
-
Graph-RHO: Critical-path-aware Heterogeneous Graph Network for Long-Horizon Flexible Job-Shop Scheduling
Graph-RHO is a critical-path-aware heterogeneous graph network for rolling horizon optimization in flexible job-shop scheduling that achieves state-of-the-art solution quality and over 30% faster solve times on large instances.
-
SCOT: Multi-Source Cross-City Transfer with Optimal-Transport Soft-Correspondence Objective
SCOT uses Sinkhorn entropic optimal transport to learn explicit soft correspondences between unequal region sets for multi-source cross-city transfer, adding contrastive sharpening and cycle reconstruction for stability and a prototype hub for multi-source alignment.
-
Graph Topology Information Enhanced Heterogeneous Graph Representation Learning
ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.
-
Hierarchical Mesh Transformers with Topology-Guided Pretraining for Morphometric Analysis of Brain Structures
A hierarchical mesh transformer using topology-guided pretraining on simplicial complexes achieves state-of-the-art results on Alzheimer's classification, amyloid prediction, and focal cortical dysplasia detection from brain meshes.
-
EndoVGGT: GNN-Enhanced Depth Estimation for Surgical 3D Reconstruction
EndoVGGT uses a dynamic DeGAT graph attention module to improve depth estimation and non-rigid 3D reconstruction in surgery, reporting 24.6% PSNR and 9.1% SSIM gains on SCARED with zero-shot generalization to new domains.
-
GraphScout: Empowering Large Language Models with Intrinsic Exploration Ability for Agentic Graph Reasoning
GraphScout trains LLMs to autonomously synthesize structured training data from knowledge graphs via flexible exploration tools, enabling a 4B model to outperform larger LLMs by 16.7% on average with fewer inference tokens and strong cross-domain transfer.
-
Learning to Reconstruct: A Differentiable Approach to Muon Tracking at the LHC
A differentiable end-to-end model combining graph attention networks with clustering and fitting improves muon track reconstruction and pT estimation at the LHC compared to factorized approaches.
-
Cross-Paradigm Graph Backdoor Attacks with Promptable Subgraph Triggers
CP-GBA distills a queryable repository of promptable subgraph triggers via graph prompt learning to achieve transferable backdoor attacks on GNNs with state-of-the-art success rates across paradigms and defenses.
-
When LLM Agents Meet Graph Optimization: An Automated Data Quality Improvement Approach
LAGA is a unified multi-agent LLM framework that automates comprehensive quality optimization for text-attributed graphs by running detection, planning, action, and evaluation agents in a closed loop.
-
Effective Capacitance Modeling Using Graph Neural Networks
GNN-Ceff is the first graph neural network model for post-layout effective capacitance prediction in VLSI circuits, delivering up to 929x speedup over serial state-of-the-art methods with improved accuracy on real benchmarks.
-
Doloris: Dual Conditional Diffusion Implicit Bridges with Sparsity Masking Strategy for Unpaired Single-Cell Perturbation Estimation
Doloris introduces dual conditional diffusion implicit bridges plus a sparsity masking strategy to model unpaired single-cell perturbation responses and reports state-of-the-art results on public datasets.
-
Mamba-Based Graph Convolutional Networks: Tackling Over-smoothing with Selective State Space
MbaGCN combines message aggregation, selective state space transitions, and node state prediction to create a more scalable deep graph convolutional network.
-
Direction for Detection: A Survey of Automated Vulnerability Detection and all of its Pain Points
ML4AVD research remains locked into binary function-level classification of C/C++ vulnerabilities because twelve pain points in the pipeline reinforce each other through feedback loops.
-
GRAPE: Graph-Augmented Prototype Explanations for Interactive Medical Image Diagnosis
GRAPE augments prototype medical image classifiers with graph attention for co-occurrence, a mismatch safety check, and open-vocabulary anchoring to support incremental addition of findings from single examples.
-
HIA-GAT: A Heterogeneous Interaction-Aware Graph Attention Network For Frame-Level Traffic Conflict Risk Prediction On Freeways
HIA-GAT, a heterogeneous graph attention network with conflict-type-aware gating, reports the highest AUC for frame-level risk prediction on NGSIM I-80 and US-101 datasets, with largest gains on lateral (PET) conflicts.
-
MMGNN: Multi-level, multi-color graph neural networks for molecular property prediction
MMGNN decomposes molecular graphs into multi-color subgraphs by atom-type pairs and applies shared message-passing per subgraph, achieving top macro AUC-ROC of 0.838 on classification and best RMSE on ESOL and FreeSolv among tested models.
-
GraspLLM: Towards Zero-Shot Generalization on Text-Attributed Graphs with LLMs
GraspLLM extracts dataset-agnostic structural patterns via motif contrastive learning and aligns contextual subgraphs to LLM tokens, outperforming prior LLM-based methods on TAGs especially in zero-shot settings.
-
From Coarse to Fine: Managing Temporal Granularity in Spatio-Temporal Data for Fine-Grained Traffic Prediction
STRP is a granularity-aware model that predicts fine-grained spatio-temporal traffic from coarse inputs via tree convolution and inverse dilated convolution, outperforming baselines on six datasets in window-based and duration-based settings.
-
Knowledge-Inclusive Adaptive Physics-Informed Neural Network for Microbial Interaction Modelling
A knowledge-inclusive PINN framework integrates metagenomics literature and network structure with gLV equations to model microbial interactions, achieving up to 53% improvement over prior methods.
-
AIS-Based Vessel Trajectory Prediction Using Memory-Augmented Neural Networks
Memory-augmented neural networks produce consistent performance gains over standard deep learning baselines on AIS vessel trajectory data from the Gulf of Mexico and New York Bight.