GraphIP-Bench shows that stealing GNNs is easy at moderate query budgets, that most defenses fail to block or reliably trace extraction, and that watermarks lose verification power on surrogates; heterophilic graphs prove harder to steal.
Graph Attention Networks
71 Pith papers cite this work. Polarity classification is still indexing.
abstract
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
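The attention mechanism described above can be sketched concisely. The following is a minimal single-head GAT layer in NumPy, written for illustration only (it is not the authors' reference implementation, and the ELU output nonlinearity and LeakyReLU slope of 0.2 are assumptions taken from common practice): each pair score is e_ij = LeakyReLU(a^T [Wh_i || Wh_j]), a masked softmax restricts attention to each node's neighborhood, and the output aggregates transformed neighbor features with those weights.

```python
import numpy as np

def gat_layer(H, A, W, a, slope=0.2):
    """Illustrative single-head GAT layer (sketch, not the reference code).

    H: (N, F) node features; A: (N, N) binary adjacency with self-loops;
    W: (F, F') shared linear transform; a: (2*F',) attention vector.
    Returns (output features, attention matrix).
    """
    Z = H @ W                     # (N, F') transformed node features
    N = Z.shape[0]
    # Pairwise scores e_ij = LeakyReLU(a^T [Wh_i || Wh_j])
    e = np.empty((N, N))
    for i in range(N):
        for j in range(N):
            s = np.concatenate([Z[i], Z[j]]) @ a
            e[i, j] = s if s > 0 else slope * s   # LeakyReLU
    # Masked softmax: attend only over each node's neighborhood
    e = np.where(A > 0, e, -np.inf)
    e -= e.max(axis=1, keepdims=True)             # numerical stability
    exp_e = np.exp(e)
    att = exp_e / exp_e.sum(axis=1, keepdims=True)
    # Attention-weighted aggregation, then ELU nonlinearity
    out = att @ Z
    return np.where(out > 0, out, np.exp(out) - 1), att
```

Because the mask is applied before the softmax, each attention row is a proper distribution over the node's neighbors only; no matrix inversion or global spectral decomposition is needed, which is the point the abstract makes about avoiding costly matrix operations.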
claims ledger
co-cited works
representative citing papers
SkyPart uses learnable prototypes for patch grouping, altitude modulation only in training, graph-attention readout, and Kendall-weighted loss to set new state-of-the-art single-pass performance on SUES-200, University-1652, and DenseUAV while widening gains under weather corruptions.
TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.
CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.
SoftBlobGIN combines ESM-2 representations with protein contact graphs via a lightweight GNN and differentiable substructure pooling to achieve 92.8% accuracy on enzyme classification, raise binding-site AUROC to 0.983, and generate auditable structural explanations without retraining the language model.
SGC-RML creates an 8D symptom atlas from multimodal PD data and integrates conformal calibration to deliver reliable, rejectable longitudinal assessments.
Graphlets mined as structural tokens improve zero-shot inductive and transductive link prediction in knowledge graph foundation models across 51 diverse graphs.
Feature reconstruction in GSSL is robust to noise in text-driven biomedical graphs while relation reconstruction is sensitive, with bidirectional GNN architectures performing better on noisy data and yielding up to 7% gains over language model baselines.
LUMINA-Bench is a standardized evaluation framework for ACOPF surrogate models that tests generalization across multiple grid topologies using accuracy and physics-constraint metrics.
Graph transformer RL for dynamic RMSA supports up to 13% more traffic than benchmarks on networks up to 143 nodes and 362 links.
DRSA provides a plug-and-play alignment framework that decouples features and relations to prevent type collapse and relation confusion in heterogeneous graph foundation models.
CECF is a new causal framework for edge classification that balances high-dimensional edge features against node influences via GNN embeddings and cross-attention to achieve better performance than standard methods.
PiGGO integrates a learned graph neural ODE as the continuous-time dynamics model within an extended Kalman filter to enable online virtual sensing and uncertainty-aware state estimation for nonlinear dynamic systems with unknown model form and sparse sensing.
HGIN jointly recovers interaction graphs and predicts trajectories for lattice Hamiltonian systems from data, achieving six to thirteen orders of magnitude lower long-time errors than baselines on Klein-Gordon and discrete nonlinear Schrödinger lattices.
A structure-aware VAE generates realistic FC matrices for replay, combined with multi-level knowledge distillation and hierarchical contextual bandit sampling, to enable continual fMRI-based brain disorder diagnosis across sequentially arriving multi-site data without catastrophic forgetting.
CapBench is a new multi-PDK dataset of post-layout 3D windows with high-fidelity capacitance labels and multiple ML-ready representations, plus baseline results showing CNN accuracy versus GNN speed trade-offs.
Graph-RHO is a critical-path-aware heterogeneous graph network for rolling horizon optimization in flexible job-shop scheduling that achieves state-of-the-art solution quality and over 30% faster solve times on large instances.
SCOT uses Sinkhorn entropic optimal transport to learn explicit soft correspondences between unequal region sets for multi-source cross-city transfer, adding contrastive sharpening and cycle reconstruction for stability and a prototype hub for multi-source alignment.
ToGRL learns high-quality graph structures from raw heterogeneous graphs via a two-stage topology extraction process and prompt tuning, outperforming prior methods on five datasets.
A hierarchical mesh transformer using topology-guided pretraining on simplicial complexes achieves state-of-the-art results on Alzheimer's classification, amyloid prediction, and focal cortical dysplasia detection from brain meshes.
Reinforcement learning policy for qubit mapping reduces SWAP overhead by 65-85% versus standard quantum compilers on MQTBench and Queko benchmark circuits.
GRASP detects anomalies in system provenance graphs via self-supervised executable prediction from two-hop neighborhoods, outperforming prior PIDS on DARPA datasets by identifying all documented attacks where behaviors are learnable plus additional unlabeled suspicious activity.
Mid-circuit stabilizer verification in six-qubit GSE-encoded Clifford Trotter steps reduces logical error rates by up to 54% on Barium ion hardware, with the gain vanishing if checks are deferred to circuit end.
GCCM prevents shortcut collapse in consistency models for graph prediction by using contrastive negative pairs and input feature perturbation, leading to better performance than deterministic baselines.
citing papers explorer
-
Learning to Compress and Transmit: Adaptive Rate Control for Semantic Communications over LEO Satellite-to-Ground Links
RL agent adaptively controls compression rate in semantic satellite communications to achieve 95% qualified image frames with no packet loss by using SNR predictions and queue management.
-
DCVD: Dual-Channel Cross-Modal Fusion for Joint Vulnerability Detection and Localization
DCVD performs joint function-level vulnerability detection and statement-level localization by extracting control-dependency and semantic features in parallel branches, fusing them with contrastive alignment and bidirectional cross-attention, and applying explicit supervision at both granularities.
-
Multi-Level Graph Attention Network Contrastive Learning for Knowledge-Aware Recommendation
A multi-level graph attention network with contrastive learning outperforms prior methods on knowledge-aware recommendation by improving generalization across three comparison perspectives.
-
PLMGH: What Matters in PLM-GNN Hybrids for Code Classification and Vulnerability Detection
Controlled experiments show PLM-GNN hybrids improve code tasks over GNN-only baselines, with PLM source having larger impact than GNN backbone.
-
Crystal Fractional Graph Neural Network for Energy Prediction of High-Entropy Alloys
A crystal fractional graph neural network fuses local graph attention on 16-atom environments with global composition fractions to predict high-entropy alloy energies at RMSE levels comparable to first-principles calculations on quaternary test structures.
-
Robustness of Spatio-temporal Graph Neural Networks for Fault Location in Partially Observable Distribution Grids
Measured-only STGNNs (RGATv2, RGSAGE) achieve up to 11 points higher F1 and 6x faster training than RNN baselines for fault location on the IEEE 123-bus feeder under partial observability.
-
Multi-Perspective Evidence Synthesis and Reasoning for Unsupervised Multimodal Entity Linking
MSR-MEL synthesizes instance-centric, group-level, lexical, and statistical evidence with LLMs and asymmetric teacher-student GNNs to outperform prior unsupervised methods on multimodal entity linking benchmarks.
-
AROMA: Augmented Reasoning Over a Multimodal Architecture for Virtual Cell Genetic Perturbation Modeling
AROMA combines text, graph topology, and protein sequences with augmented reasoning and two-stage optimization to deliver more accurate and interpretable predictions of genetic perturbation effects in virtual cells, outperforming baselines even in zero-shot and long-tail settings.
-
Inductive Subgraphs as Shortcuts: Causal Disentanglement for Heterophilic Graph Learning
Inductive subgraphs serve as shortcuts in heterophilic graphs, and CD-GNN disentangles spurious from causal subgraphs by blocking non-causal paths to improve robustness and accuracy.
-
TabEmb: Joint Semantic-Structure Embedding for Table Annotation
TabEmb decouples LLM-based semantic column embeddings from graph-based structural modeling to produce joint representations that improve table annotation tasks.
-
How Embeddings Shape Graph Neural Networks: Classical vs Quantum-Oriented Node Representations
Quantum-oriented embeddings deliver consistent gains on structure-driven graph datasets while classical baselines perform adequately on attribute-limited social graphs, under identical training pipelines across five TU datasets and binned QM9.
-
DIB-OD: Preserving the Invariant Core for Robust Heterogeneous Graph Adaptation via Decoupled Information Bottleneck and Online Distillation
DIB-OD isolates a stable invariant core in heterogeneous graph representations via orthogonal subspace decomposition, IB teacher-student distillation, HSIC independence, and confidence-gated regularization for improved cross-domain generalization.
-
Brain-Grasp: Graph-based Saliency Priors for Improved fMRI-based Visual Brain Decoding
Graph-informed saliency masks derived from fMRI signals are used to condition a single diffusion model, improving object structure and semantic fidelity in visual brain decoding.
-
From Load Tests to Live Streams: Graph Embedding-Based Anomaly Detection in Microservice Architectures
A GCN-GAE model learns node embeddings from directed weighted microservice graphs to flag anomalies via cosine similarity between load-test and live-event representations, with a synthetic injection framework reporting 96% precision.
-
Extracting Money Laundering Transactions from Quasi-Temporal Graph Representation
ExSTraQt uses quasi-temporal graph representations and supervised learning to detect suspicious transactions, achieving F1 score uplifts of up to 1% on real data and over 8% on synthetic datasets compared to prior AML models.
-
What Are Adversaries Doing? Automating Tactics, Techniques, and Procedures Extraction: A Systematic Review
Systematic review of 80 papers shows TTP extraction shifting to transformer and LLM methods but limited by narrow datasets, single-label focus, and low reproducibility.
-
Efficient and Scalable Granular-ball Graph Coarsening Method for Large-scale Graph Node Classification
A multi-granularity granular-ball coarsening algorithm reduces large graphs in linear time for faster GCN training on node classification, with experiments claiming superior performance over prior methods.
-
Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction
Large benchmark shows classical ML and GNNs outperform pretrained large models on most of 22 drug-discovery endpoints under strict cross-validation.
-
A Dual Cross-Attention Graph Learning Framework For Multimodal MRI-Based Major Depressive Disorder Detection
Dual cross-attention fusion of sMRI and rs-fMRI data achieves 84.71% accuracy in MDD detection on the REST-meta-MDD dataset, outperforming concatenation on functional atlases.
-
Bridging the Dimensionality Gap: A Taxonomy and Survey of 2D Vision Model Adaptation for 3D Analysis
The paper offers a taxonomy of 2D-to-3D adaptation strategies divided into data-centric projection, architecture-centric 3D networks, and hybrid methods that combine both.
-
Built Environment Reasoning from Remote Sensing Imagery Using Large Vision-Language Models
Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.