Graph Attention Networks

Adriana Romero, Arantxa Casanova, Guillem Cucurull, Petar Veličković, Pietro Liò, Yoshua Bengio · 2017 · stat.ML · arXiv 1710.10903

Canonical reference. 70% of citing Pith papers cite this work as background.

197 Pith papers citing it

abstract

We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
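To make the masked self-attention described in the abstract concrete, here is a minimal single-head sketch in plain Python. The function name `gat_layer`, its argument names, the toy path graph, and the specific weights are assumptions chosen for illustration; multi-head attention and the output nonlinearity are left out, and each node is assumed to attend to itself as well as its neighbors. It is a sketch of the idea, not the authors' implementation.

```python
import math


def leaky_relu(x, slope=0.2):
    # Slope value is an illustrative assumption.
    return x if x > 0 else slope * x


def gat_layer(features, adjacency, weight, attn_vec):
    """Single-head graph attention layer (illustrative sketch).

    features:  N x F list of node feature rows
    adjacency: N x N 0/1 matrix; adjacency[i][j] == 1 means j is a neighbor of i
    weight:    F x F_out shared linear transform
    attn_vec:  attention vector of length 2 * F_out
    Returns (new_features, attention), where attention[i][j] is the
    weight node i places on node j (0.0 for non-neighbors).
    """
    n = len(features)
    f_out = len(weight[0])
    # Shared linear transform applied to every node: z_i = h_i W
    z = [[sum(h[k] * weight[k][c] for k in range(len(h))) for c in range(f_out)]
         for h in features]
    attention = [[0.0] * n for _ in range(n)]
    new_features = []
    for i in range(n):
        # Masking: only neighbors (plus the node itself) are scored,
        # so no global matrix operation over the whole graph is needed.
        nbrs = [j for j in range(n) if adjacency[i][j] or j == i]
        # z[i] + z[j] is list concatenation, giving the pair [z_i || z_j].
        scores = [leaky_relu(sum(a * v for a, v in zip(attn_vec, z[i] + z[j])))
                  for j in nbrs]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        for j, e in zip(nbrs, exps):
            attention[i][j] = e / total
        # Attention-weighted sum of transformed neighbor features.
        new_features.append(
            [sum(attention[i][j] * z[j][c] for j in nbrs) for c in range(f_out)]
        )
    return new_features, attention


# Toy example: a 3-node path graph 0 - 1 - 2 (hypothetical data).
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
attn_vec = [0.5, -0.5, 1.0, 0.0]

new_features, attention = gat_layer(features, adjacency, weight, attn_vec)
for row in attention:
    print([round(a, 3) for a in row])
```

Each row of `attention` sums to 1 over that node's neighborhood, and entries for non-neighbors stay at zero, which is the sense in which different neighbors receive different learned weights. Because the layer only needs each node's local neighborhood, the same parameters can be applied to a graph not seen during training.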

citation-role summary

background 15 · method 3 · baseline 2

citation-polarity summary

background 14 · use method 3 · baseline 2 · support 1

authors

Adriana Romero, Arantxa Casanova, Guillem Cucurull, Petar Veličković, Pietro Liò, Yoshua Bengio

representative citing papers

A document is worth a structured record: Principled inductive bias design for document recognition

cs.CV · 2025-07-11 · unverdicted · novelty 8.0

Introduces a method to design structure-specific relational inductive biases for a base transformer architecture, enabling end-to-end transcription of documents with intrinsic structures, demonstrated on sheet music, shape drawings, and mechanical engineering drawings.

PromptGNN-sim: Deep Fusion and Alignment of GNN and LLMs for Text-Attributed Graph Learning

cs.AI · 2026-06-29 · unverdicted · novelty 7.0

PromptGNN-sim uses GAT-based semantically aware neighborhood selection and structure-aware LLM prompts with bi-directional contrastive alignment to outperform prior GNN, LLM, and fusion methods on text-attributed graph datasets.

Redefining Maritime Anomaly Detection via Equation-Grounded Synthetic Anomalies

cs.LG · 2026-06-29 · unverdicted · novelty 7.0

Proposes equation-grounded taxonomy (unexpected AIS activity, route deviation, close approach) and LLM-guided synthesis pipeline to generate timestamp-labeled anomalies for evaluating maritime detection models.

Learning to Adaptively Allocate Gaussians for Arbitrary-Scale Image Super-Resolution

cs.CV · 2026-06-28 · unverdicted · novelty 7.0

QuADA-GS learns to predict local complexity-driven Gaussian densification from low-resolution inputs and uses Hierarchical Pointer Convolution for efficient arbitrary-scale super-resolution.

Dark Matter in Draco and Boötes I: Hints of a Core in an Ultra-Faint Dwarf from Simulation-Based Inference

astro-ph.GA · 2026-06-24 · unverdicted · novelty 7.0

GraphNPE recovers a significantly lower central density for Boötes I consistent with a core while Draco remains marginally cuspy, and demonstrates that higher-order velocity moments reduce bias in dynamical modeling.

Beyond Convolution: Advancing Hypergraph Neural Networks with Hypergraph U-Nets

cs.LG · 2026-06-08 · unverdicted · novelty 7.0

Introduces Hypergraph U-Nets with PHPool and PHUnpool operators derived from hierarchical clustering dendrograms for hypergraph reconstruction, classification, and anomaly detection.

Agentic multi-fidelity learning of quasiparticle and excitonic properties

cond-mat.mtrl-sci · 2026-06-05 · unverdicted · novelty 7.0

An agentic multi-fidelity learning method corrects numerical artifacts in GW-BSE excited-state calculations for 2D bilayers and improves quasiparticle gaps and exciton binding energies.

TRAPS: Therapeutic Response Analysis via Pathway-informed Stratification

cs.LG · 2026-06-05 · unverdicted · novelty 7.0

Benchmark of BINN, GraphPath, and PATH on 2622 TCGA patients shows PATH best for targeted therapy, BINN for survival, none useful for radiation, with GraphPath at 0.92 AUROC on prostate targeted therapy.

EpiFormer: Learning Antigen-Antibody Interactions for Epitope Prediction via Geometric Deep Learning

q-bio.QM · 2026-06-02 · unverdicted · novelty 7.0

EpiFormer improves epitope prediction F1 score by over 40% via early-fusion cross-attention in GNN layers and sparsity-aware objectives, while recovering known biology as emergent behavior.

Forecasting Conceptual Diffusion in Science: The Case of Quantum Computing

cs.SI · 2026-06-02 · unverdicted · novelty 7.0

LightGBM models on citation and diversity features predict exogenous diffusion of quantum computing concepts with R² up to 0.78 while endogenous reinforcement remains largely unpredictable after growth controls, with replications in other fields.

AbstainGNN: Teaching Graph Neural Networks to Abstain for Graph Classification

cs.LG · 2026-05-29 · unverdicted · novelty 7.0

AbstainGNN is a framework that jointly models prediction and abstention in GNNs for graph classification, using a PAC-Bayesian-derived unified objective and two-stage training to achieve better accuracy at given rejection rates than prior abstention methods.

Contrast to Detect: Dynamic Graph Contrastive Regularization for Unsupervised Anomaly Detection in Multivariate Time Series

cs.LG · 2026-05-22 · unverdicted · novelty 7.0

ContrastAD achieves highest mean F1 on all five MTS benchmarks and highest AUC on three by building DTW-based sparse graph snapshots and contrasting divergent pairs with a stable anchor instead of enforcing invariance.

Gaussian Sheaf Neural Networks

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

Gaussian Sheaf Neural Networks derive a sheaf Laplacian for Gaussian node features on graphs to preserve their geometric structure during message passing.

NeighborDiv: Training-free Zero-shot Generalist Graph Anomaly Detection via Neighbor Diversity

cs.LG · 2026-05-20 · unverdicted · novelty 7.0

NeighborDiv detects graph anomalies via variance of inter-neighbor feature similarities under a new Neighbor-to-Neighbor Diversity Paradigm, achieving SOTA results with zero volatility in zero-shot cross-domain settings.

Learning over Positive and Negative Edges with Contrastive Message Passing

cs.LG · 2026-05-18 · unverdicted · novelty 7.0

Contrastive Message Passing lets GNNs apply similarity-preserving transforms to positive edges and dissimilarity-inducing transforms to negative edges via soft positive semidefinite constraints on weights, yielding gains in low-label high-homophily regimes.

GraphIP-Bench: How Hard Is It to Steal a Graph Neural Network, and Can We Stop It?

cs.CR · 2026-05-12 · unverdicted · novelty 7.0 · 2 refs

GraphIP-Bench is a new unified benchmark showing GNN model extraction succeeds at moderate query budgets while most defenses fail to prevent it or retain verification signals on surrogates.

TopoU-Net: a U-Net architecture for topological domains

cs.LG · 2026-05-11 · unverdicted · novelty 7.0

TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.

CTQWformer: A CTQW-based Transformer for Graph Classification

cs.LG · 2026-05-10 · unverdicted · novelty 7.0

CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.

Structural Interpretations of Protein Language Model Representations via Differentiable Graph Partitioning

cs.LG · 2026-05-09 · unverdicted · novelty 7.0

SoftBlobGIN combines ESM-2 representations with protein contact graphs via a lightweight GNN and differentiable substructure pooling to achieve 92.8% accuracy on enzyme classification, raise binding-site AUROC to 0.983, and generate auditable structural explanations without retraining the language model.

SGC-RML: A reliable and interpretable longitudinal assessment for PD in real-world DNS

cs.LG · 2026-05-08 · unverdicted · novelty 7.0

SGC-RML creates an 8D symptom atlas from multimodal PD data and integrates conformal calibration to deliver reliable, rejectable longitudinal assessments.

Graphlets as Building Blocks for Structural Vocabulary in Knowledge Graph Foundation Models

cs.AI · 2026-05-07 · unverdicted · novelty 7.0

Graphlets mined as structural tokens improve zero-shot inductive and transductive link prediction in knowledge graph foundation models across 51 diverse graphs.

Robustness of Graph Self-Supervised Learning to Real-World Noise: A Case Study on Text-Driven Biomedical Graphs

cs.LG · 2026-05-06 · unverdicted · novelty 7.0

Feature reconstruction in GSSL is robust to noise in text-driven biomedical graphs while relation reconstruction is sensitive, with bidirectional GNN architectures performing better on noisy data and yielding up to 7% gains over language model baselines.

LUMINA: A Grid Foundation Model for Benchmarking AC Optimal Power Flow Surrogate Learning

cs.LG · 2026-05-04 · unverdicted · novelty 7.0

LUMINA-Bench is a standardized evaluation framework for ACOPF surrogate models that tests generalization across multiple grid topologies using accuracy and physics-constraint metrics.

Graph Transformers and Stabilized Reinforcement Learning for Large-Scale Dynamic Routing Modulation and Spectrum Allocation in Elastic Optical Networks

cs.NI · 2026-05-03 · unverdicted · novelty 7.0 · 2 refs

A graph transformer with RL stabilizations is the first to exceed benchmarks for dynamic RMSA, supporting up to 13% more traffic load on networks up to 143 nodes.

citing papers explorer

Showing 15 of 15 citing papers after filters.

TopoU-Net: a U-Net architecture for topological domains

cs.LG · 2026-05-11 · unverdicted · none · ref 29 · internal anchor

TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.

SGC-RML: A reliable and interpretable longitudinal assessment for PD in real-world DNS

cs.LG · 2026-05-08 · unverdicted · none · ref 38 · internal anchor

SGC-RML creates an 8D symptom atlas from multimodal PD data and integrates conformal calibration to deliver reliable, rejectable longitudinal assessments.

Direction for Detection: A Survey of Automated Vulnerability Detection and all of its Pain Points

cs.SE · 2024-12-15 · conditional · none · ref 123 · internal anchor

ML4AVD research remains locked into binary function-level classification of C/C++ vulnerabilities because twelve pain points in the pipeline reinforce each other through feedback loops.

ACT: Anti-Crosstalk Learning for Cross-Sectional Stock Ranking via Temporal Disentanglement and Structural Purification

cs.LG · 2026-04-22 · unverdicted · none · ref 38 · internal anchor

ACT disentangles temporal scales in stock sequences and purifies structural relations in graphs to achieve state-of-the-art cross-sectional stock ranking on CSI300 and CSI500 with up to 74.25% improvement.

Towards Predicting Multi-Vulnerability Attack Chains in Software Supply Chains from Software Bill of Materials Graphs

cs.SE · 2026-04-04 · unverdicted · none · ref 29 · internal anchor

The paper shows that heterogeneous graph attention networks can classify vulnerable components in real SBOMs at 91% accuracy and that a simple MLP can predict documented multi-vulnerability chains with 0.93 ROC-AUC.

MMP-Refer: Multimodal Path Retrieval-augmented LLMs For Explainable Recommendation

cs.IR · 2026-04-04 · conditional · none · ref 34 · internal anchor

MMP-Refer augments LLMs with multimodal retrieval paths and a trainable collaborative adapter to produce more accurate and explainable recommendations.

Mixture of Sequence: Theme-Aware Mixture-of-Experts for Long-Sequence Recommendation

cs.IR · 2026-03-01 · unverdicted · none · ref 108 · internal anchor

MoS applies theme-aware routing to extract multi-scale theme-specific subsequences from noisy long user sequences, achieving state-of-the-art recommendation performance with fewer FLOPs than comparable MoE models.

Attention U-Net: Learning Where to Look for the Pancreas

cs.CV · 2018-04-11 · unverdicted · none · ref 31 · internal anchor

Attention gates added to U-Net automatically focus on target organs in CT images and improve segmentation performance on abdominal datasets.

Multi-Level Graph Attention Network Contrastive Learning for Knowledge-Aware Recommendation

cs.IR · 2026-05-08 · unverdicted · none · ref 10 · internal anchor

A multi-level graph attention network with contrastive learning outperforms prior methods on knowledge-aware recommendation by improving generalization across three comparison perspectives.

Robustness of Spatio-temporal Graph Neural Networks for Fault Location in Partially Observable Distribution Grids

cs.LG · 2026-04-22 · unverdicted · none · ref 51 · 2 links · internal anchor

Measured-only STGNNs (RGATv2, RGSAGE) achieve up to 11 F1 points higher and 6x faster training than RNN baselines for fault location on the IEEE 123-bus feeder under partial observability.

Extracting Money Laundering Transactions from Quasi-Temporal Graph Representation

cs.LG · 2026-04-03 · unverdicted · none · ref 33 · internal anchor

ExSTraQt uses quasi-temporal graph representations and supervised learning to detect suspicious transactions, achieving F1 score uplifts of up to 1% on real data and over 8% on synthetic datasets compared to prior AML models.

What Are Adversaries Doing? Automating Tactics, Techniques, and Procedures Extraction: A Systematic Review

cs.SE · 2026-04-01 · accept · none · ref 154 · internal anchor

Systematic review of 80 papers shows TTP extraction shifting to transformer and LLM methods but limited by narrow datasets, single-label focus, and low reproducibility.

Explaining the Explainers in Graph Neural Networks: a Comparative Study

cs.LG · 2022-10-27 · unverdicted · none · ref 105 · internal anchor

Benchmark study of ten GNN explainers on eight architectures and six datasets that isolates usable components and issues practical recommendations.

Navigating Distribution Shifts in Medical Image Analysis: A Survey

eess.IV · 2024-11-05 · unverdicted · none · ref 196 · internal anchor

Survey categorizing DL methods for distribution shifts in MedIA by clinical scenarios, with analysis indicating constrained gains as domain information decreases and a shift toward uncertainty-aware modeling.

Explainable AI for Mental Disorder Detection via Social Media: A survey and outlook

cs.LG · 2024-06-10 · unverdicted · none · ref 88 · internal anchor

A literature survey reviewing traditional diagnostics, AI-driven studies, and explainable AI models for mental disorder detection via online social media, including datasets, evaluation practices, issues, and future directions.
