Introduces a method to design structure-specific relational inductive biases for a base transformer architecture, enabling end-to-end transcription of documents with intrinsic structures, demonstrated on sheet music, shape drawings, and mechanical engineering drawings.
Graph Attention Networks
Canonical reference. 70% of citing Pith papers cite this work as background.
abstract
We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).
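The masked self-attention described in the abstract can be sketched for a single layer and a single head. The abstract does not spell out the scoring function, so the form used here (a LeakyReLU over a learned vector applied to the concatenated projected features, then a softmax restricted to each node's neighborhood) is taken from the commonly cited GAT formulation and should be read as an assumption. The toy graph, the weights `W` and `a`, and the helper names are all hypothetical illustration values, not anything from the paper.

```python
import math


def leaky_relu(x, slope=0.2):
    """LeakyReLU nonlinearity; the 0.2 slope is an assumed default."""
    return x if x > 0 else slope * x


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def gat_layer(features, neighbors, W, a):
    """Single-head masked self-attention over graph neighborhoods.

    features:  list of per-node feature vectors
    neighbors: dict mapping node index -> list of neighbor indices
               (self-loops included here by convention)
    W:         shared projection matrix (out_dim x in_dim)
    a:         attention vector of length 2 * out_dim
    Returns (new_features, attention), where attention[i] maps each
    neighbor j of node i to its normalized coefficient.
    """
    projected = [matvec(W, h) for h in features]
    outputs, attention = [], []
    for i, z_i in enumerate(projected):
        nbrs = neighbors[i]
        # Masking: scores are computed only for nodes in the neighborhood.
        scores = [
            leaky_relu(sum(w * x for w, x in zip(a, z_i + projected[j])))
            for j in nbrs
        ]
        # Numerically stable softmax over the neighborhood.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        alphas = [e / total for e in exps]
        # Weighted sum of the neighbors' projected features.
        out = [
            sum(al * projected[j][k] for al, j in zip(alphas, nbrs))
            for k in range(len(z_i))
        ]
        outputs.append(out)
        attention.append(dict(zip(nbrs, alphas)))
    return outputs, attention


# Hypothetical 3-node path graph 0 - 1 - 2 with self-loops.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [0, 1], 1: [0, 1, 2], 2: [1, 2]}
W = [[0.5, -0.2], [0.1, 0.3]]   # illustrative weights, not trained
a = [0.4, -0.1, 0.2, 0.3]       # illustrative attention vector

outputs, attention = gat_layer(features, neighbors, W, a)
```

Each node only needs its own neighbor list, so the layer involves no matrix inversion and no global view of the graph, which is the property the abstract ties to inductive use on unseen test graphs.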
representative citing papers
PromptGNN-sim uses GAT-based semantically aware neighborhood selection and structure-aware LLM prompts with bi-directional contrastive alignment to outperform prior GNN, LLM, and fusion methods on text-attributed graph datasets.
Proposes equation-grounded taxonomy (unexpected AIS activity, route deviation, close approach) and LLM-guided synthesis pipeline to generate timestamp-labeled anomalies for evaluating maritime detection models.
QuADA-GS learns to predict local complexity-driven Gaussian densification from low-resolution inputs and uses Hierarchical Pointer Convolution for efficient arbitrary-scale super-resolution.
GILP trains a parameterized backbone for valid actions and state predictions, then uses a consistency gate with LLM drafts to reduce hallucinated-state rate from 0.176 to 0.035 on GPT-4o-mini while raising success from 0.668 to 0.838.
GraphNPE recovers a significantly lower central density for Boötes I consistent with a core while Draco remains marginally cuspy, and demonstrates that higher-order velocity moments reduce bias in dynamical modeling.
DS-HGNN achieves lower RMSE for stress and displacement prediction on stiffened panels than six benchmark GNN models and matches top accuracy with 19-38% fewer training samples.
AGDN is a new GNN framework using a MixScore matrix and anisotropic graph diffusion to outperform prior methods on TSP instances across sizes and distributions.
A timestamp-aware spatio-temporal graph contrastive learning model for network intrusion detection outperforms other self-supervised methods on four datasets while matching supervised GNN performance.
SelfTICA reformulates collective-variable discovery as contrastive dynamical representation learning on time-lagged data, decoupling feature learning from slow-mode extraction to produce reusable collective variables from limited or biased trajectories.
Introduces Hypergraph U-Nets with PHPool and PHUnpool operators derived from hierarchical clustering dendrograms for hypergraph reconstruction, classification, and anomaly detection.
An agentic multi-fidelity learning method corrects numerical artifacts in GW-BSE excited-state calculations for 2D bilayers and improves quasiparticle gaps and exciton binding energies.
Benchmark of BINN, GraphPath, and PATH on 2622 TCGA patients shows PATH best for targeted therapy, BINN for survival, none useful for radiation, with GraphPath at 0.92 AUROC on prostate targeted therapy.
EpiFormer improves epitope prediction F1 score by over 40% via early-fusion cross-attention in GNN layers and sparsity-aware objectives, while recovering known biology as emergent behavior.
LightGBM models on citation and diversity features predict exogenous diffusion of quantum computing concepts with R² up to 0.78 while endogenous reinforcement remains largely unpredictable after growth controls, with replications in other fields.
AbstainGNN is a framework that jointly models prediction and abstention in GNNs for graph classification, using a PAC-Bayesian-derived unified objective and two-stage training to achieve better accuracy at given rejection rates than prior abstention methods.
ContrastAD achieves highest mean F1 on all five MTS benchmarks and highest AUC on three by building DTW-based sparse graph snapshots and contrasting divergent pairs with a stable anchor instead of enforcing invariance.
Gaussian Sheaf Neural Networks derive a sheaf Laplacian for Gaussian node features on graphs to preserve their geometric structure during message passing.
NeighborDiv detects graph anomalies via variance of inter-neighbor feature similarities under a new Neighbor-to-Neighbor Diversity Paradigm, achieving SOTA results with zero volatility in zero-shot cross-domain settings.
Contrastive Message Passing lets GNNs apply similarity-preserving transforms to positive edges and dissimilarity-inducing transforms to negative edges via soft positive semidefinite constraints on weights, yielding gains in low-label high-homophily regimes.
GraphIP-Bench is a new unified benchmark showing GNN model extraction succeeds at moderate query budgets while most defenses fail to prevent it or retain verification signals on surrogates.
TopoU-Net is a rank-path U-Net for combinatorial complexes that encodes by lifting cochains upward along incidences, decodes by transporting downward, and merges via skip connections at matched ranks.
CTQWformer fuses continuous-time quantum walks into a graph transformer and recurrent module to outperform standard GNNs and graph kernels on classification benchmarks.
SoftBlobGIN combines ESM-2 representations with protein contact graphs via a lightweight GNN and differentiable substructure pooling to achieve 92.8% accuracy on enzyme classification, raise binding-site AUROC to 0.983, and generate auditable structural explanations without retraining the language model.
citing papers explorer
- SCOT: Multi-Source Cross-City Transfer with Optimal-Transport Soft-Correspondence Objective
  SCOT uses Sinkhorn entropic optimal transport to learn explicit soft correspondences between unequal region sets for multi-source cross-city transfer, adding contrastive sharpening and cycle reconstruction for stability and a prototype hub for multi-source alignment.
- GRASP -- Graph-Based Anomaly Detection Through Self-Supervised Classification
  GRASP detects anomalies in system provenance graphs via self-supervised executable prediction from two-hop neighborhoods, outperforming prior PIDS on DARPA datasets by identifying all documented attacks where behaviors are learnable plus additional unlabeled suspicious activity.
- Do Larger Models Really Win in Drug Discovery? A Benchmark Assessment of Model Scaling in AI-Driven Molecular Property and Activity Prediction
  Benchmark across 78 endpoint-split entries finds classical ML winning 47.4% of best performances over pretrained models, GNNs, and LLMs, with performance depending on model-task-split fit rather than scale.