QAOD projects away question-aligned directions from answer representations to isolate domain-agnostic factuality signals, enabling efficient hallucination detection with top in-domain AUROC and up to 21% better OOD transfer.
Title resolution pending
16 Pith papers cite this work. Polarity classification is still indexing.
citation-role summary
citation-polarity summary
years
2026 16verdicts
UNVERDICTED 16representative citing papers
Fully Looped Transformer stabilizes looped training up to 12 iterations via distributed inter-loop signals and attention injection, improving downstream performance by up to 13.2%.
GravityGraphSAGE adapts GraphSAGE with a gravity-inspired decoder to outperform prior graph deep learning methods on directed link prediction across citation networks and 16 real-world graphs.
The paper introduces compositional interpretability as a category-theoretic framework that casts mechanistic explanations as commuting syntactic-semantic mappings optimized under faithfulness and complexity constraints derived from minimum description length.
LC-MAPF uses multi-round local communication between neighboring agents in a pre-trained model to outperform prior learning-based MAPF solvers on diverse unseen scenarios while preserving scalability.
KVM is a new block-recurrent compressed KV attention that turns transformers into O(N) chunked RNNs or growable sublinear-memory models while remaining implementable with standard operations.
LLMs contain identifiable COCO neurons that enable implicit self-correction against stereotypes; targeted editing of these neurons improves fairness and robustness to jailbreaks while preserving generation quality.
Structured Recurrent Mixers provide a dual parallel-recurrent representation for sequence models, claiming superior training efficiency, information capacity, and inference throughput over linear complexity alternatives.
Spherical flows on S^{d-1} with vMF noise reduce the continuity equation to a scalar ODE in cosine similarity, yielding posterior-weighted marginal velocity and score that enable ODE and predictor-corrector sampling for categorical sequences, with the posterior trained by cross-entropy and empirical
DiffMAS jointly optimizes latent communication and reasoning in multi-agent LLM systems via parameter-efficient supervised training on trajectories, yielding consistent gains over baselines on math, science, and code benchmarks.
ACT disentangles temporal scales in stock sequences and purifies structural relations in graphs to achieve state-of-the-art cross-sectional stock ranking on CSI300 and CSI500 with up to 74.25% improvement.
Poetic jailbreaks succeed because they induce distinct attention patterns in LLMs that are independent of harmful-content detection, not because models fail to recognize literary formatting.
FCP shards sequences at block level with flexible P2P communication and bin-packing to achieve near-linear scaling up to 256 GPUs and 1.13x-2.21x higher attention MFU in foundation model pre-training.
Toeplitz MLP Mixers replace attention with masked Toeplitz multiplications for sub-quadratic complexity while retaining more sequence information and outperforming on copying and in-context tasks.
PaSaMaster, a recursive self-evolving agentic literature retrieval system, reports 16.5× higher F1 than Google Scholar and 37.8% higher F1 than GPT-5.2 on PaSaMaster-Bench across 38 disciplines with zero source hallucination at ~1% cost.
Generative AI evaluation must shift from static benchmark scores to measuring sustained improvements in human capabilities within specific deployment contexts.
citing papers explorer
-
Learning to Communicate Locally for Large-Scale Multi-Agent Pathfinding
LC-MAPF uses multi-round local communication between neighboring agents in a pre-trained model to outperform prior learning-based MAPF solvers on diverse unseen scenarios while preserving scalability.