arXiv:2402.03744 [cs]

Sebastian Farquhar, Jannik Kossen, Lorenz Kuhn, Yarin Gal · 2024 · Nature · DOI 10.1038/s41586-024-07421-0

15 Pith papers cite this work, alongside 581 external citations. Polarity classification is still indexing.

581 external citations · Crossref

representative citing papers

REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

cs.CL · 2026-05-12 · unverdicted · novelty 8.0

REALISTA optimizes continuous combinations of valid editing directions in latent space to produce realistic adversarial prompts that elicit hallucinations more effectively than prior methods, including on large reasoning models.

Quantifying the Reconstructability of Astrophysical Methods with Large Language Models and Information Theory: A Case Study in Spectral Reconstruction

astro-ph.IM · 2026-05-11 · unverdicted · novelty 7.0

LLMs prompted with increasing levels of text on TNO spectral reconstruction from photometry reveal an entropy floor where implementation variance persists, showing text alone cannot capture all tacit expert knowledge needed for exact replication.

Gradients with Respect to Semantics Preserving Embeddings Tell the Uncertainty of Large Language Models

cs.CL · 2026-05-06 · unverdicted · novelty 7.0

SemGrad is a gradient-based uncertainty quantification technique for free-form LLM generation that operates in semantic space using a Semantic Preservation Score to select stable embeddings.

Two Calls, Two Moments, and the Vote-Accuracy Curve of Repeated LLM Inference

cs.LG · 2026-05-05 · unverdicted · novelty 7.0 · 2 refs

Two calls per example identify the first two moments of latent correctness probability, enabling exact bounds on the vote-accuracy curve for any majority-vote budget under conditional i.i.d. assumptions.

SENECA: Small-Sample Discrete Entropy Estimation via Self-Consistent Missing Mass

cs.IT · 2026-05-01 · unverdicted · novelty 7.0

SENECA uses a novel self-consistent missing mass calculation to improve discrete entropy estimates in small-sample regimes and outperforms alternatives in numerical tests.

Evaluating Tool-Using Language Agents: Judge Reliability, Propagation Cascades, and Runtime Mitigation in AgentProp-Bench

cs.AI · 2026-04-17 · conditional · novelty 7.0

AgentProp-Bench shows substring judging agrees with humans at kappa=0.049, LLM ensemble at 0.432, bad-parameter injection propagates with ~0.62 probability, rejection and recovery are independent, and a runtime fix cuts hallucinations 23pp on GPT-4o-mini but not Gemini-2.0-Flash.

RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration

cs.CL · 2026-04-17 · unverdicted · novelty 7.0

RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.

OSCAR: Orchestrated Self-verification and Cross-path Refinement

cs.AI · 2026-04-02 · unverdicted · novelty 7.0

OSCAR reduces hallucinations in diffusion language models by localizing commitment uncertainty with cross-chain entropy on parallel trajectories and applying evidence-guided remasking.

Using Semantic Distance to Estimate Uncertainty in LLM-Based Code Generation

cs.SE · 2026-05-09 · unverdicted · novelty 6.0

Semantic distance on program execution behaviors improves uncertainty estimation for LLM code generation and outperforms prior sample-based methods across benchmarks and models.

OracleTSC: Oracle-Informed Reward Hurdle and Uncertainty Regularization for Traffic Signal Control

cs.AI · 2026-05-08 · unverdicted · novelty 6.0

OracleTSC introduces a reward hurdle and uncertainty regularization to stabilize LLM-based reinforcement learning for traffic signal control, delivering 75% lower travel time and 67% lower queue length on benchmarks plus cross-intersection generalization.

Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

Decision theory shows that LLM cascades are structurally limited by always incurring the cheap model's cost before deciding to escalate, with the best performance given by the envelope of pairwise cascades rather than fixed chains or many stages.

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.

Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

cs.LG · 2026-05-06 · unverdicted · novelty 6.0

Symmetric spectral diagnostics on attention are structurally blind to flow direction, with asymmetry G as the sole control parameter, yielding a two-axis test that distinguishes bottleneck versus diffuse hallucination modes with opposite polarity.

The Surprising Universality of LLM Outputs: A Real-Time Verification Primitive

cs.CR · 2026-04-28 · unverdicted · novelty 6.0

LLM token rank-frequency distributions converge to a shared Mandelbrot distribution across models and domains, enabling a microsecond-scale statistical primitive for provenance verification and black-box anomaly triage.

Hallucination Basins: A Dynamic Framework for Understanding and Controlling LLM Hallucinations

cs.CL · 2026-04-06 · unverdicted · novelty 5.0

LLM hallucinations arise from task-dependent basins in latent space, with separability varying by task and geometry-aware steering reducing their probability.

citing papers explorer

Showing 15 of 15 citing papers.

REALISTA: Realistic Latent Adversarial Attacks that Elicit LLM Hallucinations

cs.CL · 2026-05-12 · unverdicted · none · ref 69

REALISTA optimizes continuous combinations of valid editing directions in latent space to produce realistic adversarial prompts that elicit hallucinations more effectively than prior methods, including on large reasoning models.

Quantifying the Reconstructability of Astrophysical Methods with Large Language Models and Information Theory: A Case Study in Spectral Reconstruction

astro-ph.IM · 2026-05-11 · unverdicted · none · ref 10

LLMs prompted with increasing levels of text on TNO spectral reconstruction from photometry reveal an entropy floor where implementation variance persists, showing text alone cannot capture all tacit expert knowledge needed for exact replication.

Gradients with Respect to Semantics Preserving Embeddings Tell the Uncertainty of Large Language Models

cs.CL · 2026-05-06 · unverdicted · none · ref 31

SemGrad is a gradient-based uncertainty quantification technique for free-form LLM generation that operates in semantic space using a Semantic Preservation Score to select stable embeddings.

Two Calls, Two Moments, and the Vote-Accuracy Curve of Repeated LLM Inference

cs.LG · 2026-05-05 · unverdicted · none · ref 7 · 2 links

Two calls per example identify the first two moments of latent correctness probability, enabling exact bounds on the vote-accuracy curve for any majority-vote budget under conditional i.i.d. assumptions.

SENECA: Small-Sample Discrete Entropy Estimation via Self-Consistent Missing Mass

cs.IT · 2026-05-01 · unverdicted · none · ref 22

SENECA uses a novel self-consistent missing mass calculation to improve discrete entropy estimates in small-sample regimes and outperforms alternatives in numerical tests.

Evaluating Tool-Using Language Agents: Judge Reliability, Propagation Cascades, and Runtime Mitigation in AgentProp-Bench

cs.AI · 2026-04-17 · conditional · none · ref 7

AgentProp-Bench shows substring judging agrees with humans at kappa=0.049, LLM ensemble at 0.432, bad-parameter injection propagates with ~0.62 probability, rejection and recovery are independent, and a runtime fix cuts hallucinations 23pp on GPT-4o-mini but not Gemini-2.0-Flash.

RAGognizer: Hallucination-Aware Fine-Tuning via Detection Head Integration

cs.CL · 2026-04-17 · unverdicted · none · ref 10

RAGognizer adds a detection head to LLMs for joint training on generation and token-level hallucination detection, yielding SOTA detection and fewer hallucinations in RAG while preserving output quality.

OSCAR: Orchestrated Self-verification and Cross-path Refinement

cs.AI · 2026-04-02 · unverdicted · none · ref 1

OSCAR reduces hallucinations in diffusion language models by localizing commitment uncertainty with cross-chain entropy on parallel trajectories and applying evidence-guided remasking.

Using Semantic Distance to Estimate Uncertainty in LLM-Based Code Generation

cs.SE · 2026-05-09 · unverdicted · none · ref 7

Semantic distance on program execution behaviors improves uncertainty estimation for LLM code generation and outperforms prior sample-based methods across benchmarks and models.

OracleTSC: Oracle-Informed Reward Hurdle and Uncertainty Regularization for Traffic Signal Control

cs.AI · 2026-05-08 · unverdicted · none · ref 3

OracleTSC introduces a reward hurdle and uncertainty regularization to stabilize LLM-based reinforcement learning for traffic signal control, delivering 75% lower travel time and 67% lower queue length on benchmarks plus cross-intersection generalization.

Is Escalation Worth It? A Decision-Theoretic Characterization of LLM Cascades

cs.LG · 2026-05-07 · unverdicted · none · ref 66

Decision theory shows that LLM cascades are structurally limited by always incurring the cheap model's cost before deciding to escalate, with the best performance given by the envelope of pairwise cascades rather than fixed chains or many stages.

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

cs.CL · 2026-05-07 · unverdicted · none · ref 21

Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.

Self-Attention as Transport: Limits of Symmetric Spectral Diagnostics

cs.LG · 2026-05-06 · unverdicted · none · ref 9

Symmetric spectral diagnostics on attention are structurally blind to flow direction, with asymmetry G as the sole control parameter, yielding a two-axis test that distinguishes bottleneck versus diffuse hallucination modes with opposite polarity.

The Surprising Universality of LLM Outputs: A Real-Time Verification Primitive

cs.CR · 2026-04-28 · unverdicted · none · ref 2

LLM token rank-frequency distributions converge to a shared Mandelbrot distribution across models and domains, enabling a microsecond-scale statistical primitive for provenance verification and black-box anomaly triage.

Hallucination Basins: A Dynamic Framework for Understanding and Controlling LLM Hallucinations

cs.CL · 2026-04-06 · unverdicted · none · ref 7

LLM hallucinations arise from task-dependent basins in latent space, with separability varying by task and geometry-aware steering reducing their probability.
