Hallucination detectors on LLM reasoning traces often rely on final-answer artifacts rather than reasoning validity; once controlled, lightweight lexical trajectory features suffice for robust detection.
hub
Fact-checking the output of large language models via token- level uncertainty quantification
11 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
CASPO trains LLMs via iterative direct preference optimization so that token-level confidence tracks step-wise correctness, then applies Confidence-aware Thought pruning at inference to improve both reliability and speed on reasoning benchmarks.
DisAAD trains a 1%-sized proxy model via adversarial distillation to quantify uncertainty in black-box LLMs by aligning with their output distributions.
KnowSA_CKP uses comparative knowledge probing to selectively augment LLM prompts for items with knowledge gaps, improving recommendation accuracy and context efficiency.
Adapts multi-layer token-level Mahalanobis distance with supervised linear regression to yield improved uncertainty scores for LLM truthfulness tasks.
A regression model using attention features and recurrent uncertainty scores improves selective generation in LLMs over unsupervised and supervised baselines on ten datasets and three models.
CRISTAL is a neurosymbolic framework that synthesizes interpretable probabilistic world models from language priors for full Bayesian analysis and budget-aware data acquisition, claiming Bayes-optimal accuracy on synthetic equity classification with 5 examples.
IUQ quantifies claim-level uncertainty in long-form LLM generation by combining inter-sample consistency and intra-sample faithfulness through an interrogate-then-respond approach and outperforms baselines on two datasets.
LLMs reflect users' privacy preferences in access control decisions with up to 86% agreement and can promote safer behavior, but personalization trades off higher individual match for potentially less secure results when users over-permission.
LLMs show improved accuracy on gastroenterology questions but remain overconfident in self-reported certainty across commercial, open-source, and quantized variants.
citing papers explorer
-
Confidence-Aware Alignment Makes Reasoning LLMs More Reliable
CASPO trains LLMs via iterative direct preference optimization so that token-level confidence tracks step-wise correctness, then applies Confidence-aware Thought pruning at inference to improve both reliability and speed on reasoning benchmarks.
-
Filling the Gaps: Selective Knowledge Augmentation for LLM Recommenders
KnowSA_CKP uses comparative knowledge probing to selectively augment LLM prompts for items with knowledge gaps, improving recommendation accuracy and context efficiency.