10.3-22.9% of pass@k=0 math examples across GSM8K and MATH are recovered by a deterministic six-chain regime using activation grafting, showing a sampling blind spot in difficulty estimation.
hub
Contrastive decoding: Open-ended text generation as optimization
13 Pith papers cite this work. Polarity classification is still indexing.
hub tools
citation-role summary
citation-polarity summary
roles
background 2polarities
background 2representative citing papers
BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.
ESamp trains a test-time distiller to model LLM depth-wise representation transitions and biases decoding toward high prediction-error paths to increase semantic diversity.
TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.
An inference-time technique turns BPE-based LMs into byte- or character-level models, solving the prompt boundary problem while unifying vocabularies across different tokenizers.
VCM is a training-free decoding intervention that applies PMI-driven token elevation and variance-adaptive penalization to reduce repetitive degeneration in LLM open-ended generation.
ALMs encode audio evidence but override it with text in conflicts; GACL interpolates joint and same-audio scores to repair reversals, gaining 17.8 nAUC points under a 5pp faithfulness budget.
Grounded Decoding fuses full-RAG and retrieval-only next-token distributions via normalized geometric mean from a KL-barycenter to improve factual consistency and citation quality in RAG.
MAGS learns low-dimensional subspaces from correct versus incorrect reasoning traces and applies targeted projection corrections to attention heads when they deviate from the correctness manifold during inference.
Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.
DCO is an inference-time intervention that decomposes attention head outputs orthogonally to a dynamic context anchor and suppresses outlier components via Z-score to improve contextual faithfulness in Llama models.
LightEdit enables scalable lifelong knowledge editing in LLMs via selective knowledge retrieval and probability suppression during decoding, outperforming prior methods on ZSRE, Counterfact, and RIPE while reducing training costs.
citing papers explorer
-
Hard or Just Unreached? Diagnosing the Sampling Blind Spot in Math-Reasoning Difficulty Estimation
10.3-22.9% of pass@k=0 math examples across GSM8K and MATH are recovered by a deterministic six-chain regime using activation grafting, showing a sampling blind spot in difficulty estimation.
-
BOOKMARKS: Efficient Active Storyline Memory for Role-playing
BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.
-
Large Language Models Explore by Latent Distilling
ESamp trains a test-time distiller to model LLM depth-wise representation transitions and biases decoding toward high prediction-error paths to increase semantic diversity.
-
Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion
TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model families.
-
Sampling from Your Language Model One Byte at a Time
An inference-time technique turns BPE-based LMs into byte- or character-level models, solving the prompt boundary problem while unifying vocabularies across different tokenizers.
-
Breaking the Likelihood Trap: Variance-Calibrated Modulation for Large Language Model Decoding
VCM is a training-free decoding intervention that applies PMI-driven token elevation and variance-adaptive penalization to reduce repetitive degeneration in LLM open-ended generation.
-
Beyond Text Following: Repairable Arbitration Reversals in Audio-Language Models
ALMs encode audio evidence but override it with text in conflicts; GACL interpolates joint and same-audio scores to repair reversals, gaining 17.8 nAUC points under a 5pp faithfulness budget.
-
Grounded Decoding: Retrieval-Anchored Probability Fusion for Faithful RAG
Grounded Decoding fuses full-RAG and retrieval-only next-token distributions via normalized geometric mean from a KL-barycenter to improve factual consistency and citation quality in RAG.
-
Manifold-Guided Attention Steering
MAGS learns low-dimensional subspaces from correct versus incorrect reasoning traces and applies targeted projection corrections to attention heads when they deviate from the correctness manifold during inference.
-
Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits
Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.
-
Hallucinations as Orthogonal Noise: Inference-Time Manifold Alignment via Dynamic Contextual Orthogonalization
DCO is an inference-time intervention that decomposes attention head outputs orthogonally to a dynamic context anchor and suppresses outlier components via Z-score to improve contextual faithfulness in Llama models.
-
Towards Scalable Lifelong Knowledge Editing with Selective Knowledge Suppression
LightEdit enables scalable lifelong knowledge editing in LLMs via selective knowledge retrieval and probability suppression during decoding, outperforming prior methods on ZSRE, Counterfact, and RIPE while reducing training costs.
- Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution