{"total":13,"items":[{"citing_arxiv_id":"2606.22511","ref_index":35,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Breaking the Likelihood Trap: Variance-Calibrated Modulation for Large Language Model Decoding","primary_cat":"cs.CL","submitted_at":"2026-06-21T14:04:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"VCM is a training-free decoding intervention that applies PMI-driven token elevation and variance-adaptive penalization to reduce repetitive degeneration in LLM open-ended generation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.19636","ref_index":15,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Hard or Just Unreached? Diagnosing the Sampling Blind Spot in Math-Reasoning Difficulty Estimation","primary_cat":"cs.LG","submitted_at":"2026-06-17T22:31:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"10.3-22.9% of pass@k=0 math examples across GSM8K and MATH are recovered by a deterministic six-chain regime using activation grafting, showing a sampling blind spot in difficulty estimation.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.05161","ref_index":19,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Beyond Text Following: Repairable Arbitration Reversals in Audio-Language Models","primary_cat":"cs.SD","submitted_at":"2026-06-03T17:57:51+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"ALMs encode audio evidence but override it with text in conflicts; GACL interpolates joint and same-audio scores to repair reversals, gaining 17.8 nAUC points under a 5pp faithfulness 
budget.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03022","ref_index":61,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Hallucinations as Orthogonal Noise: Inference-Time Manifold Alignment via Dynamic Contextual Orthogonalization","primary_cat":"cs.CL","submitted_at":"2026-06-02T01:56:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"DCO is an inference-time intervention that decomposes attention head outputs orthogonally to a dynamic context anchor and suppresses outlier components via Z-score to improve contextual faithfulness in Llama models.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00432","ref_index":47,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Grounded Decoding: Retrieval-Anchored Probability Fusion for Faithful RAG","primary_cat":"cs.LG","submitted_at":"2026-05-29T23:47:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Grounded Decoding fuses full-RAG and retrieval-only next-token distributions via normalized geometric mean from a KL-barycenter to improve factual consistency and citation quality in RAG.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.21770","ref_index":11,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Manifold-Guided Attention Steering","primary_cat":"cs.LG","submitted_at":"2026-05-20T22:06:08+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"MAGS learns low-dimensional subspaces from correct versus incorrect reasoning traces and applies targeted projection corrections to attention heads when they deviate from 
the correctness manifold during inference.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14169","ref_index":59,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"BOOKMARKS: Efficient Active Storyline Memory for Role-playing","primary_cat":"cs.CL","submitted_at":"2026-05-13T22:48:24+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"BOOKMARKS introduces searchable bookmarks as reusable answers to storyline questions, enabling active initialization and passive synchronization for more consistent role-playing agent memory than recurrent summarization.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.09781","ref_index":39,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Parameter-Efficient Neuroevolution for Diverse LLM Generation: Quality-Diversity Optimization via Prompt Embedding Evolution","primary_cat":"cs.NE","submitted_at":"2026-05-10T22:00:15+00:00","verdict":null,"verdict_confidence":null,"novelty_score":null,"formal_verification":null,"one_line_summary":null,"context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"These embeddings influence LLM behavior without modifying base model weights, achieving competitive task performance with orders of magnitude fewer trainable parameters. P-Tuning v2 [41] demonstrated that deep prompt tuning across layers can match full fine-tuning performance. Parameter-efficient fine-tuning (PEFT) has emerged as a major paradigm for LLM adaptation [39]. LoRA [30] introduces low-rank weight updates, while adapter modules [29] insert small trainable layers. QLoRA [11] enables efficient fine-tuning of quantized 70B+ models on single GPUs. 
These methods demonstrate that effective LLM adaptation is possible with minimal parameter updates; our approach extends this insight to evolutionary optimization."},{"citing_arxiv_id":"2605.05953","ref_index":29,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits","primary_cat":"cs.CL","submitted_at":"2026-05-07T10:02:27+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"sum node: C_n(z_{sc(n)}) = ∑_{c∈ch(n)} w_{n,c} C_c(z_{sc(c)}), w_{n,c} ≥ 0, ∑_c w_{n,c} = 1; product node: C_n(z_{sc(n)}) = ∏_{c∈ch(n)} C_c(z_{sc(c)}), where q_n(·; η_n) is a tractable parametric density with learnable parameters η_n, and ch(n) denotes the children of n. We assume C is smooth and decomposable, so that C(z) := C_root(z) is a valid density on Z admitting exact, linear-time inference in |N| [29, 7]. Building upon this foundational structure, we tailor the internal node operations to capture the specific geometric properties of LLM representations as follows. Product and sum nodes. In our architecture, Product Nodes encode context-specific independence assumptions among disjoint subsets of the latent features. 
Conversely, Sum Nodes model distinct"},{"citing_arxiv_id":"2604.24927","ref_index":3,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Large Language Models Explore by Latent Distilling","primary_cat":"cs.CL","submitted_at":"2026-04-27T19:08:13+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"ESamp trains a test-time distiller to model LLM depth-wise representation transitions and biases decoding toward high prediction-error paths to increase semantic diversity.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.19089","ref_index":149,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Towards Scalable Lifelong Knowledge Editing with Selective Knowledge Suppression","primary_cat":"cs.AI","submitted_at":"2026-04-21T05:02:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"LightEdit enables scalable lifelong knowledge editing in LLMs via selective knowledge retrieval and probability suppression during decoding, outperforming prior methods on ZSRE, Counterfact, and RIPE while reducing training costs.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.18106","ref_index":41,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Efficient Low-Resource Language Adaptation via Multi-Source Dynamic Logit Fusion","primary_cat":"cs.CL","submitted_at":"2026-04-20T11:24:42+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"TriMix dynamically fuses logits from three model sources to outperform baselines and Proxy Tuning on eight low-resource languages across four model 
families.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.14123","ref_index":32,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Sampling from Your Language Model One Byte at a Time","primary_cat":"cs.CL","submitted_at":"2025-06-17T02:37:04+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"An inference-time technique turns BPE-based LMs into byte- or character-level models, solving the prompt boundary problem while unifying vocabularies across different tokenizers.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}