{"total":17,"items":[{"citing_arxiv_id":"2607.00267","ref_index":125,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Validating Causal Abstraction Metrics on Simulated Complex Systems","primary_cat":"cs.LG","submitted_at":"2026-06-30T23:30:29+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Authors create a benchmark across discrete/continuous and static/dynamical systems and introduce the Causal Abstraction Error (CAE) metric that reliably distinguishes valid from invalid causal abstractions when it includes faithfulness testing.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.08748","ref_index":13,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"HydraQE: OSU's Submission for the IWSLT 2026 Speech Translation Metrics Shared Task","primary_cat":"cs.CL","submitted_at":"2026-06-07T17:38:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"HydraQE is a new end-to-end speech translation QE system using Qwen3-ASR backbone, sparsemax layer mixing, bidirectional Transformer, and multi-task curriculum training on human and pseudo labels that outperforms cascaded baselines.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.03817","ref_index":28,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Rethinking the Idiomaticity Decomposability Hypothesis: Evidence from Distributional Learning","primary_cat":"cs.CL","submitted_at":"2026-06-02T15:59:22+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Language models show idiom decomposability correlates weakly with human judgments, negatively with syntactic flexibility, and contributes most 
strongly to representation stabilization during training alongside surprisal and frequency.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.02953","ref_index":28,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Linguistic Productivity in Large Language Models: Models Coerce, but do not Preempt","primary_cat":"cs.CL","submitted_at":"2026-06-01T23:11:39+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Larger LLMs reproduce constructional productivity via entrenchment in coercion cases with nonce words but fail to use statistical preemption to avoid overgeneralizing semantically plausible but unobserved patterns.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2606.00356","ref_index":7,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"How Far Do Auto-Interpretation Labels Generalize: A Controlled Study Across Languages, Scripts, and Rewordings","primary_cat":"cs.CL","submitted_at":"2026-05-29T20:59:28+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Auto-interpretation labels for SAE features generalize poorly across languages and scripts, missing the same semantic content up to 4x more often in Serbian than English and more in Cyrillic than Latin despite deterministic transliteration.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.19908","ref_index":11,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Where Does Authorship Signal Emerge in Encoder-Based Language 
Models?","primary_cat":"cs.CL","submitted_at":"2026-05-19T14:37:51+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"Different scoring mechanisms cause encoder-based authorship attribution models to consolidate authorship signals at different layers, as shown by causal interventions and gradient analysis.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.16991","ref_index":15,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Response-free item difficulty modelling for multiple-choice items with fine-tuned transformers: Component-wise representation and multi-task learning","primary_cat":"cs.CL","submitted_at":"2026-05-16T13:22:57+00:00","verdict":"CONDITIONAL","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Fine-tuned transformers with multi-task learning recover substantial wording-derived signal for item difficulty at small sample sizes typical in applied testing.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.15300","ref_index":163,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Deep Pre-Alignment for VLMs","primary_cat":"cs.CV","submitted_at":"2026-05-14T18:14:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Deep Pre-Alignment uses a small VLM perceiver instead of ViT to pre-align visual features with LLM text space, yielding 1.9-3.0 point gains on multimodal benchmarks and 32.9% less language forgetting.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.14125","ref_index":41,"ref_count":2,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Polar probe linearly decodes semantic structures from 
LLMs","primary_cat":"cs.CL","submitted_at":"2026-05-13T21:21:10+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"LLMs represent semantic relations geometrically via embedding distance and direction; a linear Polar Probe decodes these structures from middle-layer activations and generalizes to new entities.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2605.07622","ref_index":65,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Is She Even Relevant? When BERT Ignores Explicit Gender Cues","primary_cat":"cs.CL","submitted_at":"2026-05-08T11:48:22+00:00","verdict":"CONDITIONAL","verdict_confidence":"MODERATE","novelty_score":7.0,"formal_verification":"none","one_line_summary":"A Dutch BERT model encodes gender linearly by epoch 20 but does not dynamically update its representations when explicit female cues contradict learned stereotypical associations in short sentence templates.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2604.24374","ref_index":16,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"MIPIC: Matryoshka Representation Learning via Self-Distilled Intra-Relational and Progressive Information Chaining","primary_cat":"cs.CL","submitted_at":"2026-04-27T12:07:40+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"MIPIC trains Matryoshka representations using self-distilled intra-relational alignment and progressive information chaining, yielding competitive results on STS, NLI, and classification tasks especially at low dimensions.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"bone configurations with a representation hierarchy D={16,32,64,128,256,512,768}, we align the 
first 6 lower-dimensional prefixes against the full-dimensional teacher. We set k_i = max(8, ⌈γ_i · m⌉), where m is the sequence length. The ratio γ_i increases monotonically with the dimension size: specifically, we utilize γ = [0.2,0.3,0.4,0.5,0.6,0.7] corresponding to the dimensions [16,32,64,128,256,512]. This schedule ensures that the most compressed representations (d = 16) focus on the top 20% salient tokens, while larger prefixes progressively incorporate up to 70% of the context, with a minimum floor of k_min = 8 tokens to preserve basic sentence structure in short inputs. Layers and checkpoints applied in MIPIC. In our framework, let L denote the set of layers"},{"citing_arxiv_id":"2604.03480","ref_index":9,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Large Language Models Align with the Human Brain during Creative Thinking","primary_cat":"q-bio.NC","submitted_at":"2026-04-03T22:02:15+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":7.0,"formal_verification":"none","one_line_summary":"LLMs show scaling and training-dependent alignment with human brain responses in creativity-related networks during divergent thinking tasks, measured via RSA on fMRI data.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2512.03676","ref_index":28,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Different types of syntactic agreement recruit the same units within large language models","primary_cat":"cs.CL","submitted_at":"2025-12-03T11:07:50+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Different types of syntactic agreement recruit overlapping units within LLMs, indicating that agreement forms a meaningful functional category across English, Russian, Chinese, and structurally similar 
languages.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2509.20237","ref_index":30,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models","primary_cat":"cs.CL","submitted_at":"2025-09-24T15:27:44+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":4.0,"formal_verification":"none","one_line_summary":"Fine-tuning on annotated English and Japanese dialogues improves clustering of backchannels and fillers and makes generated utterances closer to human ones.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2506.02132","ref_index":21,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Model Internal Sleuthing: Finding Lexical Identity and Inflectional Features in Modern Language Models","primary_cat":"cs.CL","submitted_at":"2025-06-02T18:01:56+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":5.0,"formal_verification":"none","one_line_summary":"Inflectional features stay linearly decodable across all layers while lexical identity weakens with depth in modern transformers.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null},{"citing_arxiv_id":"2303.08896","ref_index":11,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"SelfCheckGPT: Zero-Resource Black-Box Hallucination Detection for Generative Large Language Models","primary_cat":"cs.CL","submitted_at":"2023-03-15T19:31:21+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"SelfCheckGPT detects hallucinations by checking consistency across multiple sampled responses from black-box LLMs on WikiBio biography generation 
tasks.","context_count":1,"top_context_role":"background","top_context_polarity":"background","context_text":"Initial investigation showed that GPT-3 (text-davinci-003) will output either Yes or No 98% of the time, while any remaining outputs can be set to N/A. The output from prompting when comparing the i-th sentence against sample $S^n$ is converted to score $x_i^n$ through the mapping {Yes: 0.0, No: 1.0, N/A: 0.5}. The final inconsistency score is then calculated as: $S_{\\text{Prompt}}(i) = \\frac{1}{N} \\sum_{n=1}^{N} x_i^n$ (11). SelfCheckGPT-Prompt is illustrated in Figure 1. Note that our initial investigations found that less capable models such as GPT-3 (text-curie-001) or LLaMA failed to effectively perform consistency assessment via such prompting. 6 Data and Annotation. As, currently, there are no standard hallucination detection datasets available, we evaluate our hallu-"},{"citing_arxiv_id":"2010.03496","ref_index":22,"ref_count":1,"confidence":0.88,"is_internal_anchor":false,"paper_title":"Inductive Entity Representations from Text via Link Prediction","primary_cat":"cs.CL","submitted_at":"2020-10-07T16:04:06+00:00","verdict":"UNVERDICTED","verdict_confidence":"LOW","novelty_score":6.0,"formal_verification":"none","one_line_summary":"Entity representations learned from text via link prediction generalize to unseen entities and transfer to classification and retrieval with reported gains of 22% MRR, 16% accuracy, and 8.8% NDCG@10.","context_count":0,"top_context_role":null,"top_context_polarity":null,"context_text":null}],"limit":50,"offset":0}