Title resolution pending

Kawin Ethayarajh · 2019 · DOI 10.18653/v1/d19-1006

10 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Accurate and Efficient Statistical Testing for Word Semantic Breadth

cs.CL · 2026-05-08 · unverdicted · novelty 7.0

A new permutation test uses Householder reflection to align word embedding clouds before testing dispersion differences, cutting Type-I error by 32.5% and speeding up 23x on GPU.

Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL

cs.CL · 2026-04-22 · unverdicted · novelty 7.0

Parallel-SFT mixes parallel programs across languages during SFT to produce more transferable RL initializations, yielding better zero-shot generalization to unseen programming languages.

On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability

cs.IR · 2026-04-17 · unverdicted · novelty 7.0

LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with larger models and certain embedding geometry,

Spectral Lens: Activation and Gradient Spectra as Diagnostics of LLM Optimization

stat.ML · 2026-05-07 · unverdicted · novelty 6.0

Spectral analysis of activations and gradients provides new diagnostics that link batch size to representation geometry, early covariance tails to token efficiency, and spectral shifts to learning dynamics in decoder-only LLMs, backed by a mechanistic model.

Controlled Paraphrase Geometry in Sentence Embedding Space: Local Manifold Modeling and Latent Probing

cs.CL · 2026-05-01 · unverdicted · novelty 6.0

Nonlinear polynomial models fit local paraphrase embedding clouds more accurately than linear ones and support geometrically consistent synthetic point generation, yet this geometric fidelity does not improve classification performance.

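Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

cs.CL · 2026-05-01 · unverdicted · novelty 6.0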

Machine translation preserves embedding similarity structure for ten languages but distorts it for four in the Manifesto Corpus, via a new non-inferiority testing framework.

Reasoning-Based Refinement of Unsupervised Text Clusters with LLMs

cs.CL · 2026-04-08 · unverdicted · novelty 6.0

LLM reasoning refines unsupervised text clusters via coherence checks, redundancy removal, and label grounding, yielding better coherence and human-aligned labels on social media data.

Analyzing the Effect of Noise in LLM Fine-tuning

cs.LG · 2026-04-14 · unverdicted · novelty 5.0

Label noise hurts fine-tuning performance most while grammatical and typographical noise sometimes act as mild regularizers, with changes concentrated in task-specific layers.

Do BERT Embeddings Encode Narrative Dimensions? A Token-Level Probing Analysis of Time, Space, Causality, and Character in Fiction

cs.CL · 2026-04-12 · unverdicted · novelty 5.0

BERT embeddings encode narrative dimensions of time, space, causality, and character at the token level, as a linear probe achieves 94% accuracy versus 47% on variance-matched random embeddings, though unsupervised clusters do not align with these categories.

Model Internal Sleuthing: Finding Lexical Identity and Inflectional Features in Modern Language Models

cs.CL · 2025-06-02

citing papers explorer

Showing 10 of 10 citing papers.

Accurate and Efficient Statistical Testing for Word Semantic Breadth

cs.CL · 2026-05-08 · unverdicted · none · ref 35

A new permutation test uses Householder reflection to align word embedding clouds before testing dispersion differences, cutting Type-I error by 32.5% and speeding up 23x on GPU.

Parallel-SFT: Improving Zero-Shot Cross-Programming-Language Transfer for Code RL

cs.CL · 2026-04-22 · unverdicted · none · ref 29

Parallel-SFT mixes parallel programs across languages during SFT to produce more transferable RL initializations, yielding better zero-shot generalization to unseen programming languages.

On the Robustness of LLM-Based Dense Retrievers: A Systematic Analysis of Generalizability and Stability

cs.IR · 2026-04-17 · unverdicted · none · ref 17

LLM-based dense retrievers generalize better when instruction-tuned but pay a specialization tax when optimized for reasoning; they resist typos and corpus poisoning better than encoder-only baselines yet remain vulnerable to semantic perturbations, with larger models and certain embedding geometry,

Spectral Lens: Activation and Gradient Spectra as Diagnostics of LLM Optimization

stat.ML · 2026-05-07 · unverdicted · none · ref 37

Spectral analysis of activations and gradients provides new diagnostics that link batch size to representation geometry, early covariance tails to token efficiency, and spectral shifts to learning dynamics in decoder-only LLMs, backed by a mechanistic model.

Controlled Paraphrase Geometry in Sentence Embedding Space: Local Manifold Modeling and Latent Probing

cs.CL · 2026-05-01 · unverdicted · none · ref 17

Nonlinear polynomial models fit local paraphrase embedding clouds more accurately than linear ones and support geometrically consistent synthetic point generation, yet this geometric fidelity does not improve classification performance.

Is Textual Similarity Invariant under Machine Translation? Evidence Based on the Political Manifesto Corpus

cs.CL · 2026-05-01 · unverdicted · none · ref 22

Machine translation preserves embedding similarity structure for ten languages but distorts it for four in the Manifesto Corpus, via a new non-inferiority testing framework.

Reasoning-Based Refinement of Unsupervised Text Clusters with LLMs

cs.CL · 2026-04-08 · unverdicted · none · ref 16

LLM reasoning refines unsupervised text clusters via coherence checks, redundancy removal, and label grounding, yielding better coherence and human-aligned labels on social media data.

Analyzing the Effect of Noise in LLM Fine-tuning

cs.LG · 2026-04-14 · unverdicted · none · ref 3

Label noise hurts fine-tuning performance most while grammatical and typographical noise sometimes act as mild regularizers, with changes concentrated in task-specific layers.

Do BERT Embeddings Encode Narrative Dimensions? A Token-Level Probing Analysis of Time, Space, Causality, and Character in Fiction

cs.CL · 2026-04-12 · unverdicted · none · ref 15

BERT embeddings encode narrative dimensions of time, space, causality, and character at the token level, as a linear probe achieves 94% accuracy versus 47% on variance-matched random embeddings, though unsupervised clusters do not align with these categories.

Model Internal Sleuthing: Finding Lexical Identity and Inflectional Features in Modern Language Models

cs.CL · 2025-06-02 · unreviewed · ref 12
