Title resolution pending

Accessed: · 2026 · arXiv 2404.18824

5 Pith papers cite this work. Polarity classification is still indexing.

Title metadata for this work has not finished resolving. The hub is built from the citation graph; the title resolver retries DOI and OpenAlex on its next pass.

representative citing papers

Measuring Evaluation-Context Divergence in Open-Weight LLMs: A Paired-Prompt Protocol with Pilot Evidence of Alignment-Pipeline-Specific Heterogeneity

cs.CL · 2026-05-07 · unverdicted · novelty 7.0

A new paired-prompt protocol reveals alignment-pipeline-specific heterogeneity in how open-weight LLMs respond to evaluation versus deployment framings.

AutoRISE: Agent-Driven Strategy Evolution for Red-Teaming Large Language Models

cs.CR · 2026-04-23 · unverdicted · novelty 7.0

AutoRISE evolves red-teaming attack strategies as editable executable programs via an agent, yielding 17-point higher average attack success rates than baselines across 11 models.

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors

cs.CL · 2026-04-23 · unverdicted · novelty 7.0

New RPS and AGS metrics show within-family distilled LLM agents have 5.9 pp higher tool-use graph similarity than cross-family pairs, with some models exceeding their teachers.

MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events

cs.CL · 2026-04-16 · unverdicted · novelty 7.0

MADE creates a contamination-resistant living benchmark for multi-label classification of medical device adverse events, with evaluations revealing model-specific trade-offs in accuracy and uncertainty quantification.

SPENCE: A Syntactic Probe for Detecting Contamination in NL2SQL Benchmarks

cs.CL · 2026-04-20 · unverdicted · novelty 6.0

SPENCE shows older NL2SQL benchmarks like Spider have high performance sensitivity to syntactic changes, indicating likely training contamination, while newer ones like BIRD show little sensitivity and appear largely clean.

citing papers explorer

Showing 5 of 5 citing papers.

Measuring Evaluation-Context Divergence in Open-Weight LLMs: A Paired-Prompt Protocol with Pilot Evidence of Alignment-Pipeline-Specific Heterogeneity

cs.CL · 2026-05-07 · unverdicted · none · ref 17

A new paired-prompt protocol reveals alignment-pipeline-specific heterogeneity in how open-weight LLMs respond to evaluation versus deployment framings.

AutoRISE: Agent-Driven Strategy Evolution for Red-Teaming Large Language Models

cs.CR · 2026-04-23 · unverdicted · none · ref 1

AutoRISE evolves red-teaming attack strategies as editable executable programs via an agent, yielding 17-point higher average attack success rates than baselines across 11 models.

When Agents Look the Same: Quantifying Distillation-Induced Similarity in Tool-Use Behaviors

cs.CL · 2026-04-23 · unverdicted · none · ref 7

New RPS and AGS metrics show within-family distilled LLM agents have 5.9 pp higher tool-use graph similarity than cross-family pairs, with some models exceeding their teachers.

MADE: A Living Benchmark for Multi-Label Text Classification with Uncertainty Quantification of Medical Device Adverse Events

cs.CL · 2026-04-16 · unverdicted · none · ref 8

MADE creates a contamination-resistant living benchmark for multi-label classification of medical device adverse events, with evaluations revealing model-specific trade-offs in accuracy and uncertainty quantification.

SPENCE: A Syntactic Probe for Detecting Contamination in NL2SQL Benchmarks

cs.CL · 2026-04-20 · unverdicted · none · ref 15

SPENCE shows older NL2SQL benchmarks like Spider have high performance sensitivity to syntactic changes, indicating likely training contamination, while newer ones like BIRD show little sensitivity and appear largely clean.
