Izhak Elazar, Roee Aharoni, Jonathan Berant, and Reut Tsarfaty
4 Pith papers cite this work. Polarity classification is still indexing.
citation-role and citation-polarity summary
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
representative citing papers
LLMs perform in-context learning as trajectories through a structured low-dimensional conceptual belief space, with the structure visible in both behavior and internal representations and causally manipulable via interventions.
CLUES decomposes semantic uncertainty into separate ambiguity and instability scores for clinical Text-to-SQL, with instability via Schur complement, outperforming Kernel Language Entropy on failure prediction while enabling diagnostic triage.
A new pipeline uses interpretability to characterize concepts in preference data and shape rewards via feature or data interventions during LM post-training.
citing papers explorer
- Breaking the Solver Bottleneck: Training Task Generators at the Learnable Frontier
  PROPEL amortizes solver evaluation with a trained activation probe to optimize task generators toward a target solve rate, raising the share of learnable tasks from ~10% to ~20% in coding and SWE experiments.
- Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space
  LLMs perform in-context learning as trajectories through a structured low-dimensional conceptual belief space, with the structure visible in both behavior and internal representations and causally manipulable via interventions.
- Disentangling Ambiguity from Instability in Large Language Models: A Clinical Text-to-SQL Case Study
  CLUES decomposes semantic uncertainty into separate ambiguity and instability scores for clinical Text-to-SQL, with instability via Schur complement, outperforming Kernel Language Entropy on failure prediction while enabling diagnostic triage.
- Anatomy of Post-Training: Using Interpretability to Characterize Data and Shape the Learning Signal
  A new pipeline uses interpretability to characterize concepts in preference data and shape rewards via feature or data interventions during LM post-training.