PLOT localizes causal variables in neural networks by fitting optimal transport couplings between abstract and neural intervention effect geometries, enabling fast handles or guided search.
Maheep Chaudhary and Atticus Geiger
4 Pith papers cite this work. Polarity classification is still indexing.
Years: 2026 (4)
Verdicts: UNVERDICTED (4)
Representative citing papers
AGCLR extends CoCoNuT with a gated concept stream for persistent memory to fix fact loss in latent reasoning, yielding improvements on reasoning benchmarks as depth increases.
RL preserves a larger fraction of base model circuits than SFT during fine-tuning on scientific QA, per a new head-level differential circuit vulnerability metric, at the cost of slower adaptation.
SAERec extracts fine-grained interpretable intents from LLM embeddings via sparse autoencoders and integrates them as priors into sequence recommendation using multi-branch attention, outperforming baselines on public datasets.
citing papers explorer
- Why Limit the Residual Stream to Layers and Not Tokens? Persistent Memory for Continuous Latent Reasoning
AGCLR extends CoCoNuT with a gated concept stream for persistent memory to fix fact loss in latent reasoning, yielding improvements on reasoning benchmarks as depth increases.