Attention is not not explanation

12 Pith papers cite this work. Polarity classification is still indexing.

citation-role summary: background 1, method 1

citation-polarity summary

years: 2026 (12)

representative citing papers

Logit-Contribution Scoring Identifies Non-Literal Retrieval Heads

cs.CL · 2026-07-01 · unverdicted · novelty 7.0

LOCOS scores attention heads via OV-circuit output projection onto answer-token unembedding directions and identifies non-literal retrieval heads whose ablation collapses performance on non-literal benchmarks more than prior literal-copy detectors.

Listening makes Vision Clear for VLMs

cs.CV · 2026-06-22 · unverdicted · novelty 6.0

PV-TAM uses prompt-side semantics and a bias filter to improve attention-based and IoU localization metrics for vision-language models over answer-side baselines.

Rigorous Interpretation Is a Form of Evaluation

cs.CY · 2026-05-06 · unverdicted · novelty 5.0

Rigorous interpretability can function as a principled form of model evaluation if its claims are falsifiable, reproducible, and predictive.
