Artificial Aphasias in Lesioned Language Models

Cory Shain; Jill Kries; Laura Gwilliams; Nathan Roll

arxiv: 2605.16222 · v1 · pith:ODWWHTEEnew · submitted 2026-05-15 · 💻 cs.CL · cs.LG

Pith reviewed 2026-05-20 18:26 UTC · model grok-4.3

keywords: aphasia, language models, parameter lesioning, symptom profiles, functional organization, attention components, feed-forward components, layer depth

The pith

Lesioning parameters in language models produces aphasia-like symptoms but in patterns that differ qualitatively from human cases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a method that selectively zeros out parameters in language models and then applies the Text Aphasia Battery to diagnose the resulting deficits in generated text. Across more than one hundred thousand outputs from five billion-parameter models, every major aphasia symptom appears, yet the overall distributions and combinations of symptoms differ from those documented in humans. The work identifies consistent differences between attention-related components and feed-forward components, along with a depth effect in which early layers more often disrupt syntax and meaning while middle-to-late layers more often disrupt sound and fluency. These observations support the claim that aphasia syndromes arise from the specific details of how language is learned and processed rather than from any generic disruption to language capacity.

Core claim

Lesioning language models by zeroing parameters causes the full range of aphasia symptoms to surface when outputs are scored with the Text Aphasia Battery, but the profiles differ in distribution from human patients. Symptom patterns vary between attention components and feed-forward components and also vary with layer depth, where early layers produce more syntactic and semantic deficits and late-middle layers produce more phonological and fluency deficits. Although some lesions yield profiles that are quantitatively closer to particular human aphasia types, the qualitative mismatches indicate that such syndromes are shaped by the concrete mechanisms of learning and processing rather than a

What carries the argument

Selective zeroing of model parameters followed by symptom diagnosis with the Text Aphasia Battery, used to compare impairment profiles across attention versus feed-forward components and across layer depths.
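
The lesioning step can be illustrated with a minimal sketch. It assumes weights stored as a plain dictionary of lists, hypothetical component names (`attn_k`, `ffn_gate`), and a `severity` fraction like the one mentioned in the figure captions. Choosing which entries to zero by uniform random sampling is an assumption made for illustration, not the paper's documented procedure.

```python
import random

def lesion(weights, component, severity, seed=0):
    """Return a copy of `weights` with a fraction `severity` of the entries
    in `component` set to zero. Other components are left untouched."""
    if not 0.0 <= severity <= 1.0:
        raise ValueError("severity must be in [0, 1]")
    rng = random.Random(seed)
    # Copy every parameter list so the intact model is never modified.
    lesioned = {name: list(values) for name, values in weights.items()}
    target = lesioned[component]
    n_zero = round(severity * len(target))
    # Assumption: entries to zero are picked uniformly at random.
    for idx in rng.sample(range(len(target)), n_zero):
        target[idx] = 0.0
    return lesioned

# Hypothetical toy "model": two components with four parameters each.
toy = {"attn_k": [0.5, -1.2, 0.3, 0.9], "ffn_gate": [1.1, 0.4, -0.7, 0.2]}
full = lesion(toy, "ffn_gate", severity=1.0)
half = lesion(toy, "attn_k", severity=0.5, seed=1)
```

Returning a copy keeps the intact weights available, so the same starting point can be reused across components, severities, and seeds.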

If this is right

Symptom profiles differ systematically between attention components and feed-forward components.
Early-layer lesions disproportionately produce syntactic and semantic deficits.
Late-middle-layer lesions produce higher rates of phonological and fluency deficits.
Quantitative resemblance to some human aphasia types occurs, yet qualitative pattern differences remain.
Aphasia syndromes reflect the particular learning and processing details of the system rather than domain-invariant consequences of disruption.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same lesioning approach could be applied to other model families to test whether the observed component and depth effects generalize.
Training choices that alter how syntax or phonology is represented might shift the lesion-induced deficit patterns toward or away from human profiles.
Mapping which model parts produce which deficits could guide targeted improvements in robustness against specific language failures.

Load-bearing premise

That the Text Aphasia Battery applied to model-generated text produces symptom labels validly comparable to clinical diagnoses in humans.

What would settle it

Repeating the lesioning and scoring procedure on models trained with markedly different objectives or architectures and finding that symptom profiles now match human aphasia distributions in both quantitative similarity and qualitative clustering would undermine the conclusion.

Figures

Figures reproduced from arXiv: 2605.16222 by Cory Shain, Jill Kries, Laura Gwilliams, Nathan Roll.

**Figure 1.** Component lesions produce different TAB-symptom mixtures. (A) Raw stacked TAB-symptom rates per component, colored by symptom class and shaded by specific symptom; bars can exceed 1 because each response can receive multiple symptoms. Error bars show 95% CIs for total symptom burden. (B) Component-specific contribution, computed as each component’s per-symptom rate as a multiple of the all-output ablated m…
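
The note that stacked bars can exceed 1 follows from how per-symptom rates add up. The sketch below shows this under stated assumptions: each response is a set of symptom labels, and the label names and the four example outputs are hypothetical rather than taken from the paper's data.

```python
from collections import Counter

def symptom_rates(responses):
    """responses: list of sets of symptom labels, one set per response.
    Returns {symptom: fraction of responses showing that symptom}."""
    counts = Counter(label for labels in responses for label in labels)
    n = len(responses)
    return {label: counts[label] / n for label in sorted(counts)}

def total_burden(rates):
    """Sum of per-symptom rates; exceeds 1 when responses carry several symptoms."""
    return sum(rates.values())

# Hypothetical labels for four outputs from one lesioned component.
outputs = [
    {"repetition_loop", "off_topic"},
    {"repetition_loop"},
    {"off_topic", "agrammatism"},
    set(),  # a response with no symptoms still counts in the denominator
]
rates = symptom_rates(outputs)
burden = total_burden(rates)
```

Here one of the four responses is symptom-free, yet the summed rate is 1.25 because two responses carry two symptoms each.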

**Figure 2.** Depth influences the symptom profile induced by lesions. Each symptom row is min–max normalized across depth, so color marks where that symptom peaks instead of which symptom is most prevalent.
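
Row-wise min–max normalization, as described in the Figure 2 caption, can be sketched as follows. The depth bins and rates are hypothetical, and mapping constant rows to zeros is an assumption made to avoid division by zero.

```python
def min_max_rows(matrix):
    """Rescale each row to [0, 1] independently.
    Assumption: a constant row (no variation across depth) maps to all zeros."""
    out = []
    for row in matrix:
        lo, hi = min(row), max(row)
        span = hi - lo
        out.append([(x - lo) / span if span else 0.0 for x in row])
    return out

# Hypothetical rates for two symptoms across three depth bins.
rates_by_depth = [
    [0.25, 0.75, 0.5],     # a common symptom, peaking in the middle bin
    [0.0, 0.0625, 0.125],  # a rare symptom, peaking in the last bin
]
normalized = min_max_rows(rates_by_depth)
```

After normalization both rows span 0 to 1, so the rare symptom's peak is as visible as the common one's. This is why the colors show where a symptom peaks and not how prevalent it is.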

**Figure 3.** Matched random does not mimic targeted lesions. Gemma-3-1B-IT, layers 0–12, 100% severity, 3 seeds. (A) Mean response length. (B) Lexical diversity. Targeted lesions reduce both metrics; sparsity-matched random controls do not show the same targeted length/diversity reduction.
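
The two metrics in Figure 3 can be illustrated with a small sketch. The page does not say how the paper measures lexical diversity, so the type-token ratio over whitespace tokens used here is an assumption, and the example strings are hypothetical.

```python
def mean_response_length(responses):
    """Average number of whitespace-separated tokens per response."""
    return sum(len(r.split()) for r in responses) / len(responses)

def type_token_ratio(text):
    """Assumed diversity measure: distinct tokens divided by total tokens.
    Returns 0.0 for an empty response."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0

# Hypothetical outputs: a looping response and a varied one.
looping = "the the the the"
varied = "the cat sat down"
avg_len = mean_response_length([looping, varied])
ttr_looping = type_token_ratio(looping)
ttr_varied = type_token_ratio(varied)
```

Both responses have the same length, but the looping one scores a quarter of the diversity of the varied one, which is why the two metrics are reported separately.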

**Figure 4.** FFN-vs-attention split persists across decoding settings. 5 × 5 grid of temperature and repetition penalty (Gemma-3-1B-IT). (A) Gate-minus-K off-topic rate difference: nonnegative in every cell and strictly positive in 24/25 cells. (B) FFN-Gate repetition-loop symptom rates are concentrated at low-temperature/no-penalty settings and suppressed by higher repetition penalties; they are zero at T = 0.7, repet…

**Figure 5.** Humans and lesioned LMs differ in both burden and composition. (A) Human diagnosis groups vary in mean TAB-symptom count, whereas LM components have similar overall burdens. (B) Conditional on at least one positive TAB symptom, category shares sum over triggered symptoms and are not raw prevalence rates; human outputs distribute across semantic, syntactic, and fluency categories, while lesioned-LM outputs…

**Figure 6.** Restricting to cross-category pairs gives r = 0.296 (p = 0.006). Removing rows with repetition-loop symptoms doubles the correlation to r = 0.662 (p = 0.0001). The alignment strengthens on the high-agreement symptoms (r = 0.470; Section I). Under nucleus sampling (T = 0.7, p = 0.9, repetition penalty 1.2), the global Mantel alignment does not persist (r = 0.035, p = 0.36). The FFN-Gate/Up component diss…
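
The Mantel alignment mentioned in the Figure 6 text compares two symmetric matrices by correlating their off-diagonal entries and assessing significance by permutation. The sketch below is a generic version of that test. The matrices are hypothetical, and the one-sided p-value and 999 permutations are assumptions that need not match the paper's settings.

```python
import random
from itertools import combinations
from statistics import mean

def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))
    den = (sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y)) ** 0.5
    return num / den

def upper(m, order):
    """Upper-triangle entries of m after relabeling rows and columns by `order`."""
    return [m[order[i]][order[j]] for i, j in combinations(range(len(order)), 2)]

def mantel(a, b, n_perm=999, seed=0):
    """Return (r, p): Pearson r between off-diagonal entries of two symmetric
    matrices, and a one-sided permutation p-value."""
    identity = list(range(len(a)))
    xa = upper(a, identity)
    r_obs = pearson(xa, upper(b, identity))
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_perm):
        perm = identity[:]
        rng.shuffle(perm)  # permute rows and columns of b together
        if pearson(xa, upper(b, perm)) >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (n_perm + 1)

# Hypothetical symptom co-occurrence matrices for two populations.
a = [[0, 1, 2, 3], [1, 0, 4, 5], [2, 4, 0, 6], [3, 5, 6, 0]]
b = [[0, 3, 5, 7], [3, 0, 9, 11], [5, 9, 0, 13], [7, 11, 13, 0]]
r, p = mantel(a, b)
```

Permuting rows and columns together preserves each matrix's internal structure while breaking the correspondence between the two, which is what makes the permutation null appropriate for matrix-valued data.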

**Figure 7.** Human TAB profiles by diagnosis. (A) Raw symptom rates per diagnosis. (B) Deviations from the human corpus mean; panel A holds absolute magnitudes.

**Figure 8.** Selected descriptive contrasts between human diagnosis profiles and LM component profiles. Anomic vs. Attention-K (cosine 0.808); Broca’s vs. FFN-Gate (0.665); Wernicke’s vs. FFN-Up (0.592). These are exploratory profile similarities, not syndrome mappings.
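
The cosine values in Figure 8 compare symptom-rate vectors. A minimal sketch of that similarity measure follows; the example vectors are hypothetical and do not reproduce the reported numbers.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical two-symptom profiles (rates for symptom 1 and symptom 2).
human_profile = [3.0, 4.0]
lm_profile = [4.0, 3.0]
similarity = cosine(human_profile, lm_profile)
```

Cosine similarity ignores overall magnitude, so two profiles can score high even when one population shows far more symptoms in total. That is consistent with the caption's caution that these are profile similarities and not syndrome mappings.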

**Figure 9.** At 7B, the FFN-vs-attention contrast appears only under multi-layer ablation. FFN-Gate (orange) reaches a repetition-loop symptom rate of 0.41 at 25% severity; attention-K (blue) stays near baseline until 75%.

**Figure 10.** Combined human and lesioned-LM TAB-symptom panel. AphasiaBank (nh = 6,000) and lesioned LMs (nℓ = 112,426). (A, B) Raw symptom rates. (C, D) Deviations from the corresponding corpus mean. Centered panels show within-corpus emphasis only. Colors are shared across all four panels; the legend groups symptoms under their TAB category headers.

**Figure 11.** Lesions shift overall likelihood but preserve the PWA–Control gap. Mean per-token log-probability assigned by Gemma-3-1B-IT to 50 PWA and 50 Control AphasiaBank responses under three lesion conditions, with bootstrap 95% CIs.
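
The quantities in Figure 11 can be sketched as a mean per-token log-probability per response plus a percentile bootstrap interval over responses. The token log-probabilities below are hypothetical, and the percentile method with 2,000 resamples is an assumption, since the page does not say which bootstrap variant the paper uses.

```python
import random
from statistics import mean

def per_token_logprob(token_logprobs):
    """Mean log-probability per token for one response."""
    return sum(token_logprobs) / len(token_logprobs)

def bootstrap_ci(values, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample.
    means = sorted(mean(rng.choices(values, k=n)) for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical token log-probabilities for five responses.
responses = [
    [-1.0, -2.0, -3.0],
    [-0.5, -1.5],
    [-2.0, -2.0, -2.0, -2.0],
    [-1.0, -1.0],
    [-3.0, -1.0],
]
scores = [per_token_logprob(r) for r in responses]
low, high = bootstrap_ci(scores)
```

Averaging within each response first keeps long responses from dominating the group mean. Whether the paper weights by response or by token is not stated on this page.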

**Figure 12.** FFN ablation perturbs the residual stream more than attention-K. Hidden-state divergence from the intact model after ablation. (A) Qwen 3B bridge probe: about 10× larger peak for FFN (red) than attention-K (blue), p = 3 × 10⁻⁶. (B) Gemma 1B bridge probe: about 2× larger…

**Figure 13.** Gate ablation starts from a larger baseline KL perturbation than attention-K in this small bridge probe. Activation patching on Qwen 3B (3 prompts; mean ± std). Gate starts from a higher baseline KL divergence at the inspected layer, while restoration curves are otherwise similar; this is an auxiliary perturbation-size check instead of evidence for a distinct circuit mechanism.
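
The KL perturbation in Figure 13 measures how far an ablated model's output distribution drifts from the intact one. A minimal sketch follows. The direction of the divergence (intact relative to ablated) and the three-token distributions are assumptions, since the caption does not specify them.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in nats for two discrete distributions over the same support.
    Terms with p_i == 0 contribute nothing; q_i must be positive where p_i > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Hypothetical next-token distributions over a three-token vocabulary.
intact = [0.5, 0.25, 0.25]
mild_ablation = [0.4, 0.3, 0.3]
severe_ablation = [0.1, 0.1, 0.8]

kl_mild = kl_divergence(intact, mild_ablation)
kl_severe = kl_divergence(intact, severe_ablation)
```

A larger value means the ablation moved the output distribution further from the intact model. This matches how the caption frames it, as a perturbation-size check and not as evidence about mechanism.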

Original abstract

Aphasias, selective language impairments which can arise from brain damage, reveal the functional organization of human language by providing causal links between affected brain regions and specific symptom profiles. Drawing on this literature, we introduce an aphasia-inspired technique to characterize the emergent functional organization of language models (LMs). We "lesion" (zero-out) model parameters and measure the effects of this intervention against clinical aphasia symptoms, as diagnosed by the Text Aphasia Battery (TAB). When applied to 112,426 outputs from five 1B-scale LMs, the full range of evaluated symptoms surfaces, but in distributions largely distinct from those of humans. Our method uncovers broad symptom-profile differences between attention components (query, key, value, output) and feed-forward components (up, gate, down), with weaker evidence for differences among components within the same mechanism. We also find an effect of depth, where lesions in early layers disproportionately cause syntactic and semantic symptoms while late-middle layers yield higher rates of phonological and fluency deficits. Although some LM lesions induce quantitatively more similar profiles to some human aphasia types than others, qualitative differences in symptom patterns between LMs and humans suggest that aphasia syndromes are heavily influenced by the details of learning and processing rather than being a domain-invariant consequence of disrupted language processing.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Lesioning 1B LMs and scoring outputs with an adapted aphasia battery turns up component and depth differences in symptom rates, but the claim that this shows aphasia syndromes are not domain-invariant rests on untested assumptions about the battery's validity for model text.

The main thing here is that zeroing parameters in several 1B-scale models and running the outputs through a clinical aphasia battery produces measurable symptom profiles that vary by attention versus feed-forward components and by layer depth, yet these profiles differ qualitatively from human aphasia patterns in ways the authors link to specifics of training and processing rather than universal language organization.

The work is new in its scale and in the direct use of a full clinical battery on autoregressive generations rather than simpler metrics like perplexity or next-token accuracy. It does a reasonable job documenting the broad patterns: early layers more often trigger syntactic and semantic symptoms while later-middle layers increase phonological and fluency issues, and attention sub-parts show some separation from feed-forward ones. The 112k output sample helps make those differences look stable on the surface.

The softer spot is exactly the one the stress-test flags. The Text Aphasia Battery was built for human clinical speech, and nothing in the abstract shows explicit validation that its labels map to equivalent functional deficits when applied to token sequences from a lesioned transformer. Model errors can stem from sampling, vocabulary boundaries, or lack of grounded production, so the reported qualitative mismatches could be measurement artifacts rather than evidence against invariance. Without baseline controls or human rater agreement checks, the central interpretation stays provisional.

This is the kind of paper that would interest interpretability researchers looking for causal probes inside LMs and anyone trying to compare artificial and biological language systems. A reader focused on mechanistic differences across model components could extract useful empirical patterns even if the human analogy needs tightening. I would send it for peer review. The measurements themselves are straightforward enough that referees can pressure-test the adaptation of the battery and the strength of the invariance claim.

Referee Report

2 major / 1 minor

Summary. The paper introduces an aphasia-inspired lesioning technique for language models, zeroing out parameters in five 1B-scale LMs and applying the Text Aphasia Battery (TAB) to 112,426 generated outputs to diagnose symptom profiles. It reports that the full range of clinical symptoms appears but in distributions distinct from human aphasias, with broad differences between attention (query/key/value/output) and feed-forward (up/gate/down) components, weaker within-mechanism differences, and depth effects (early layers more syntactic/semantic deficits; late-middle layers more phonological/fluency deficits). The central claim is that qualitative mismatches with human profiles indicate aphasia syndromes are shaped by learning/processing details rather than domain-invariant consequences of disrupted language processing.

Significance. If the TAB application yields validly comparable symptom labels, the work offers a scalable empirical method for mapping LM internal components to functional language deficits, supported by a large sample and clear component- and depth-level contrasts. This could advance interpretability research by bridging clinical linguistics and neural network analysis, while providing evidence that aphasia profiles depend on architectural and training specifics.

major comments (2)

[Abstract] The conclusion that qualitative differences imply aphasia syndromes are heavily influenced by details of learning and processing (rather than domain-invariant) is load-bearing on the assumption that TAB symptom labels applied to autoregressive LM text are functionally analogous to human clinical diagnoses. The manuscript provides no reported validation steps such as expert human concordance rates, explicit mapping of LM error types to clinical criteria, or controls for baseline generation artifacts from tokenization/sampling.
[Methods] It is unclear how the TAB was adapted for model-generated text (zeroed parameters, lack of embodied production) or whether baseline error rates in unlesioned models were subtracted or controlled when computing symptom rates; this directly affects interpretation of the reported component- and depth-level differences as evidence against invariance.

minor comments (1)

[Abstract] The abstract states 'weaker evidence for differences among components within the same mechanism' without specifying the statistical criteria, p-value thresholds, or effect-size measures used to support this assessment.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive comments. These have helped us clarify the methodological details and limitations of our approach. We address the major comments point by point below.

Referee: [Abstract] The conclusion that qualitative differences imply aphasia syndromes are heavily influenced by details of learning and processing (rather than domain-invariant) is load-bearing on the assumption that TAB symptom labels applied to autoregressive LM text are functionally analogous to human clinical diagnoses. The manuscript provides no reported validation steps such as expert human concordance rates, explicit mapping of LM error types to clinical criteria, or controls for baseline generation artifacts from tokenization/sampling.

Authors: We recognize that direct validation of the TAB on LM-generated text, such as through expert concordance, was not performed in the original manuscript. This is a valid concern for the strength of the analogy. In the revised version, we have added a new section discussing the adaptation process and providing an explicit mapping of observed LM errors to TAB criteria, along with illustrative examples. We also clarify that symptom rates were computed as differences from unlesioned baseline models to control for generation artifacts. While we cannot retroactively add human expert ratings without new data collection, we believe these additions strengthen the presentation of our results and support the claim of qualitative differences. revision: partial

Referee: [Methods] It is unclear how the TAB was adapted for model-generated text (zeroed parameters, lack of embodied production) or whether baseline error rates in unlesioned models were subtracted or controlled when computing symptom rates; this directly affects interpretation of the reported component- and depth-level differences as evidence against invariance.

Authors: The adaptation of the TAB for model-generated text is described in the Methods section, where we explain that we applied the battery's textual diagnostic criteria to the outputs, as the symptoms are primarily linguistic and do not depend on embodied aspects. We have expanded this description in the revision to include more details on handling zeroed parameters' effects on generation. Additionally, baseline error rates from unlesioned models were indeed subtracted to isolate lesion-induced symptoms; we will make this control more explicit and discuss its implications for interpreting the component and depth effects as evidence against domain-invariant profiles. revision: yes
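
The baseline control described in the simulated rebuttal amounts to subtracting the unlesioned model's symptom rates from the lesioned model's. The sketch below shows that arithmetic with hypothetical rates. Treating a symptom that is absent from one dictionary as a rate of zero is an assumption.

```python
def lesion_induced(lesioned_rates, baseline_rates):
    """Per-symptom difference: lesioned rate minus unlesioned baseline rate.
    Assumption: a symptom missing from either dictionary has rate 0.0 there."""
    labels = sorted(set(lesioned_rates) | set(baseline_rates))
    return {
        label: lesioned_rates.get(label, 0.0) - baseline_rates.get(label, 0.0)
        for label in labels
    }

# Hypothetical rates: the intact model already drifts off topic sometimes.
baseline = {"off_topic": 0.25}
lesioned = {"off_topic": 0.5, "repetition_loop": 0.75}
delta = lesion_induced(lesioned, baseline)
```

In this toy case half of the lesioned model's off-topic rate is already present in the intact model, so only the remainder would be attributed to the lesion.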

standing simulated objections not resolved

Providing expert human concordance rates for the application of TAB to LM-generated text, as this would require new data collection not present in the original study.

Circularity Check

0 steps flagged

No circularity: empirical symptom measurements after lesions

The paper reports direct empirical results from zeroing parameters in 1B-scale LMs, generating 112k outputs, and scoring them with the Text Aphasia Battery for symptom rates. These rates are then compared distributionally to human aphasia profiles. No equations, fitted parameters, or derivations are invoked that would make any reported symptom frequency or qualitative difference equivalent to its own inputs by construction. The central claim about domain-invariance follows from the observed mismatches rather than from any self-definitional mapping or self-citation chain. The analysis is therefore self-contained as a set of measurements.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the validity of applying a human clinical battery to model text and on treating parameter zeroing as a functional analog of brain damage; no free parameters or invented entities are introduced.

axioms (1)

Domain assumption: The Text Aphasia Battery produces symptom labels on model-generated text that are meaningfully comparable to clinical aphasia diagnoses in humans.
Why it matters: The paper uses TAB scores to draw conclusions about similarity and domain-invariance of aphasia syndromes.

pith-pipeline@v0.9.0 · 5758 in / 1286 out tokens · 135067 ms · 2026-05-20T18:26:24.075544+00:00 · methodology

[1] [1]

Bulletins de la Soci

Broca, Paul , title =. Bulletins de la Soci

work page

[2] [2]

Brains and algorithms partially converge in natural language processing , journal =

Caucheteux, Charlotte and King, Jean-R. Brains and algorithms partially converge in natural language processing , journal =. 2022 , doi =

work page 2022

[3] [3]

Causal Scrubbing: A Method for Rigorously Testing Interpretability Hypotheses , howpublished =

Chan, Lawrence and Garriga-Alonso, Adri. Causal Scrubbing: A Method for Rigorously Testing Interpretability Hypotheses , howpublished =. 2022 , url =

work page 2022

[4] [4]

, title =

Clark, Kevin and Khandelwal, Urvashi and Levy, Omer and Manning, Christopher D. , title =. Proceedings of the 2019. 2019 , doi =

work page 2019

[5] [5]

Cohen, Jacob , title =

work page

[6] [6]

2025 , eprint =

Comanici, Gheorghe and Bieber, Eric and Schaekermann, Mike and Pasupat, Ice and Sachdeva, Noveen and others , title =. 2025 , eprint =

work page 2025

[7] [7]

Towards Automated Circuit Discovery for Mechanistic Interpretability , booktitle =

Conmy, Arthur and Mavor-Parker, Augustine and Lynch, Aengus and Heimersheim, Stefan and Garriga-Alonso, Adri. Towards Automated Circuit Discovery for Mechanistic Interpretability , booktitle =

work page

[8] [8]

, title =

Dronkers, Nina F. , title =. Nature , volume =. 1996 , doi =

work page 1996

[9] [9]

and Wilkins, David P

Dronkers, Nina F. and Wilkins, David P. and. Lesion Analysis of the Brain Areas Involved in Language Comprehension , journal =. 2004 , doi =

work page 2004

[10] [10]

and Ivanova, Maria V

Dronkers, Nina F. and Ivanova, Maria V. and Baldo, Juliana V. , title =. Journal of the International Neuropsychological Society , volume =. 2017 , doi =

work page 2017

[11] [11]

Transformer Circuits Thread , publisher =

Elhage, Nelson and Nanda, Neel and Olsson, Catherine and Henighan, Tom and Joseph, Nicholas and Mann, Ben and Askell, Amanda and Bai, Yuntao and Chen, Anna and Conerly, Tom and others , title =. Transformer Circuits Thread , publisher =. 2021 , url =

work page 2021

[12] [12]

and Regev, Tamar I

Fedorenko, Evelina and Ivanova, Anna A. and Regev, Tamar I. , title =. Nature Reviews Neuroscience , volume =. 2024 , doi =

work page 2024

[13] [13]

, title =

Friederici, Angela D. , title =. Physiological Reviews , volume =. 2011 , doi =

work page 2011

[14] Gauthier, Jon and Hu, Jennifer and Wilcox, Ethan and Qian, Peng and Levy, Roger. Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics: System Demonstrations. 2020.

[15] Geiger, Atticus and Lu, Hanson and Icard, Thomas and Potts, Christopher. Advances in Neural Information Processing Systems.

[16] Gemma 3 Technical Report. arXiv:2503.19786.

[17] Geva, Mor and Caciularu, Avi and Wang, Kevin and Goldberg, Yoav. Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing. 2022.

[18] Goodglass, Harold. 1993.

[19] Goodglass, Harold and Kaplan, Edith and Barresi, Barbara. 2001.

[20] google/gemma-4-. 2026.

[21] Hase, Peter and Xie, Harry and Bansal, Mohit. Advances in Neural Information Processing Systems.

[22] Hickok, Gregory and Poeppel, David. Nature Reviews Neuroscience. 2007.

[23] Hinton, Geoffrey E. and Shallice, Tim. Psychological Review. 1991.

[24] Kertesz, Andrew.

[25] LeCun, Yann and Denker, John S. and Solla, Sara A. Advances in Neural Information Processing Systems. 1990.

[26] MacWhinney, Brian and Fromm, Davida and Forbes, Margaret and Holland, Audrey. Aphasiology. 2011.

[27] Meng, Kevin and Bau, David and Andonian, Alex and Belinkov, Yonatan. Advances in Neural Information Processing Systems.

[28] Llama 3.2: Revolutionizing Edge. September 2024.

[29] Michel, Paul and Levy, Omer and Neubig, Graham. Advances in Neural Information Processing Systems.

[30] Nanda, Neel and Chan, Lawrence and Lieberum, Tom and Smith, Jess and Steinhardt, Jacob. The Eleventh International Conference on Learning Representations.

[31] Olsson, Catherine and Elhage, Nelson and Nanda, Neel and Joseph, Nicholas and DasSarma, Nova and Henighan, Tom and Mann, Ben and Askell, Amanda and Bai, Yuntao and Chen, Anna and Conerly, Tom and Drain, Dawn and Ganguli, Deep and Hatfield-Dodds, Zac and Hernandez, Danny and Johnston, Scott and Jones, Andy and Kernion, Jackson and Lovitt, Liane and Ndousse... Transformer Circuits Thread. 2022.

[32] Plaut, David C. and Shallice, Tim. Cognitive Neuropsychology. 1993.

[33] Qwen2.5 Technical Report. arXiv:2412.15115.

[34] Roll, Nathan and Kries, Jill and Jin, Flora and Wang, Catherine and Finley, Ann Marie and Sumner, Meghan and Shain, Cory and Gwilliams, Laura. arXiv preprint arXiv:2511.20507.

[35] Schrimpf, Martin and Blank, Idan and Tuckute, Greta and Kauf, Carina and Hosseini, Eghbal A. and Kanwisher, Nancy and Tenenbaum, Joshua B. and Fedorenko, Evelina. Proceedings of the National Academy of Sciences. 2021.

[36] Swinburn, Kate and Porter, Gill and Howard, David.

[37] Vaswani, Ashish and Shazeer, Noam and Parmar, Niki and Uszkoreit, Jakob and Jones, Llion and Gomez, Aidan N. and Kaiser, Lukasz and Polosukhin, Illia. Advances in Neural Information Processing Systems.

[38] Voita, Elena and Talbot, David and Moiseev, Fedor and Sennrich, Rico and Titov, Ivan. Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics.

[39] Wang, Alex and Singh, Amanpreet and Michael, Julian and Hill, Felix and Levy, Omer and Bowman, Samuel R. Proceedings of the 2018 ... 2018.

[40] Wang, Chengcheng and Fan, Zhiyu and Han, Zaizhu and Bi, Yanchao and Li, Jixing. bioRxiv. 2025. doi:10.1101/2025.02.22.639416.

[41] Wang, Yifan and Zheng, Jichen and Sun, Jingyuan and Zhang, Yunhao and Ye, Chunyu and Li, Jixing and Zong, Chengqing and Wang, Shaonan. arXiv preprint arXiv:2601.19723.

[42] Warstadt, Alex and Parrish, Alicia and Liu, Haokun and Mohananey, Anhad and Peng, Wei and Wang, Sheng-Fu and Bowman, Samuel R. Transactions of the Association for Computational Linguistics. 2020.

[43] Wernicke, Carl.

[44] Wilson, Stephen M. and Eriksson, Dana K. and Schneck, Sarah M. and Lucanie, Jillian M. PLOS ONE. 2018.

[45] Wilson, Stephen M. and Entrup, Jillian L. and Schneck, Sarah M. and Onuscheck, Caitlin F. and Levy, Deborah F. and Rahman, Maysaa and Willey, Emma and Casilio, Marianne and Yen, Melodie and Brito, Alexandra C. and Kam, Wayneho and Davis, L. Taylor and de Riesthal, Michael and Kirshner, Howard S. Brain. 2023.