Beyond representational alignment with brain-guided language models for robust reasoning

Kai Du; Mingqing Xiao; Zhouchen Lin

arxiv: 2606.11893 · v1 · pith:GFFQDXJEnew · submitted 2026-06-10 · 💻 cs.LG · cs.AI · cs.CL · q-bio.NC


Mingqing Xiao, Kai Du, Zhouchen Lin

Pith reviewed 2026-06-27 10:20 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CL · q-bio.NC

keywords brain-guided language models · representational alignment · deductive reasoning · task-fMRI · neural predictivity · LLM enhancement · reasoning improvement


The pith

Task-evoked brain signals can steer large language models to higher reasoning accuracy.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that large language model representations align partially with task-fMRI signals from brain regions involved in deductive reasoning. It introduces a method to steer those model representations using the joint structure of the model and brain data, applied both at inference time and during fine-tuning. The resulting improvements in reasoning accuracy hold across ten models ranging from 1.5 billion to 72 billion parameters, transfer between reasoning types, and remain after language-only training. A sympathetic reader would care because the work moves beyond measuring alignment to using brain signals as an active source of guidance for model behavior.

Core claim

LLM internal representations explain a substantial fraction of the explainable variance in reasoning-related brain regions at the aggregate level, though predictivity drops within specific reasoning types. Steering model representations along directions induced by the joint structure of model and brain representations, applied at both inference and fine-tuning stages, produces accuracy gains that are orthogonal to language-only supervision and reach up to 13 percent absolute improvement while transferring across reasoning types.
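
The predictivity half of this claim can be made concrete with a small sketch. The abstract does not formalize the neural-predictivity metric, so everything below is an assumption chosen for illustration: a ridge-regression encoding model, a held-out R^2 score per voxel, and division by a per-voxel noise ceiling (the `ceiling` argument is hypothetical). The data are synthetic and stand in for model hidden states and fMRI responses.

```python
import numpy as np

def ridge_fit(X, Y, lam):
    """Closed-form ridge weights mapping X (n, d) to Y (n, v)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

def neural_predictivity(X_train, Y_train, X_test, Y_test, ceiling, lam=1.0):
    """Held-out R^2 per voxel, divided by a per-voxel noise ceiling.

    ceiling is a hypothetical array of shape (v,) giving the fraction of
    variance that is explainable at all (values in (0, 1]).
    """
    # Center with training statistics so no intercept term is needed.
    x_mu, y_mu = X_train.mean(0), Y_train.mean(0)
    W = ridge_fit(X_train - x_mu, Y_train - y_mu, lam)
    pred = (X_test - x_mu) @ W + y_mu
    ss_res = ((Y_test - pred) ** 2).sum(0)
    ss_tot = ((Y_test - Y_test.mean(0)) ** 2).sum(0)
    return (1.0 - ss_res / ss_tot) / ceiling

# Synthetic stand-ins: 200 stimuli, 8 model features, 5 voxels.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
Y = X @ rng.normal(size=(8, 5)) + 0.5 * rng.normal(size=(200, 5))
scores = neural_predictivity(X[:150], Y[:150], X[150:], Y[150:],
                             ceiling=np.ones(5))
```

Computing `scores` once over all trials and once within each reasoning type would mirror the aggregate versus within-type comparison described above, under these assumptions.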

What carries the argument

The brain-guided steering procedure that adjusts LLM representations using directions from the joint model-brain representation structure.
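
A minimal sketch of what such a procedure could look like follows. The paper's actual construction is not given in the abstract; this version assumes the "joint structure" is the model-brain cross-covariance, takes its top left singular vectors as directions in model space, and steers by amplifying the hidden state's component in that subspace. The function names `joint_directions` and `steer`, the strength `alpha`, and the synthetic data are all hypothetical.

```python
import numpy as np

def joint_directions(H, B, k=1):
    """Top-k orthonormal model-space directions from the cross-covariance.

    H: (n, d) hidden states; B: (n, v) brain responses to the same n stimuli.
    """
    Hc = H - H.mean(0)
    Bc = B - B.mean(0)
    C = Hc.T @ Bc / (H.shape[0] - 1)           # (d, v) cross-covariance
    U, _, _ = np.linalg.svd(C, full_matrices=False)
    return U[:, :k]

def steer(h, U, alpha):
    """Amplify the part of h lying in span(U); the rest is left unchanged.

    Using the projection avoids the sign ambiguity of singular vectors.
    """
    return h + alpha * (h @ U) @ U.T

# Synthetic data with one shared latent variable z driving both H and B.
rng = np.random.default_rng(0)
n, d, v = 500, 16, 10
a = rng.normal(size=d); a /= np.linalg.norm(a)  # planted model direction
b = rng.normal(size=v); b /= np.linalg.norm(b)  # planted brain pattern
z = rng.normal(size=(n, 1))
H = 3.0 * z * a + rng.normal(size=(n, d))
B = 2.0 * z * b + rng.normal(size=(n, v))

U = joint_directions(H, B, k=1)
h = H[:5]
steered = steer(h, U, alpha=0.5)
```

In a real model the same `steer` call would sit inside a forward hook on a chosen layer at inference time; which layer and which `alpha` to use are open choices that this sketch does not settle.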

If this is right

Reasoning accuracy rises across models from 1.5B to 72B parameters.
Gains transfer to unseen reasoning types.
Improvements remain after additional language-only training.
The method works at both inference time and during fine-tuning.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Brain recordings may contain reasoning structure that text corpora do not fully encode.
The same steering approach could be tested on other cognitive tasks such as planning or analogy.
Future work might examine whether the method reduces the amount of text data needed to reach a given performance level.

Load-bearing premise

The joint structure of model and brain representations supplies directions that causally improve reasoning performance rather than reflecting task-specific correlations already captured by language training.

What would settle it

If the accuracy gains disappear when brain signals are replaced by random vectors of matching dimension while the rest of the steering procedure stays identical, the claim that brain signals supply useful guidance would be falsified.
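
The control described here can be sketched as follows, under loud assumptions. The real test would compare reasoning accuracy; this toy instead plants a known latent direction in synthetic "model" data and checks how well the recovered direction aligns with it, once with synthetic "brain" data and once with matched random vectors. `top_joint_direction` and `matched_random_signals` are hypothetical helpers, and the cross-covariance SVD is an assumed stand-in for the paper's procedure.

```python
import numpy as np

def top_joint_direction(H, B):
    """Leading left singular vector of the centered cross-product H^T B."""
    Hc, Bc = H - H.mean(0), B - B.mean(0)
    U, _, _ = np.linalg.svd(Hc.T @ Bc, full_matrices=False)
    return U[:, 0]

def matched_random_signals(B, rng):
    """Gaussian noise with the same shape and per-column mean and std as B."""
    return rng.normal(loc=B.mean(0), scale=B.std(0), size=B.shape)

rng = np.random.default_rng(0)
n, d, v = 2000, 16, 10
a = rng.normal(size=d); a /= np.linalg.norm(a)  # planted model direction
b = rng.normal(size=v); b /= np.linalg.norm(b)  # planted brain pattern
z = rng.normal(size=(n, 1))                     # shared latent variable
H = z * a + rng.normal(size=(n, d))
B = 2.0 * z * b + rng.normal(size=(n, v))

# Identical procedure, two signal sources.
align_real = abs(float(top_joint_direction(H, B) @ a))
align_rand = abs(float(top_joint_direction(H, matched_random_signals(B, rng)) @ a))
```

One design point worth noting: a direction derived from random signals can still pick up high-variance axes of the model representations, so the control is informative only if the random-vector arm is run through exactly the same pipeline and scored on the same downstream measure.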

Original abstract

The correspondence between large language models (LLMs) and the neural mechanisms underlying human higher-order cognition remains insufficiently characterized. Given that language and reasoning in the human brain appear dissociable, an open question is whether LLMs align with neural signals from reasoning-related regions and whether such signals can improve them. Here, focusing on deductive reasoning, we show that LLM internal representations are not only partially aligned with task-fMRI activity but can also be directly enhanced by these signals. Using a neural-predictivity metric, we find that LLMs explain a substantial fraction of the explainable variance in reasoning-related regions at the aggregate level, whereas predictivity within specific reasoning types is lower, indicating both alignment and divergence. Building on this, we propose a brain-guided framework: we steer model representations along directions induced by the joint structure of model and brain representations, applying intervention at inference and fine-tuning during training. We demonstrate that task-evoked brain signals can directly enhance LLM reasoning, yielding gains orthogonal to language-only supervision across 10 LLMs (1.5B-72B), with transfer across reasoning types and up to 13% absolute accuracy gain. Our results advance LLM-brain correspondences from correlation to guidance, establishing a brain-signal-driven pathway toward more robust and cognitively aligned AI.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

The paper shows partial LLM-brain alignment in reasoning regions and proposes steering via joint directions at inference and fine-tuning, but the orthogonality to language supervision is not yet demonstrated by the controls described.


The main thing here is a concrete attempt to turn measured alignment between LLM representations and task-evoked fMRI into an actual intervention. They report aggregate predictivity of LLM hidden states for reasoning-area BOLD signals, note that it drops when broken down by specific reasoning subtypes, then extract directions from the joint structure and apply them both at inference and during fine-tuning. The headline result is up to 13% absolute accuracy lift across ten models (1.5B–72B) plus transfer to other reasoning tasks, presented as orthogonal to language-only training.

That combination of inference-time and training-time steering on deductive reasoning is not in the prior alignment papers cited in the abstract, so the framework itself is new. Testing the same intervention across a wide size range and checking transfer are also positive steps.

The soft spot is the missing control for whether the gains are truly coming from brain-specific structure rather than task-correlated signals already implicit in the problems. Because the fMRI is task-evoked from reasoning regions, the joint directions could simply recover the deductive structure that language supervision already sees. The abstract does not describe an ablation that aligns to task labels alone or to non-brain task signals, so the orthogonality claim cannot be evaluated yet. The lower within-type predictivity is reported honestly, but it does not substitute for that control.

This is for readers working on additional training signals for reasoning robustness or on LLM-brain correspondence. The idea is worth a serious referee if the full methods include the necessary ablations and statistical checks; without them the central causal claim stays untested.

Referee Report

2 major / 2 minor

Summary. The paper claims that LLM internal representations show partial alignment with task-evoked fMRI signals from reasoning-related brain regions (higher at aggregate level, lower within specific reasoning types), and that steering model representations along directions from the joint model-brain covariance structure—applied at inference and during fine-tuning—directly enhances deductive reasoning performance. It reports gains of up to 13% absolute accuracy across 10 LLMs (1.5B–72B parameters), with transfer across reasoning types, and asserts these improvements are orthogonal to language-only supervision.

Significance. If the central results on orthogonality and causal improvement hold after appropriate controls, the work would advance the field by shifting LLM-brain studies from correlational alignment metrics to actionable guidance signals for improving reasoning robustness. The scale (multiple model sizes) and reported transfer are potential strengths, though the absence of explicit task-label controls limits immediate impact.

major comments (2)

[Abstract] Abstract and methods: the claim that gains are 'orthogonal to language-only supervision' is load-bearing for the central contribution, yet the manuscript provides no explicit control experiment aligning to task labels alone or to non-brain task-evoked signals; without this, it remains possible that the joint directions recover task structure already implicit in the deductive problems rather than supplying unique brain-derived information.
[Results] Results section on performance gains: the reported 13% absolute accuracy improvement and cross-reasoning-type transfer require demonstration that the steering vectors remain effective after regressing out task-correlated components from the fMRI data; the lower within-type predictivity noted in the abstract increases the risk that gains reflect task-specific correlations rather than joint representational structure.
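
The regression requested in the second major comment could be set up roughly as below. This is a sketch under assumptions, not the authors' analysis: "task-correlated components" are taken to be per-reasoning-type mean responses, removed by least-squares residualization against a one-hot design matrix. The helper `regress_out` and the synthetic data are hypothetical.

```python
import numpy as np

def regress_out(B, labels):
    """Residualize brain responses B (n, v) against one-hot task labels.

    With a one-hot design, the least-squares fit equals the per-label mean,
    so the residuals have zero mean within every label.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    D = (labels[:, None] == classes[None, :]).astype(float)  # (n, c) design
    beta, *_ = np.linalg.lstsq(D, B, rcond=None)
    return B - D @ beta

# Synthetic example: 3 reasoning types, 40 trials each, 6 voxels.
rng = np.random.default_rng(0)
labels = np.repeat(np.arange(3), 40)
offsets = rng.normal(size=(3, 6))               # type-specific mean response
B = offsets[labels] + rng.normal(size=(120, 6))
resid = regress_out(B, labels)
```

Steering directions would then be re-derived from `resid` instead of `B`; if the accuracy gains survive, they cannot be attributed to type-level means alone. Richer task regressors (difficulty, premise count) would need additional design columns.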

minor comments (2)

[Methods] Notation for the neural-predictivity metric and the precise definition of the joint covariance directions should be clarified with an equation or pseudocode to allow replication.
[Figures] Figure captions for alignment and performance plots should include error bars, number of subjects, and statistical tests used.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below and will incorporate additional controls to strengthen the evidence for orthogonality of the brain-derived gains.


Referee: [Abstract] Abstract and methods: the claim that gains are 'orthogonal to language-only supervision' is load-bearing for the central contribution, yet the manuscript provides no explicit control experiment aligning to task labels alone or to non-brain task-evoked signals; without this, it remains possible that the joint directions recover task structure already implicit in the deductive problems rather than supplying unique brain-derived information.

Authors: We agree that the current language-only supervision baselines do not fully rule out recovery of implicit task structure. An explicit control deriving directions from task labels alone or from non-brain task-evoked signals would provide stronger evidence. We will add such controls in the revision, including a task-label-only steering baseline and, where feasible, comparison to non-neural task signals. revision: yes
Referee: [Results] Results section on performance gains: the reported 13% absolute accuracy improvement and cross-reasoning-type transfer require demonstration that the steering vectors remain effective after regressing out task-correlated components from the fMRI data; the lower within-type predictivity noted in the abstract increases the risk that gains reflect task-specific correlations rather than joint representational structure.

Authors: The observed transfer across reasoning types already suggests the effect is not limited to within-type task correlations. Nevertheless, we acknowledge the need for an explicit regression of task-correlated components from the fMRI data prior to deriving the joint directions. We will perform and report this regression analysis in the revised results to confirm that performance gains persist. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical steering gains reported without reduction to input fits by construction


The abstract describes alignment measurement via neural-predictivity, followed by a steering intervention along joint model-brain directions applied at inference and fine-tuning, with reported accuracy gains on reasoning tasks. No equations or definitions are provided that equate the steering directions or gains to the brain data by construction, nor is any prediction shown to be a direct rename of a fitted parameter. The orthogonality claim is presented as an empirical outcome rather than a definitional identity. The derivation chain remains self-contained against external benchmarks of accuracy improvement.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; the neural-predictivity metric and joint-structure directions are referenced but not formalized.



Reference graph

Works this paper leans on

70 extracted references · 10 canonical work pages · 9 internal anchors

[1]

Emergent abilities of large language models.Transactions on Machine Learning Research, 2022

Jason Wei, Yi Tay, Rishi Bommasani, Colin Raffel, Barret Zoph, Sebas- tian Borgeaud, Dani Yogatama, Maarten Bosma, Denny Zhou, Donald Metzler, et al. Emergent abilities of large language models.Transactions on Machine Learning Research, 2022

2022
[2]

Chain-of-thought prompting elicits reasoning in large language models

Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Fei Xia, Ed Chi, Quoc V Le, Denny Zhou, et al. Chain-of-thought prompting elicits reasoning in large language models. InAdvances in Neural Information Processing Systems (NeurIPS), 2022

2022
[3]

Emergent analogical reasoning in large language models.Nature Human Behaviour, 7(9):1526– 1541, 2023

Taylor Webb, Keith J Holyoak, and Hongjing Lu. Emergent analogical reasoning in large language models.Nature Human Behaviour, 7(9):1526– 1541, 2023

2023
[4]

Testing the general deductive reasoning capacity of large language models using ood exam- ples

Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Mehran Kazemi, Najoung Kim, and He He. Testing the general deductive reasoning capacity of large language models using ood exam- ples. InAdvances in Neural Information Processing Systems (NeurIPS), 2023

2023
[5]

Mit Press, 1994

Lance J Rips.The psychology of proof: Deductive reasoning in human thinking. Mit Press, 1994

1994
[6]

The brain network for deductive reasoning: a quantitative meta-analysis of 28 neuroimaging studies.Journal of Cognitive Neuroscience, 23(11):3483–3497, 2011

J´ erˆ ome Prado, Angad Chadha, and James R Booth. The brain network for deductive reasoning: a quantitative meta-analysis of 28 neuroimaging studies.Journal of Cognitive Neuroscience, 23(11):3483–3497, 2011

2011
[7]

Oxford university press, 1994

Alfred Tarski.Introduction to Logic and to the Methodology of the Deductive Sciences, volume 24. Oxford university press, 1994

1994
[8]

The boundaries of language and thought in deductive inference.Proceedings of the National Academy of Sciences (PNAS), 106(30):12554–12559, 2009

Martin M Monti, Lawrence M Parsons, and Daniel N Osherson. The boundaries of language and thought in deductive inference.Proceedings of the National Academy of Sciences (PNAS), 106(30):12554–12559, 2009

2009
[9]

Lan- guage is primarily a tool for communication rather than thought.Nature, 630(8017):575–586, 2024

Evelina Fedorenko, Steven T Piantadosi, and Edward AF Gibson. Lan- guage is primarily a tool for communication rather than thought.Nature, 630(8017):575–586, 2024

2024
[10]

Evaluating the deductive competence of large language models

S Seals and Valerie Shalin. Evaluating the deductive competence of large language models. InNorth American Chapter of the Association for Com- putational Linguistics: Human Language Technologies (NAACL-HLT), 2024

2024
[11]

The debate over understanding in ai’s large language models.Proceedings of the National Academy of Sciences (PNAS), 120(13):e2215907120, 2023

Melanie Mitchell and David C Krakauer. The debate over understanding in ai’s large language models.Proceedings of the National Academy of Sciences (PNAS), 120(13):e2215907120, 2023

2023
[12]

Language models represent space and time

Wes Gurnee and Max Tegmark. Language models represent space and time. InInternational Conference on Learning Representations (ICLR), 2024

2024
[13]

Dissociating language and thought in large language models.Trends in Cognitive Sciences, 28(6):517–540, 2024

Kyle Mahowald, Anna A Ivanova, Idan A Blank, Nancy Kanwisher, Joshua B Tenenbaum, and Evelina Fedorenko. Dissociating language and thought in large language models.Trends in Cognitive Sciences, 28(6):517–540, 2024

2024
[14]

Premise REFERENCES 59 order matters in reasoning with large language models

Xinyun Chen, Ryan Andrew Chi, Xuezhi Wang, and Denny Zhou. Premise REFERENCES 59 order matters in reasoning with large language models. InInternational Conference on Machine Learning (ICML), 2024

2024
[15]

Frontier llms still struggle with simple reasoning tasks.arXiv preprint arXiv:2507.07313, 2025

Alan Malek, Jiawei Ge, Nevena Lazic, Chi Jin, Andr´ as Gy¨ orgy, and Csaba Szepesv´ ari. Frontier llms still struggle with simple reasoning tasks.arXiv preprint arXiv:2507.07313, 2025

work page arXiv 2025
[16]

Brain-score: Which artificial neural network for object recognition is most brain-like?BioRxiv, page 407007, 2018

Martin Schrimpf, Jonas Kubilius, Ha Hong, Najib J Majaj, Rishi Rajal- ingham, Elias B Issa, Kohitij Kar, Pouya Bashivan, Jonathan Prescott- Roy, Franziska Geiger, et al. Brain-score: Which artificial neural network for object recognition is most brain-like?BioRxiv, page 407007, 2018

2018
[17]

Performance-optimized hierarchical models predict neural responses in higher visual cortex.Proceedings of the National Academy of Sciences (PNAS), 111(23):8619–8624, 2014

Daniel LK Yamins, Ha Hong, Charles F Cadieu, Ethan A Solomon, Dar- ren Seibert, and James J DiCarlo. Performance-optimized hierarchical models predict neural responses in higher visual cortex.Proceedings of the National Academy of Sciences (PNAS), 111(23):8619–8624, 2014

2014
[18]

The neural architecture of language: Integrative modeling con- verges on predictive processing.Proceedings of the National Academy of Sciences (PNAS), 118(45):e2105646118, 2021

Martin Schrimpf, Idan Asher Blank, Greta Tuckute, Carina Kauf, Egh- bal A Hosseini, Nancy Kanwisher, Joshua B Tenenbaum, and Evelina Fedorenko. The neural architecture of language: Integrative modeling con- verges on predictive processing.Proceedings of the National Academy of Sciences (PNAS), 118(45):e2105646118, 2021

2021
[19]

Unsupervised neural network models of the ventral visual stream.Proceedings of the National Academy of Sciences (PNAS), 118(3):e2014196118, 2021

Chengxu Zhuang, Siming Yan, Aran Nayebi, Martin Schrimpf, Michael C Frank, James J DiCarlo, and Daniel LK Yamins. Unsupervised neural network models of the ventral visual stream.Proceedings of the National Academy of Sciences (PNAS), 118(3):e2014196118, 2021

2021
[20]

Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset.Nature Machine Intelligence, 5(12):1415–1426, 2023

Aria Y Wang, Kendrick Kay, Thomas Naselaris, Michael J Tarr, and Leila Wehbe. Better models of human high-level visual cortex emerge from natural language supervision with a large and diverse dataset.Nature Machine Intelligence, 5(12):1415–1426, 2023

2023
[21]

Brains and algorithms par- tially converge in natural language processing.Communications Biology, 5(1):134, 2022

Charlotte Caucheteux and Jean-R´ emi King. Brains and algorithms par- tially converge in natural language processing.Communications Biology, 5(1):134, 2022

2022
[22]

Shared computational principles for language processing in humans and deep language models.Nature Neuroscience, 25(3):369–380, 2022

Ariel Goldstein, Zaid Zada, Eliav Buchnik, Mariano Schain, Amy Price, Bobbi Aubrey, Samuel A Nastase, Amir Feder, Dotan Emanuel, Alon Cohen, et al. Shared computational principles for language processing in humans and deep language models.Nature Neuroscience, 25(3):369–380, 2022

2022
[23]

Shared functional specialization in transformer-based language models and the human brain.Nature Communications, 15(1):5523, 2024

Sreejan Kumar, Theodore R Sumers, Takateru Yamakoshi, Ariel Gold- stein, Uri Hasson, Kenneth A Norman, Thomas L Griffiths, Robert D Hawkins, and Samuel A Nastase. Shared functional specialization in transformer-based language models and the human brain.Nature Communications, 15(1):5523, 2024

2024
[24]

Ariel Goldstein, Haocheng Wang, Leonard Niekerken, Mariano Schain, Zaid Zada, Bobbi Aubrey, Tom Sheffer, Samuel A Nastase, Harshvard- han Gazula, Aditi Singh, et al. A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language process- ing in everyday conversations.Nature Human Behaviour, 9(5):1041–1055, 2025. 60 ...

2025
[25]

Human- like object concept representations emerge naturally in multimodal large language models.Nature Machine Intelligence, 7(6):860–875, 2025

Changde Du, Kaicheng Fu, Bincheng Wen, Yi Sun, Jie Peng, Wei Wei, Ying Gao, Shengpei Wang, Chuncheng Zhang, Jinpeng Li, et al. Human- like object concept representations emerge naturally in multimodal large language models.Nature Machine Intelligence, 7(6):860–875, 2025

2025
[26]

Driv- ing and suppressing the human language network using large language models.Nature Human Behaviour, 8(3):544–561, 2024

Greta Tuckute, Aalok Sathe, Shashank Srikant, Maya Taliaferro, Mingye Wang, Martin Schrimpf, Kendrick Kay, and Evelina Fedorenko. Driv- ing and suppressing the human language network using large language models.Nature Human Behaviour, 8(3):544–561, 2024

2024
[27]

Learning from brains how to regularize machines

Zhe Li, Wieland Brendel, Edgar Walker, Erick Cobos, Taliah Muham- mad, Jacob Reimer, Matthias Bethge, Fabian Sinz, Zachary Pitkow, and Andreas Tolias. Learning from brains how to regularize machines. Advances in Neural Information Processing Systems (NeurIPS), 2019

2019
[28]

Aligning model and macaque inferior temporal cortex representations improves model-to-human behavioral alignment and adversarial robustness

Joel Dapello, Kohitij Kar, Martin Schrimpf, Robert Baldwin Geary, Michael Ferguson, David Daniel Cox, and James J DiCarlo. Aligning model and macaque inferior temporal cortex representations improves model-to-human behavioral alignment and adversarial robustness. In International Conference on Learning Representations (ICLR), 2023

2023
[29]

Improving seman- tic understanding in speech language models via brain-tuning

Omer Moussa, Dietrich Klakow, and Mariya Toneva. Improving seman- tic understanding in speech language models via brain-tuning. In International Conference on Learning Representations (ICLR), 2025

2025
[30]

Qwen2 Technical Report

An Yang, Baosong Yang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Zhou, Chengpeng Li, Chengyuan Li, Dayiheng Liu, Fei Huang, et al. Qwen2 technical report.arXiv preprint arXiv:2407.10671, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[31]

Qwen3 Technical Report

An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report.arXiv preprint arXiv:2505.09388, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[32]

Llama 2: Open Foundation and Fine-Tuned Chat Models

Hugo Touvron, Louis Martin, Kevin Stone, Peter Albert, Amjad Alma- hairi, Yasmine Babaei, Nikolay Bashlykov, Soumya Batra, Prajjwal Bhargava, Shruti Bhosale, et al. Llama 2: Open foundation and fine-tuned chat models.arXiv preprint arXiv:2307.09288, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[33]

The Llama 3 Herd of Models

Abhimanyu Dubey, Abhinav Jauhri, Abhinav Pandey, Abhishek Kadian, Ahmad Al-Dahle, Aiesha Letman, Akhil Mathur, Alan Schelten, Amy Yang, Angela Fan, et al. The llama 3 herd of models.arXiv preprint arXiv:2407.21783, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[34]

Mistral 7B

Albert Q Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lucile Saulnier, et al. Mistral 7b.arXiv preprint arXiv:2310.06825, 2023

work page internal anchor Pith review Pith/arXiv arXiv 2023
[35]

Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs

Abdelrahman Abouelenin, Atabak Ashfaq, Adam Atkinson, Hany Awadalla, Nguyen Bach, Jianmin Bao, Alon Benhaim, Martin Cai, Vishrav Chaudhary, Congcong Chen, et al. Phi-4-mini technical report: Compact yet powerful multimodal language models via mixture-of-loras. arXiv preprint arXiv:2503.01743, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[36]

Gemma 2: Improving Open Language Models at a Practical Size

Gemma Team, Morgane Riviere, Shreya Pathak, Pier Giuseppe Sessa, Cassidy Hardin, Surya Bhupatiraju, L´ eonard Hussenot, Thomas Mesnard, REFERENCES 61 Bobak Shahriari, Alexandre Ram´ e, et al. Gemma 2: Improving open language models at a practical size.arXiv preprint arXiv:2408.00118, 2024

work page internal anchor Pith review Pith/arXiv arXiv 2024
[37]

The llm language network: A neuroscientific approach for identifying causally task-relevant units

Badr AlKhamissi, Greta Tuckute, Antoine Bosselut, and Martin Schrimpf. The llm language network: A neuroscientific approach for identifying causally task-relevant units. InAnnual Meeting of the Association for Computational Linguistics (ACL), 2025

2025
[38]

From language to cognition: How llms outgrow the human language network

Badr AlKhamissi, Greta Tuckute, Yingtian Tang, Taha Osama A Binhu- raib, Antoine Bosselut, and Martin Schrimpf. From language to cognition: How llms outgrow the human language network. InAnnual Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025

2025
[39]

The multiple-demand (md) system of the primate brain: mental programs for intelligent behaviour.Trends in Cognitive Sciences, 14(4):172–179, 2010

John Duncan. The multiple-demand (md) system of the primate brain: mental programs for intelligent behaviour.Trends in Cognitive Sciences, 14(4):172–179, 2010

2010
[40]

Gomez, Lukasz Kaiser, and Illia Polosukhin

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. Attention is all you need. InAdvances in Neural Information Processing Systems (NeurIPS), 2017

2017
[41]

Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, C

Adly Templeton, Tom Conerly, Jonathan Marcus, Jack Lindsey, Tren- ton Bricken, Brian Chen, Adam Pearce, Craig Citro, Emmanuel Ameisen, Andy Jones, Hoagy Cunningham, Nicholas L. Turner, Callum McDougall, Monte MacDiarmid, Alex Tamkin, Esin Durmus, Tristan Hume, Francesco Mosconi, C. Daniel Freeman, Theodore R. Sumers, Edward Rees, Joshua Batson, Adam Jermy...

2024
[42]

Improv- ing reasoning performance in large language models via representation engineering

Bertram Højer, Oliver Simon Jarvis, and Stefan Heinrich. Improv- ing reasoning performance in large language models via representation engineering. InInternational Conference on Learning Representations (ICLR), 2025

2025
[43]

Analysing the gen- eralisation and reliability of steering vectors

Daniel Tan, David Chanin, Aengus Lynch, Brooks Paige, Dimitrios Kanoulas, Adri` a Garriga-Alonso, and Robert Kirk. Analysing the gen- eralisation and reliability of steering vectors. InAdvances in Neural Information Processing Systems (NeurIPS), 2024

2024
[44]

Steering off course: Reliability challenges in steering language models

Patrick Queiroz Da Silva, Hari Sethuraman, Dheeraj Rajagopal, Han- naneh Hajishirzi, and Sachin Kumar. Steering off course: Reliability challenges in steering language models. InAnnual Meeting of the Association for Computational Linguistics (ACL), 2025

2025
[45]

DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Daya Guo, Dejian Yang, Haowei Zhang, Junxiao Song, Ruoyu Zhang, Runxin Xu, Qihao Zhu, Shirong Ma, Peiyi Wang, Xiao Bi, et al. Deepseek- r1: Incentivizing reasoning capability in llms via reinforcement learning. arXiv preprint arXiv:2501.12948, 2025

work page internal anchor Pith review Pith/arXiv arXiv 2025
[46]

62 REFERENCES Folio: Natural language reasoning with first-order logic

Simeng Han, Hailey Schoelkopf, Yilun Zhao, Zhenting Qi, Martin Riddell, Wenfei Zhou, James Coady, David Peng, Yujie Qiao, Luke Benson, et al. 62 REFERENCES Folio: Natural language reasoning with first-order logic. InAnnual Con- ference on Empirical Methods in Natural Language Processing (EMNLP), 2024

2024
[47]

The wu-minn human connectome project: an overview.Neuroimage, 80:62–79, 2013

David C Van Essen, Stephen M Smith, Deanna M Barch, Timothy EJ Behrens, Essa Yacoub, Kamil Ugurbil, Wu-Minn HCP Consortium, et al. The wu-minn human connectome project: an overview.Neuroimage, 80:62–79, 2013

2013
[48]

Emergent world representations: Explor- ing a sequence model trained on a synthetic task

Kenneth Li, Aspen K Hopkins, David Bau, Fernanda Vi´ egas, Hanspeter Pfister, and Martin Wattenberg. Emergent world representations: Explor- ing a sequence model trained on a synthetic task. InInternational Conference on Learning Representations (ICLR), 2023

2023
[49]

A foundation model to predict and capture human cognition.Nature, 644(8078):1002–1009, 2025

Marcel Binz, Elif Akata, Matthias Bethge, Franziska Br¨ andle, Fred Call- away, Julian Coda-Forno, Peter Dayan, Can Demircan, Maria K Eckstein, No´ emi´Eltet˝ o, et al. A foundation model to predict and capture human cognition.Nature, 644(8078):1002–1009, 2025

2025
[50]

A neuroimaging dataset of deductive reasoning in school-aged children.Data in Brief, 33:106405, 2020

MN Lytle, J Prado, and JR Booth. A neuroimaging dataset of deductive reasoning in school-aged children.Data in Brief, 33:106405, 2020

2020
[51]

Towards spike-based machine intelligence with neuromorphic computing.Nature, 575(7784):607–617, 2019

Kaushik Roy, Akhilesh Jaiswal, and Priyadarshini Panda. Towards spike-based machine intelligence with neuromorphic computing.Nature, 575(7784):607–617, 2019

2019
[52]

A long short-term memory for AI applications in spike-based neuromorphic hardware.Nature Machine Intelligence, 4(5):467–479, 2022

Arjun Rao, Philipp Plank, Andreas Wild, and Wolfgang Maass. A long short-term memory for AI applications in spike-based neuromorphic hardware.Nature Machine Intelligence, 4(5):467–479, 2022

2022
[53]

Closed- form continuous-time neural networks.Nature Machine Intelligence, 4(11):992–1003, 2022

Ramin Hasani, Mathias Lechner, Alexander Amini, Lucas Liebenwein, Aaron Ray, Max Tschaikowski, Gerald Teschl, and Daniela Rus. Closed- form continuous-time neural networks.Nature Machine Intelligence, 4(11):992–1003, 2022

2022
[54]

Biological under- pinnings for lifelong learning machines.Nature Machine Intelligence, 4(3):196–210, 2022

Dhireesha Kudithipudi, Mario Aguilar-Simon, Jonathan Babb, Maxim Bazhenov, Douglas Blackiston, Josh Bongard, Andrew P Brna, Suraj Chakravarthi Raja, Nick Cheney, Jeff Clune, et al. Biological under- pinnings for lifelong learning machines.Nature Machine Intelligence, 4(3):196–210, 2022

2022
[55]

Incorporating neuro-inspired adaptability for con- tinual learning in artificial intelligence.Nature Machine Intelligence, 5(12):1356–1368, 2023

Liyuan Wang, Xingxing Zhang, Qian Li, Mingtian Zhang, Hang Su, Jun Zhu, and Yi Zhong. Incorporating neuro-inspired adaptability for con- tinual learning in artificial intelligence.Nature Machine Intelligence, 5(12):1356–1368, 2023

2023
[56]

Mingqing Xiao, Qingyan Meng, Zongpeng Zhang, Di He, and Zhouchen Lin. Hebbian learning based orthogonal projection for continual learning of spiking neural networks. In International Conference on Learning Representations (ICLR), 2024
[57]

Jascha Achterberg, Danyal Akarca, DJ Strouse, John Duncan, and Duncan E Astle. Spatially embedded recurrent neural networks reveal widespread links between structural and functional neuroscience findings. Nature Machine Intelligence, 5(12):1369–1381, 2023
[58]

Minyoung Huh, Brian Cheung, Tongzhou Wang, and Phillip Isola. Position: The platonic representation hypothesis. In International Conference on Machine Learning (ICML), 2024
[59]

Dan Schwartz, Mariya Toneva, and Leila Wehbe. Inducing brain-relevant bias in natural language processing models. In Advances in Neural Information Processing Systems (NeurIPS), 2019
[60]

James WA Strachan, Dalila Albergo, Giulia Borghini, Oriana Pansardi, Eugenio Scaliti, Saurabh Gupta, Krati Saxena, Alessandro Rufo, Stefano Panzeri, Guido Manzi, et al. Testing theory of mind in large language models and humans. Nature Human Behaviour, 8(7):1285–1295, 2024
[61]

Wentao Zhu, Zhining Zhang, and Yizhou Wang. Language models represent beliefs of self and others. In International Conference on Machine Learning (ICML), 2024
[62]

Jacob S Prince, Ian Charest, Jan W Kurzawski, John A Pyles, Michael J Tarr, and Kendrick N Kay. Improving the accuracy of single-trial fMRI response estimates using GLMsingle. eLife, 11:e77599, 2022
[63]

Evelina Fedorenko, Po-Jang Hsieh, Alfonso Nieto-Castañón, Susan Whitfield-Gabrieli, and Nancy Kanwisher. New method for fMRI investigations of language: defining ROIs functionally in individual subjects. Journal of Neurophysiology, 104(2):1177–1194, 2010
[64]

Yifan Zhang, Jingqin Yang, Yang Yuan, and Andrew Chi-Chih Yao. Cumulative reasoning with large language models. arXiv preprint arXiv:2308.04371, 2023
[65]

Edward J Hu, Yelong Shen, Phillip Wallis, Zeyuan Allen-Zhu, Yuanzhi Li, Shean Wang, Lu Wang, Weizhu Chen, et al. LoRA: Low-rank adaptation of large language models. In International Conference on Learning Representations (ICLR), 2022
[66]

Mingqing Xiao, Kai Du, and Zhouchen Lin. https://doi.org/10.5281/zenodo.19536182, 2026
[67]

Jerome Prado, Rachna Mutreja, and James R Booth. Fractionating the neural substrates of transitive reasoning: Task-dependent contributions of spatial and verbal representations. Cerebral Cortex, 23(3):499–507, 2013
[68]

Nathan Cloos, Moufan Li, Markus Siegel, Scott L Brincat, Earl K Miller, Guangyu Robert Yang, and Christopher J Cueva. Differentiable optimization of similarity scores between models and brains. In International Conference on Learning Representations (ICLR), 2025
[69]

Kenneth Li, Oam Patel, Fernanda Viégas, Hanspeter Pfister, and Martin Wattenberg. Inference-time intervention: Eliciting truthful answers from a language model. In Advances in Neural Information Processing Systems (NeurIPS), 2023
[70]

OpenCompass Contributors. OpenCompass: A universal evaluation platform for foundation models. https://github.com/open-compass/opencompass, 2023
