pith. machine review for the scientific record.

arxiv: 2605.06584 · v1 · submitted 2026-05-07 · 💻 cs.AI

NeuroAgent: LLM Agents for Multimodal Neuroimaging Analysis and Research


Pith reviewed 2026-05-08 09:33 UTC · model grok-4.3

classification 💻 cs.AI
keywords neuroimaging analysis · LLM agents · multimodal preprocessing · Alzheimer's classification · agentic framework · automated pipelines · ADNI dataset

The pith

NeuroAgent uses LLM agents to automate multimodal neuroimaging preprocessing and achieves an AUC of 0.9518 for Alzheimer's classification with four modalities.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents NeuroAgent as a system of LLM-powered agents that takes on the complex task of preparing and analyzing brain scans from multiple modalities including sMRI, fMRI, dMRI, and PET. Agents generate code for preprocessing steps, run it, recover from errors, and validate outputs through a Generate-Execute-Validate loop, cutting down on manual configuration and quality checks. Tested on 1,470 subjects from the ADNI dataset, the approach reaches high correctness rates with strong LLM backends and delivers better disease classification when all modalities are combined than when any one is used alone. This setup supports natural-language queries for further analysis after the data is ready.

Core claim

NeuroAgent's hierarchical multi-agent architecture with a feedback-driven Generate-Execute-Validate engine autonomously generates executable preprocessing code for heterogeneous neuroimaging data, detects and recovers from runtime errors, validates output integrity, and enables end-to-end multimodal analysis that yields an AUC of 0.9518 for Alzheimer's classification on pooled ADNI data, outperforming single-modality baselines.

What carries the argument

The hierarchical multi-agent architecture with a feedback-driven Generate-Execute-Validate engine that generates, executes, and validates preprocessing code across modalities while limiting human intervention to edge cases.
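
The loop that carries this claim can be sketched in miniature. Everything below is hypothetical scaffolding — the function names, retry budget, and stand-in generator are invented for illustration, not NeuroAgent's actual interfaces:

```python
import os
import subprocess
import sys
import tempfile

def generate_code(task, feedback):
    """Stand-in for the LLM Generate step; NeuroAgent would prompt a backend
    here, optionally conditioned on error feedback from a prior attempt."""
    return f"print('preprocessing step for: {task}')"

def validate_output(stdout):
    """Stand-in for the Validate step (output-integrity checks)."""
    return "preprocessing step" in stdout

def generate_execute_validate(task, max_attempts=3):
    """Feedback-driven Generate-Execute-Validate loop with error recovery."""
    feedback = None
    for _ in range(max_attempts):
        code = generate_code(task, feedback)              # Generate
        with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
            f.write(code)
            path = f.name
        try:
            result = subprocess.run([sys.executable, path],
                                    capture_output=True, text=True)  # Execute
        finally:
            os.unlink(path)
        if result.returncode != 0:
            feedback = result.stderr                      # recover from error
            continue
        if validate_output(result.stdout):                # Validate
            return True
        feedback = "output failed integrity checks"
    return False  # exhausted retries: escalate to the Human-In-The-Loop interface
```

The essential design point is that the stderr of a failed Execute step feeds back into the next Generate call, which is what turns a one-shot code generator into a recovery loop.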

If this is right

  • Multimodal data preprocessed by the system produces higher Alzheimer's classification AUC than any single modality.
  • The architecture reduces manual effort for preprocessing to only edge cases via automated error recovery.
  • Natural language queries become feasible for downstream statistical analysis once data passes validation.
  • Pipeline performance scales with the capability of the underlying LLM backend up to 100% intent parsing and 84.8% end-to-end step correctness.
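
The first and last bullets can be illustrated with a toy late-fusion experiment on synthetic scores. Nothing here is ADNI data; the prevalence, noise level, and averaging rule are all invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000
labels = (rng.random(n) < 0.32).astype(int)   # roughly the AD fraction (470/1470)

def auc(scores, y):
    """Rank-based (Mann-Whitney) AUC for binary labels."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# each modality is a noisy view of the same underlying disease signal
signal = labels.astype(float)
modalities = {m: signal + rng.normal(0.0, 1.2, n)
              for m in ("sMRI", "PET", "fMRI", "DTI")}

fused = np.mean(list(modalities.values()), axis=0)   # naive late fusion

single_aucs = {m: auc(s, labels) for m, s in modalities.items()}
fused_auc = auc(fused, labels)
```

Averaging four independently noisy views shrinks the noise variance, so the fused AUC exceeds every single-modality AUC — the same qualitative pattern the paper reports, though the paper's 0.9518 comes from real preprocessed data, not this toy.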

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • Similar agent ensembles could be adapted to automate preprocessing in other medical imaging domains that face comparable toolchain complexity.
  • Wider adoption might enable labs with limited programming resources to conduct reproducible multimodal studies.
  • The reliance on automated validation raises the need for ongoing checks against gold-standard pipelines to maintain scientific trust.

Load-bearing premise

That code produced and validated by the LLM agents yields scientifically valid neuroimaging data without introducing systematic artifacts or biases that would alter research conclusions.

What would settle it

A side-by-side comparison by neuroimaging experts testing whether NeuroAgent-processed outputs differ meaningfully from established manual pipelines on standard quality metrics, such as tissue segmentation accuracy, or lead to different downstream classification performance.

Original abstract

Multimodal neuroimaging analysis often involves complex, modality-specific preprocessing workflows that require careful configuration, quality control, and coordination across heterogeneous toolchains. Beyond preprocessing, downstream statistical analysis and disease classification commonly require task-specific code, evaluation protocols, and data-format conventions, creating additional barriers between raw acquisitions and reproducible scientific analysis. We present NeuroAgent, an LLM-driven agentic framework that automates key preprocessing and analysis steps for heterogeneous neuroimaging data, including sMRI, fMRI, dMRI, and PET, and supports interactive downstream analysis through natural-language queries. NeuroAgent employs a hierarchical multi-agent architecture with a feedback-driven Generate-Execute-Validate engine: agents autonomously generate executable preprocessing code, detect and recover from runtime errors, and validate output integrity. We evaluate the system on 1,470 subjects pooled across all ADNI phases (CN=1,000, AD=470), where all subjects have sMRI and tabular data, with subsets also having Tau-PET (n=469), fMRI (n=278), and DTI (n=620). Pipeline ablation studies across multiple LLM backends show that capable models reach up to 100% intent-parsing accuracy, with the strongest backend (Qwen3.5-27B) reaching 84.8% end-to-end preprocessing step correctness. Automated recovery limits manual intervention to edge cases where human review is required via the Human-In-The-Loop interface. For Alzheimer's Disease classification using automatically preprocessed multimodal data, our agent ensemble achieves an AUC of 0.9518 with four modalities, outperforming all single-modality baselines. These results show that NeuroAgent can reduce the manual effort required for neuroimaging preprocessing and enable end-to-end automated analysis pipelines for neuroimaging research.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The paper introduces NeuroAgent, an LLM-powered hierarchical multi-agent system for automating multimodal neuroimaging preprocessing (sMRI, fMRI, dMRI, PET) and downstream analysis via natural language. It employs a Generate-Execute-Validate engine for code generation, runtime error recovery, and output validation. On 1,470 ADNI subjects (with subsets having additional modalities), the system reports 100% intent parsing, up to 84.8% end-to-end preprocessing step correctness (Qwen3.5-27B backend), and an AUC of 0.9518 for Alzheimer's classification using four modalities, outperforming single-modality baselines. Automated recovery is said to limit human intervention to edge cases via a Human-In-The-Loop interface.

Significance. If the agent-generated preprocessing yields data scientifically equivalent to expert pipelines, NeuroAgent would meaningfully lower barriers to reproducible multimodal neuroimaging research by handling complex toolchains and enabling natural-language analysis. The evaluation on a large public ADNI cohort with concrete metrics across multiple LLM backends and modality ablations is a positive aspect. However, the significance is limited by the absence of direct evidence that execution success translates to valid scientific outputs.

major comments (3)
  1. [§5] §5 (AD classification results): The central claim of AUC 0.9518 with four modalities outperforming single-modality baselines is load-bearing for the paper's contribution. This performance is reported on agent-preprocessed data, yet no quantitative validation against standard pipelines (e.g., fMRIPrep for registration accuracy, FreeSurfer for segmentation Dice scores, or DTIPrep) or blinded expert QC is provided; the 84.8% step correctness measures execution and recovery success only.
  2. [§4.1] §4.1 (Generate-Execute-Validate engine description): The automated error recovery and output validation steps are presented as sufficient to produce usable data with minimal human correction. However, the evaluation provides no analysis of recovered error types, potential systematic artifacts introduced by LLM-generated code, or downstream impact on metrics such as SNR or alignment quality.
  3. [Abstract and §5] Abstract and §5 (ADNI cohort details): The 1,470-subject evaluation pools data across ADNI phases with modality subsets (Tau-PET n=469, fMRI n=278, DTI n=620). Data exclusion rules, preprocessing validation criteria, and statistical tests for the multimodal AUC improvement are not specified, undermining assessment of whether the reported gains reflect valid multimodal signal or preprocessing artifacts.
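
The validation gap the first two comments describe is concrete: a Validate step can only catch what its checks encode. A minimal sketch of the kind of output-integrity checks involved — the thresholds and the SNR proxy are illustrative inventions, not criteria from the paper:

```python
import numpy as np

def qc_checks(volume, expected_ndim=3):
    """Hypothetical automated output-integrity checks of the sort a Validate
    step might run on a preprocessed volume; returns a list of issue strings
    (empty means the volume passed). Thresholds are illustrative only."""
    issues = []
    vol = np.asarray(volume, dtype=float)
    if vol.ndim != expected_ndim:
        issues.append(f"expected {expected_ndim}D volume, got {vol.ndim}D")
    nan_frac = np.isnan(vol).mean()
    if nan_frac > 0.0:
        issues.append(f"{nan_frac:.1%} NaN voxels")
    if np.nanstd(vol) == 0:
        issues.append("constant image (zero variance)")
    # crude SNR proxy over nonzero voxels: mean / std
    nonzero = vol[np.nan_to_num(vol) != 0]
    if nonzero.size and np.std(nonzero) > 0:
        snr = np.mean(nonzero) / np.std(nonzero)
        if snr < 1.0:
            issues.append(f"low SNR proxy ({snr:.2f})")
    return issues
```

Checks like these catch gross failures (NaNs, empty images) but, as the referee notes, they cannot by themselves establish equivalence to expert pipelines on metrics such as segmentation Dice or registration accuracy.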
minor comments (3)
  1. [Abstract and §3] The abstract and §3 use 'end-to-end preprocessing step correctness' without a precise definition or breakdown by modality or error category; a table clarifying this metric would improve clarity.
  2. [Related Work] Related work section omits several recent LLM-agent frameworks for scientific code generation; adding 2-3 key citations would better situate the hierarchical architecture.
  3. [Figures] Figure captions for the agent architecture and Human-In-The-Loop interface are brief; expanding them to describe the feedback loop would aid readers unfamiliar with agentic systems.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed review, which identifies key areas where additional transparency and rigor would strengthen the manuscript. We address each major comment point by point below, committing to revisions that clarify our evaluation protocol while honestly noting the boundaries of the current study.

Point-by-point responses
  1. Referee: [§5] §5 (AD classification results): The central claim of AUC 0.9518 with four modalities outperforming single-modality baselines is load-bearing for the paper's contribution. This performance is reported on agent-preprocessed data, yet no quantitative validation against standard pipelines (e.g., fMRIPrep for registration accuracy, FreeSurfer for segmentation Dice scores, or DTIPrep) or blinded expert QC is provided; the 84.8% step correctness measures execution and recovery success only.

    Authors: We acknowledge that the 84.8% end-to-end correctness metric primarily captures successful code execution, error recovery, and basic output validation rather than direct quantitative equivalence to expert pipelines. In the revised manuscript, we will expand the Methods and Results sections to detail the exact criteria used for the correctness assessment (including visual and automated checks for gross artifacts and data integrity). We will also add an explicit limitations paragraph noting the absence of metrics such as Dice scores, registration accuracy, or blinded expert QC, and clarify that the reported multimodal AUC gains provide supporting but indirect evidence of data usability. Comprehensive quantitative benchmarking against tools like FreeSurfer or fMRIPrep was outside the scope of this work focused on agent automation. revision: partial

  2. Referee: [§4.1] §4.1 (Generate-Execute-Validate engine description): The automated error recovery and output validation steps are presented as sufficient to produce usable data with minimal human correction. However, the evaluation provides no analysis of recovered error types, potential systematic artifacts introduced by LLM-generated code, or downstream impact on metrics such as SNR or alignment quality.

    Authors: We agree that greater detail on error recovery and potential artifacts would improve the description of the Generate-Execute-Validate engine. In the revision, we will add a dedicated paragraph and supplementary table in §4.1 categorizing the observed error types (e.g., syntax errors, neuroimaging-tool-specific runtime failures, and format inconsistencies) along with recovery success rates. We will also discuss the risk of LLM-induced artifacts and report any available downstream quality indicators (such as basic alignment or intensity statistics) from the preprocessed outputs to address concerns about systematic effects on SNR or registration quality. revision: yes

  3. Referee: [Abstract and §5] Abstract and §5 (ADNI cohort details): The 1,470-subject evaluation pools data across ADNI phases with modality subsets (Tau-PET n=469, fMRI n=278, DTI n=620). Data exclusion rules, preprocessing validation criteria, and statistical tests for the multimodal AUC improvement are not specified, undermining assessment of whether the reported gains reflect valid multimodal signal or preprocessing artifacts.

    Authors: We will revise both the Abstract and §5 to explicitly state the data exclusion rules (subjects removed due to missing modalities, failed initial quality checks, or incomplete tabular data), the precise validation criteria applied during the agent's output validation step, and the statistical tests used to evaluate multimodal AUC improvement (including DeLong's test for paired AUC comparisons with p-values). These additions will allow readers to better assess whether the performance gains arise from genuine multimodal signal rather than preprocessing artifacts. revision: yes
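
The DeLong comparison the authors commit to uses a closed-form variance estimate for paired AUCs; as a rough stand-in, a paired bootstrap over subjects conveys the same idea. The scores below are synthetic and the resampling scheme is an illustrative substitute, not the paper's test:

```python
import numpy as np

def auc(scores, y):
    """Rank-based (Mann-Whitney) AUC for binary labels."""
    ranks = np.empty(len(scores))
    ranks[np.argsort(scores)] = np.arange(1, len(scores) + 1)
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    return (ranks[y == 1].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def paired_bootstrap_auc(s_multi, s_single, y, n_boot=1000, seed=0):
    """Resample subjects with replacement, preserving the pairing between the
    two classifiers, and return (observed AUC gap, two-sided bootstrap p-value).
    DeLong's test would replace the resampling with an analytic variance."""
    rng = np.random.default_rng(seed)
    observed = auc(s_multi, y) - auc(s_single, y)
    n = len(y)
    deltas = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)
        if 0 < y[idx].sum() < n:          # resample must contain both classes
            deltas.append(auc(s_multi[idx], y[idx]) - auc(s_single[idx], y[idx]))
    deltas = np.asarray(deltas)
    tail = min((deltas <= 0).mean(), (deltas >= 0).mean())
    return observed, min(1.0, 2.0 * tail)
```

The key property either test must respect is the pairing: the multimodal and single-modality AUCs are computed on the same subjects, so their sampling errors are correlated and an unpaired comparison would overstate the variance of the gap.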

Circularity Check

0 steps flagged

No circularity: empirical evaluation on external dataset

Full rationale

The paper describes an LLM agent framework evaluated empirically on the public ADNI dataset (1,470 subjects). Reported metrics (100% intent parsing, 84.8% preprocessing correctness, AUC 0.9518) are direct performance measurements from running the system, not mathematical derivations, fitted parameters renamed as predictions, or self-referential definitions. No equations, uniqueness theorems, or ansatzes appear; the central claims rest on observed outcomes rather than reducing to inputs by construction. This is a standard systems paper with independent empirical content.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The framework rests on the assumption that current LLMs can reliably produce and debug neuroimaging-specific code; no free parameters are explicitly fitted in the abstract, but model choice and prompt engineering act as implicit tunable elements.

axioms (2)
  • domain assumption LLMs can generate executable, modality-specific preprocessing code that integrates with existing neuroimaging toolchains
    Invoked in the Generate step of the engine and central to the 84.8% correctness claim
  • domain assumption Automated validation can detect and allow recovery from most runtime and output-integrity errors without introducing bias
    Required for the claim that manual intervention is limited to edge cases
invented entities (1)
  • Hierarchical multi-agent Generate-Execute-Validate engine no independent evidence
    purpose: Autonomously handle code generation, execution, error recovery, and validation for neuroimaging pipelines
    Core novel component introduced by the paper; no independent evidence provided beyond reported metrics

pith-pipeline@v0.9.0 · 5634 in / 1603 out tokens · 72744 ms · 2026-05-08T09:33:38.071538+00:00 · methodology

