NeuroBridge: Bridging Multi-Task MRI Knowledge for Neurodegenerative Disease Diagnosis

Chad W. Farris; Guoyao Shen; Mengyu Li; Xin Zhang

arxiv: 2607.01401 · v1 · pith:ST74AMRPnew · submitted 2026-07-01 · 💻 cs.LG · cs.AI· cs.CV

NeuroBridge: Bridging Multi-Task MRI Knowledge for Neurodegenerative Disease Diagnosis

Mengyu Li , Guoyao Shen , Chad W. Farris , Xin Zhang This is my paper

Pith reviewed 2026-07-03 21:22 UTC · model grok-4.3

classification 💻 cs.LG cs.AIcs.CV

keywords multi-task learningMRIAlzheimer's diseasehippocampal segmentationneurodegenerative diseasedeep learningmedical imagingopportunistic screening

0 comments

The pith

Multi-task learning on MRI that adds hippocampal segmentation, atrophy classification and reconstruction improves Alzheimer's diagnosis accuracy over single-task baselines.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops NeuroBridge to handle subtle and heterogeneous structural changes in brain MRI for classifying Alzheimer's disease, mild cognitive impairment and related dementias. It combines large-scale self-supervised pretraining with three auxiliary tasks—hippocampal segmentation, hippocampal atrophy classification and reconstruction—then applies gated fusion during fine-tuning for the primary diagnosis objective. Evaluated on ADNI and OASIS cohorts, the method records the highest accuracies reported, including 88.17 percent for AD versus cognitively normal on ADNI and 82.78 percent on OASIS, with largest gains in MCI and mixed settings plus effective cross-cohort transfer. A sympathetic reader cares because routine MRI scans could support more reliable early detection and probability-based screening without requiring new hardware or separate models.

Core claim

NeuroBridge integrates self-supervised MRI pretraining with hippocampal segmentation, hippocampal atrophy classification and reconstruction objectives, followed by gated fusion fine-tuning, and thereby achieves the highest performance across evaluated classification tasks while demonstrating strong cross-cohort generalization, systematic associations between predicted-class probability and accuracy, and the feasibility of probability-based opportunistic screening.

What carries the argument

Gated fusion fine-tuning that merges representations learned from the auxiliary clinical tasks with the primary diagnosis objective.

If this is right

Accuracy reaches 88.17 percent for AD versus cognitively normal controls on ADNI and 82.78 percent on OASIS.
The largest improvements appear in MCI-related and mixed-diagnosis classification settings.
Models trained on one cohort transfer effectively to the other cohort.
Predicted-class probabilities show systematic correlation with actual diagnostic accuracy.
Probability thresholds enable opportunistic screening on existing MRI scans.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same auxiliary-task structure might be tested on other neurodegenerative conditions that also affect the hippocampus.
Probability outputs could be used to prioritize follow-up clinical review without additional imaging.
If the gated fusion step proves robust, similar multi-task pretraining could be applied to other MRI-based diagnostic problems.

Load-bearing premise

The auxiliary tasks of hippocampal segmentation, atrophy classification and reconstruction supply clinically relevant signals that meaningfully improve the primary diagnosis task.

What would settle it

An ablation study that removes the auxiliary tasks, retrains on identical ADNI and OASIS data splits, and obtains equal or higher accuracy on the same AD-versus-CN and MCI tasks would falsify the claim that the multi-task setup drives the observed gains.

read the original abstract

INTRODUCTION: Accurate MRI-based identification of Alzheimer's disease (AD), mild cognitive impairment (MCI), and related dementias remains challenging because disease-related structural changes are often subtle and heterogeneous. We developed NeuroBridge, a clinically guided multi-task MRI framework for neurodegenerative disease diagnosis. METHODS: NeuroBridge integrates large-scale self-supervised MRI pretraining with hippocampal segmentation, hippocampal atrophy classification, and reconstruction objectives, followed by gated fusion fine-tuning. Performance was evaluated across ADNI and OASIS cohorts, including cross-cohort transfer, probability-based analysis, and opportunistic screening. RESULTS: NeuroBridge achieved the highest performance across evaluated classification tasks, reaching 88.17% accuracy for AD versus cognitively normal controls in ADNI and 82.78% in OASIS. The largest gains occurred in MCI-related and mixed-diagnosis settings. The framework demonstrated strong cross-cohort generalization, systematic associations between predicted-class probability and accuracy, and the feasibility of probability-based opportunistic screening. DISCUSSION: Clinically guided multi-task representation learning improves neurodegenerative MRI diagnosis beyond conventional single-task approaches. NeuroBridge provides a robust and scalable framework for dementia assessment and MRI-based opportunistic screening.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

NeuroBridge reports decent numbers on ADNI and OASIS for AD diagnosis via multi-task MRI learning, but the abstract supplies no ablations to show the hippocampal auxiliary tasks drive any gains beyond self-supervised pretraining.

read the letter

The main takeaway is that NeuroBridge combines self-supervised pretraining with hippocampal segmentation, atrophy classification, and reconstruction, then uses gated fusion for fine-tuning, and reports 88.17% accuracy on AD vs normal in ADNI and 82.78% in OASIS plus some cross-cohort results. The abstract frames this as evidence that clinically guided multi-task learning beats single-task approaches.

The setup itself is a straightforward extension of existing multi-task medical imaging work. Evaluating on two standard cohorts, checking cross-cohort transfer, and looking at how prediction probabilities relate to accuracy are practical steps that could matter for screening applications.

The soft spot is the missing isolation of the auxiliary tasks. The central claim credits those specific clinical objectives for the performance and generalization, yet the abstract gives no ablation or comparison that removes them while keeping the pretraining and fusion. If the gains come mainly from the pretraining scale or the fusion mechanism, the multi-task story does not hold. The lack of baseline details, statistical tests, error bars, or cohort demographics makes the numbers hard to assess on their own.

This paper is aimed at researchers already working on multi-task or self-supervised methods for neuroimaging. Someone looking for architecture ideas in dementia MRI diagnosis might extract useful pieces from the full methods, but the evidence for the key clinical-task benefit is not there yet.

I would send it for peer review. The topic is relevant and the approach is coherent on its face, so referees can request the ablations and fuller reporting. The current abstract alone does not make a strong case.

Referee Report

2 major / 2 minor

Summary. The paper introduces NeuroBridge, a multi-task MRI framework that combines large-scale self-supervised pretraining with auxiliary objectives (hippocampal segmentation, atrophy classification, and reconstruction) and gated fusion fine-tuning for neurodegenerative disease diagnosis. It reports state-of-the-art accuracies of 88.17% (ADNI) and 82.78% (OASIS) for AD vs. cognitively normal classification, along with strong cross-cohort generalization, probability-accuracy correlations, and feasibility for opportunistic screening, attributing gains to clinically guided multi-task learning over single-task baselines.

Significance. If the performance gains and generalization claims hold after proper controls, the work would demonstrate a practical way to inject domain-specific clinical signals into representation learning for MRI-based dementia diagnosis, with potential downstream value for scalable screening. The multi-cohort evaluation and probability-based analysis are positive elements, but the current lack of isolation for the auxiliary-task contributions limits the strength of the central methodological claim.

major comments (2)

[Results] Results section (and abstract): The central claim that 'clinically guided multi-task representation learning improves ... beyond conventional single-task approaches' is load-bearing but unsupported by ablation experiments. No quantitative comparison isolates the contribution of the hippocampal segmentation, atrophy classification, and reconstruction auxiliaries versus self-supervised pretraining or gated fusion alone; without these, the attribution of the reported 88.17% / 82.78% accuracies and cross-cohort gains specifically to the clinical tasks cannot be assessed.
[Methods / Results] Methods and Results sections: The reported accuracies lack accompanying baseline details, statistical tests (e.g., McNemar or paired t-tests), error bars, cohort demographics, exclusion criteria, or hyperparameter sensitivity analysis. These omissions make it impossible to evaluate whether the claimed superiority over single-task approaches is robust or merely reflects differences in training scale or data splits.

minor comments (2)

[Abstract] Abstract: The phrase 'the largest gains occurred in MCI-related and mixed-diagnosis settings' is stated without accompanying per-task numbers or tables, reducing clarity.
[Methods] Notation: The gated fusion mechanism is described at a high level but would benefit from an explicit equation or diagram showing how task-specific features are combined before the final classifier.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on our manuscript. We address each major comment point by point below and commit to revisions that strengthen the empirical support for our claims.

read point-by-point responses

Referee: [Results] Results section (and abstract): The central claim that 'clinically guided multi-task representation learning improves ... beyond conventional single-task approaches' is load-bearing but unsupported by ablation experiments. No quantitative comparison isolates the contribution of the hippocampal segmentation, atrophy classification, and reconstruction auxiliaries versus self-supervised pretraining or gated fusion alone; without these, the attribution of the reported 88.17% / 82.78% accuracies and cross-cohort gains specifically to the clinical tasks cannot be assessed.

Authors: We acknowledge that the manuscript presents comparisons to single-task baselines but does not include explicit ablation experiments that isolate the individual contributions of the hippocampal segmentation, atrophy classification, and reconstruction auxiliaries from the self-supervised pretraining and gated fusion stages. This gap limits the precision with which performance gains can be attributed specifically to the clinically guided components. We will add these ablation studies, including quantitative results for variants with and without each auxiliary task, to the revised Results section. revision: yes
Referee: [Methods / Results] Methods and Results sections: The reported accuracies lack accompanying baseline details, statistical tests (e.g., McNemar or paired t-tests), error bars, cohort demographics, exclusion criteria, or hyperparameter sensitivity analysis. These omissions make it impossible to evaluate whether the claimed superiority over single-task approaches is robust or merely reflects differences in training scale or data splits.

Authors: We agree that additional methodological and statistical details are required for a rigorous evaluation. In the revised manuscript we will expand the Methods and Results sections to include full descriptions of all baselines, statistical significance tests (McNemar and paired t-tests), error bars derived from multiple runs, complete cohort demographics and exclusion criteria, and hyperparameter sensitivity analyses. These additions will directly address concerns about robustness and potential confounding factors such as training scale or data splits. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical results on external cohorts are independent of model definitions

full rationale

The paper's central claims consist of measured classification accuracies (88.17% AD vs CN on ADNI, 82.78% on OASIS) and cross-cohort generalization obtained by training the described multi-task framework on the named public datasets and evaluating on held-out splits. These quantities are not algebraically equivalent to any internal parameters, loss terms, or self-citations; they are external empirical outcomes. The auxiliary tasks (segmentation, atrophy classification, reconstruction) are distinct objectives whose contribution is asserted via experimental comparison rather than by definitional identity. No derivation step reduces to a fitted input renamed as prediction or to a self-citation chain that itself lacks independent verification. The work is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Based solely on the abstract, no explicit free parameters, invented entities, or non-standard axioms are stated. The approach rests on the standard domain assumption that MRI contains extractable structural signals for disease classification when auxiliary tasks are chosen appropriately.

axioms (1)

domain assumption MRI scans contain structural information sufficient for distinguishing disease states when combined with appropriate learning objectives.
Implicit in the decision to use MRI for diagnosis and the choice of hippocampal tasks.

pith-pipeline@v0.9.1-grok · 5741 in / 1485 out tokens · 31591 ms · 2026-07-03T21:22:07.976001+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

53 extracted references · 50 canonical work pages · 5 internal anchors

[1]

Impact of dementia: health disparities, population trends, care interventions, and economic costs

Aranda MP, Kremer IN, Hinton L, et al. Impact of dementia: health disparities, population trends, care interventions, and economic costs. J Am Geriatr Soc. 2021;69:1774-1783. doi:10.1111/jgs.17345

work page doi:10.1111/jgs.17345 2021
[2]

NIA-AA Research Framework: toward a biological definition of Alzheimer’s disease

Jack CR Jr, Bennett DA, Blennow K, et al. NIA-AA Research Framework: toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14:535-562. doi:10.1016/j.jalz.2018.02.018

work page doi:10.1016/j.jalz.2018.02.018 2018
[3]

Sperling RA, Aisen PS, Beckett LA, et al. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7:280-292. doi:10.1016/j.jalz.2011.03.003

work page doi:10.1016/j.jalz.2011.03.003 2011
[4]

Mild cognitive impairment

Gauthier S, Reisberg B, Zaudig M, et al. Mild cognitive impairment. Lancet. 2006;367:1262-1270. doi:10.1016/S0140-6736(06)68542-5

work page doi:10.1016/s0140-6736(06)68542-5 2006
[5]

Mild cognitive impairment: clinical characterization and outcome

Petersen RC, Smith GE, Waring SC, et al. Mild cognitive impairment: clinical characterization and outcome. Arch Neurol. 1999;56:303-308. doi:10.1001/archneur.56.3.303

work page doi:10.1001/archneur.56.3.303 1999
[6]

Focusing on earlier diagnosis of Alzheimer’s disease

Frederiksen KS, Arus XM, Zetterberg H, et al. Focusing on earlier diagnosis of Alzheimer’s disease. Future Neurol. 2024;19:2337452. doi:10.2217/fnl-2023-0024

work page doi:10.2217/fnl-2023-0024 2024
[7]

What Alzheimer’s disease can learn from oncology’s early-detection revolution: toward scalable, cost-effective dementia diagnostics

Sun Y. What Alzheimer’s disease can learn from oncology’s early-detection revolution: toward scalable, cost-effective dementia diagnostics. Alzheimers Dement (Amst). 2026;18. doi:10.1002/dad2.70306 43

work page doi:10.1002/dad2.70306 2026
[8]

Expected and diagnosed rates of mild cognitive impairment and dementia in the US Medicare population: observational analysis

Mattke S, Jun H, Chen E, et al. Expected and diagnosed rates of mild cognitive impairment and dementia in the US Medicare population: observational analysis. Alzheimers Res Ther. 2023;15:128. doi:10.1186/s13195-023-01272-z

work page doi:10.1186/s13195-023-01272-z 2023
[9]

Correlates of missed or late versus timely diagnosis of dementia in healthcare settings

Chen Y, Power MC, Grodstein F, et al. Correlates of missed or late versus timely diagnosis of dementia in healthcare settings. Alzheimers Dement. 2024;20:5551-5560. doi:10.1002/alz.14067

work page doi:10.1002/alz.14067 2024
[10]

Prevalence and determinants of undetected dementia in the community: a systematic literature review and meta-analysis

Lang L, Clifford A, Wei L, et al. Prevalence and determinants of undetected dementia in the community: a systematic literature review and meta-analysis. BMJ Open. 2017;7. doi:10.1136/bmjopen-2016-011146

work page doi:10.1136/bmjopen-2016-011146 2017
[11]

Time to diagnosis in dementia: a systematic review with meta-analysis

Kusoro O, Roche M, Del-Pino-Casado R, et al. Time to diagnosis in dementia: a systematic review with meta-analysis. Int J Geriatr Psychiatry. 2025;40. doi:10.1002/gps.70129

work page doi:10.1002/gps.70129 2025
[12]

On the Opportunities and Risks of Foundation Models

Bommasani R, Hudson DA, Adeli E, et al. On the opportunities and risks of foundation models. arXiv [Preprint]. Published August 16, 2021. doi:10.48550/arXiv.2108.07258

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2108.07258 2021
[13]

Foundation models for generalist medical artificial intelligence

Moor M, Banerjee O, Abad ZS, et al. Foundation models for generalist medical artificial intelligence. Nature. 2023;616:259-265. doi:10.1038/s41586-023- 05881-4

work page doi:10.1038/s41586-023- 2023
[14]

Masked Autoencoders Are Scalable Vision Learners

He K, Chen X, Xie S, et al. Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022:16000-16009. doi:10.48550/arXiv.2111.06377 44

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2111.06377 2022
[15]

Overcoming data scarcity in biomedical imaging with a foundational multi-task model

Schäfer R, Nicke T, Höfener H, et al. Overcoming data scarcity in biomedical imaging with a foundational multi-task model. Nat Comput Sci. 2024;4:495-509. doi:10.1038/s43588-024-00662-z

work page doi:10.1038/s43588-024-00662-z 2024
[16]

Foundation model for cancer imaging biomarkers

Pai S, Bontempi D, Hadzic I, et al. Foundation model for cancer imaging biomarkers. Nat Mach Intell. 2024;6:354-367. doi:10.1038/s42256-024-00807-9

work page doi:10.1038/s42256-024-00807-9 2024
[17]

A generalizable foundation model for analysis of human brain MRI

Tak D, Garomsa BA, Zapaishchykova A, et al. A generalizable foundation model for analysis of human brain MRI. Nat Neurosci. Published online February 5,
[18]

doi:10.1038/s41593-026-02202-6

work page doi:10.1038/s41593-026-02202-6
[19]

The clinical use of structural MRI in Alzheimer disease

Frisoni GB, Fox NC, Jack CR Jr, et al. The clinical use of structural MRI in Alzheimer disease. Nat Rev Neurol. 2010;6:67-77. doi:10.1038/nrneurol.2009.215

work page doi:10.1038/nrneurol.2009.215 2010
[20]

Imaging biomarkers of dementia: recommended visual rating scales with teaching cases

Wahlund LO, Westman E, van Westen D, et al. Imaging biomarkers of dementia: recommended visual rating scales with teaching cases. Insights Imaging. 2017;8:79-90. doi:10.1007/s13244-016-0521-6

work page doi:10.1007/s13244-016-0521-6 2017
[21]

General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis

Lu S, Chen Y, Chen Y, et al. General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis. Nat Commun. 2025;16:2097. doi:10.1038/s41467-025-57427-z

work page doi:10.1038/s41467-025-57427-z 2025
[22]

Attention Is All You Need

Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017;30. doi:10.48550/arXiv.1706.03762

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1706.03762 2017
[23]

The limits of fair medical imaging AI in real-world generalization

Yang Y, Zhang H, Gichoya JW, et al. The limits of fair medical imaging AI in real-world generalization. Nat Med. 2024;30:2838-2848. doi:10.1038/s41591-024- 03113-4 45

work page doi:10.1038/s41591-024- 2024
[24]

Tackling prediction uncertainty in machine learning for healthcare

Chua M, Kim D, Choi J, et al. Tackling prediction uncertainty in machine learning for healthcare. Nat Biomed Eng. 2023;7:711-718. doi:10.1038/s41551-022- 00988-x

work page doi:10.1038/s41551-022- 2023
[25]

Tailored for real-world: a whole- slide image classification system validated on uncurated multisite data emulating the prospective pathology workload

Ianni JD, Soans RE, Sankarapandian S, et al. Tailored for real-world: a whole- slide image classification system validated on uncurated multisite data emulating the prospective pathology workload. Sci Rep. 2020;10:3217. doi:10.1038/s41598-020- 59985-2

work page doi:10.1038/s41598-020- 2020
[26]

Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models

Young AT, Fernandez K, Pfau J, et al. Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models. NPJ Digit Med. 2021;4:10. doi:10.1038/s41746-020-00380-6

work page doi:10.1038/s41746-020-00380-6 2021
[27]

Opportunistic screening: Radiology Scientific Expert Panel

Pickhardt PJ, Summers RM, Garrett JW, et al. Opportunistic screening: Radiology Scientific Expert Panel. Radiology. 2023;307. doi:10.1148/radiol.222044

work page doi:10.1148/radiol.222044 2023
[28]

Medial temporal lobe atrophy is underreported and may have important clinical correlates in medical inpatients

Torisson G, van Westen D, Stavenow L, et al. Medial temporal lobe atrophy is underreported and may have important clinical correlates in medical inpatients. BMC Geriatr. 2015;15:65. doi:10.1186/s12877-015-0066-4

work page doi:10.1186/s12877-015-0066-4 2015
[29]

Radiological reporting of brain atrophy in MRI: real-life comparison between narrative reports, semiquantitative scales, and automated software-based volumetry

Bruno F, Fagotti C, Saltarelli G, et al. Radiological reporting of brain atrophy in MRI: real-life comparison between narrative reports, semiquantitative scales, and automated software-based volumetry. Diagnostics (Basel). 2025;15:1246. doi:10.3390/diagnostics15101246

work page doi:10.3390/diagnostics15101246 2025
[30]

Structural imaging findings on non- enhanced computed tomography are severely underreported in the primary care diagnostic work-up of subjective cognitive decline

Håkansson C, Torisson G, Londos E, et al. Structural imaging findings on non- enhanced computed tomography are severely underreported in the primary care diagnostic work-up of subjective cognitive decline. Neuroradiology. 2019;61:397-404. doi:10.1007/s00234-019-02156-6 46

work page doi:10.1007/s00234-019-02156-6 2019
[31]

Automated opportunistic osteoporotic fracture risk assessment using computed tomography scans to aid in FRAX underutilization

Dagan N, Elnekave E, Barda N, et al. Automated opportunistic osteoporotic fracture risk assessment using computed tomography scans to aid in FRAX underutilization. Nat Med. 2020;26:77-82. doi:10.1038/s41591-019-0720-z

work page doi:10.1038/s41591-019-0720-z 2020
[32]

Incidental coronary artery calcium: opportunistic screening of previous nongated chest computed tomography scans to improve statin rates—the NOTIFY -1 project

Sandhu AT, Rodriguez F, Ngo S, et al. Incidental coronary artery calcium: opportunistic screening of previous nongated chest computed tomography scans to improve statin rates—the NOTIFY -1 project. Circulation. 2023;147:703-714. doi:10.1161/CIRCULATIONAHA.122.062746

work page doi:10.1161/circulationaha.122.062746 2023
[33]

Automated CT biomarkers for opportunistic prediction of future cardiovascular events and mortality in an asymptomatic screening population: a retrospective cohort study

Pickhardt PJ, Graffy PM, Zea R, et al. Automated CT biomarkers for opportunistic prediction of future cardiovascular events and mortality in an asymptomatic screening population: a retrospective cohort study. Lancet Digit Health. 2020;2. doi:10.1016/S2589-7500(20)30025-X

work page doi:10.1016/s2589-7500(20)30025-x 2020
[34]

RadImageNet: an open radiologic deep learning research dataset for effective transfer learning

Mei X, Liu Z, Robson PM, et al. RadImageNet: an open radiologic deep learning research dataset for effective transfer learning. Radiol Artif Intell. 2022;4. doi:10.1148/ryai.210315

work page doi:10.1148/ryai.210315 2022
[35]

Jenkinson M, Beckmann CF, Behrens TE, et al. FSL. Neuroimage. 2012;62:782-790. doi:10.1016/j.neuroimage.2011.09.015

work page doi:10.1016/j.neuroimage.2011.09.015 2012
[36]

Non-linear registration, aka spatial normalisation

Andersson JL, Jenkinson M, Smith S. Non-linear registration, aka spatial normalisation. FMRIB Technical Report TR07JA2. FMRIB Analysis Group, University of Oxford; 2007. Accessed [June, 2026]. https://www.fmrib.ox.ac.uk/datasets/techrep/tr07ja2/tr07ja2.pdf

2007
[37]

Assessing and tuning brain decoders: cross-validation, caveats, and guidelines

Varoquaux G, Raamana PR, Engemann DA, et al. Assessing and tuning brain decoders: cross-validation, caveats, and guidelines. Neuroimage. 2017;145:166-179. doi:10.1016/j.neuroimage.2016.10.038 47

work page doi:10.1016/j.neuroimage.2016.10.038 2017
[38]

A guide to cross-validation for artificial intelligence in medical imaging

Bradshaw TJ, Huemann Z, Hu J, Rahmim A. A guide to cross-validation for artificial intelligence in medical imaging. Radiol Artif Intell. 2023;5. doi:10.1148/ryai.220232

work page doi:10.1148/ryai.220232 2023
[39]

Monte Carlo cross-validation for a study with binary outcome and limited sample size

Shan G. Monte Carlo cross-validation for a study with binary outcome and limited sample size. BMC Med Inform Decis Mak. 2022;22:270. doi:10.1186/s12911- 022-02016-z

work page doi:10.1186/s12911- 2022
[40]

Few-shot deployment of pretrained MRI transformers in brain imaging tasks

Li M, Shen G, Farris CW, Zhang X. Few-shot deployment of pretrained MRI transformers in brain imaging tasks. Front Artif Intell. 2026;9:1771088. doi:10.3389/frai.2026.1771088

work page doi:10.3389/frai.2026.1771088 2026
[41]

Decoupled Weight Decay Regularization

Loshchilov I, Hutter F. Decoupled weight decay regularization. In: International Conference on Learning Representations; 2019. doi:10.48550/arXiv.1711.05101

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1711.05101 2019
[42]

Image quality assessment: from error visibility to structural similarity

Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13:600-

2004
[43]

doi:10.1109/TIP.2003.819861

work page doi:10.1109/tip.2003.819861 2003
[44]

Mean squared error: love it or leave it? A new look at signal fidelity measures

Wang Z, Bovik AC. Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Signal Process Mag. 2009;26:98-117. doi:10.1109/MSP.2008.930649

work page doi:10.1109/msp.2008.930649 2009
[45]

Scope of validity of PSNR in image/video quality assessment

Huynh-Thu Q, Ghanbari M. Scope of validity of PSNR in image/video quality assessment. Electron Lett. 2008;44:800-801. doi:10.1049/el:20080522

work page doi:10.1049/el:20080522 2008
[46]

Densely connected convolutional networks

Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:4700-4708. doi:10.1109/CVPR.2017.243 48

work page doi:10.1109/cvpr.2017.243 2017
[47]

MedViT: a robust vision transformer for generalized medical image classification

Manzari ON, Ahmadabadi H, Kashiani H, et al. MedViT: a robust vision transformer for generalized medical image classification. Comput Biol Med. 2023;157:106791. doi:10.1016/j.compbiomed.2023.106791

work page doi:10.1016/j.compbiomed.2023.106791 2023
[48]

Deep residual learning for image recognition

He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:770-778. doi:10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016
[49]

Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

Hara K, Kataoka H, Satoh Y. Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018:6546-6555. doi:10.48550/arXiv.1711.09577

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1711.09577 2018
[50]

Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study

Zech JR, Badgeley MA, Liu M, et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15. doi:10.1371/journal.pmed.1002683

work page doi:10.1371/journal.pmed.1002683 2018
[51]

Second opinion needed: communicating uncertainty in medical machine learning

Kompa B, Snoek J, Beam AL. Second opinion needed: communicating uncertainty in medical machine learning. NPJ Digit Med. 2021;4:4. doi:10.1038/s41746-020-00367-3

work page doi:10.1038/s41746-020-00367-3 2021
[52]

Effects of artificial intelligence implementation on efficiency in medical imaging: a systematic literature review and meta-analysis

Wenderott K, Krups J, Zaruchas F, Weigl M. Effects of artificial intelligence implementation on efficiency in medical imaging: a systematic literature review and meta-analysis. NPJ Digit Med. 2024;7:265. doi:10.1038/s41746-024-01248-9

work page doi:10.1038/s41746-024-01248-9 2024
[53]

Structural magnetic resonance imaging in the practical assessment of dementia: beyond exclusion

Scheltens P, Fox N, Barkhof F, De Carli C. Structural magnetic resonance imaging in the practical assessment of dementia: beyond exclusion. Lancet Neurol. 2002;1:13-21. doi:10.1016/S1474-4422(02)00002-9

work page doi:10.1016/s1474-4422(02)00002-9 2002

[1] [1]

Impact of dementia: health disparities, population trends, care interventions, and economic costs

Aranda MP, Kremer IN, Hinton L, et al. Impact of dementia: health disparities, population trends, care interventions, and economic costs. J Am Geriatr Soc. 2021;69:1774-1783. doi:10.1111/jgs.17345

work page doi:10.1111/jgs.17345 2021

[2] [2]

NIA-AA Research Framework: toward a biological definition of Alzheimer’s disease

Jack CR Jr, Bennett DA, Blennow K, et al. NIA-AA Research Framework: toward a biological definition of Alzheimer’s disease. Alzheimers Dement. 2018;14:535-562. doi:10.1016/j.jalz.2018.02.018

work page doi:10.1016/j.jalz.2018.02.018 2018

[3] [3]

Sperling RA, Aisen PS, Beckett LA, et al. Toward defining the preclinical stages of Alzheimer’s disease: recommendations from the National Institute on Aging-Alzheimer’s Association workgroups on diagnostic guidelines for Alzheimer’s disease. Alzheimers Dement. 2011;7:280-292. doi:10.1016/j.jalz.2011.03.003

work page doi:10.1016/j.jalz.2011.03.003 2011

[4] [4]

Mild cognitive impairment

Gauthier S, Reisberg B, Zaudig M, et al. Mild cognitive impairment. Lancet. 2006;367:1262-1270. doi:10.1016/S0140-6736(06)68542-5

work page doi:10.1016/s0140-6736(06)68542-5 2006

[5] [5]

Mild cognitive impairment: clinical characterization and outcome

Petersen RC, Smith GE, Waring SC, et al. Mild cognitive impairment: clinical characterization and outcome. Arch Neurol. 1999;56:303-308. doi:10.1001/archneur.56.3.303

work page doi:10.1001/archneur.56.3.303 1999

[6] [6]

Focusing on earlier diagnosis of Alzheimer’s disease

Frederiksen KS, Arus XM, Zetterberg H, et al. Focusing on earlier diagnosis of Alzheimer’s disease. Future Neurol. 2024;19:2337452. doi:10.2217/fnl-2023-0024

work page doi:10.2217/fnl-2023-0024 2024

[7] [7]

What Alzheimer’s disease can learn from oncology’s early-detection revolution: toward scalable, cost-effective dementia diagnostics

Sun Y. What Alzheimer’s disease can learn from oncology’s early-detection revolution: toward scalable, cost-effective dementia diagnostics. Alzheimers Dement (Amst). 2026;18. doi:10.1002/dad2.70306 43

work page doi:10.1002/dad2.70306 2026

[8] [8]

Expected and diagnosed rates of mild cognitive impairment and dementia in the US Medicare population: observational analysis

Mattke S, Jun H, Chen E, et al. Expected and diagnosed rates of mild cognitive impairment and dementia in the US Medicare population: observational analysis. Alzheimers Res Ther. 2023;15:128. doi:10.1186/s13195-023-01272-z

work page doi:10.1186/s13195-023-01272-z 2023

[9] [9]

Correlates of missed or late versus timely diagnosis of dementia in healthcare settings

Chen Y, Power MC, Grodstein F, et al. Correlates of missed or late versus timely diagnosis of dementia in healthcare settings. Alzheimers Dement. 2024;20:5551-5560. doi:10.1002/alz.14067

work page doi:10.1002/alz.14067 2024

[10] [10]

Prevalence and determinants of undetected dementia in the community: a systematic literature review and meta-analysis

Lang L, Clifford A, Wei L, et al. Prevalence and determinants of undetected dementia in the community: a systematic literature review and meta-analysis. BMJ Open. 2017;7. doi:10.1136/bmjopen-2016-011146

work page doi:10.1136/bmjopen-2016-011146 2017

[11] [11]

Time to diagnosis in dementia: a systematic review with meta-analysis

Kusoro O, Roche M, Del-Pino-Casado R, et al. Time to diagnosis in dementia: a systematic review with meta-analysis. Int J Geriatr Psychiatry. 2025;40. doi:10.1002/gps.70129

work page doi:10.1002/gps.70129 2025

[12] [12]

On the Opportunities and Risks of Foundation Models

Bommasani R, Hudson DA, Adeli E, et al. On the opportunities and risks of foundation models. arXiv [Preprint]. Published August 16, 2021. doi:10.48550/arXiv.2108.07258

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2108.07258 2021

[13] [13]

Foundation models for generalist medical artificial intelligence

Moor M, Banerjee O, Abad ZS, et al. Foundation models for generalist medical artificial intelligence. Nature. 2023;616:259-265. doi:10.1038/s41586-023- 05881-4

work page doi:10.1038/s41586-023- 2023

[14] [14]

Masked Autoencoders Are Scalable Vision Learners

He K, Chen X, Xie S, et al. Masked autoencoders are scalable vision learners. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2022:16000-16009. doi:10.48550/arXiv.2111.06377 44

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2111.06377 2022

[15] [15]

Overcoming data scarcity in biomedical imaging with a foundational multi-task model

Schäfer R, Nicke T, Höfener H, et al. Overcoming data scarcity in biomedical imaging with a foundational multi-task model. Nat Comput Sci. 2024;4:495-509. doi:10.1038/s43588-024-00662-z

work page doi:10.1038/s43588-024-00662-z 2024

[16] [16]

Foundation model for cancer imaging biomarkers

Pai S, Bontempi D, Hadzic I, et al. Foundation model for cancer imaging biomarkers. Nat Mach Intell. 2024;6:354-367. doi:10.1038/s42256-024-00807-9

work page doi:10.1038/s42256-024-00807-9 2024

[17] [17]

A generalizable foundation model for analysis of human brain MRI

Tak D, Garomsa BA, Zapaishchykova A, et al. A generalizable foundation model for analysis of human brain MRI. Nat Neurosci. Published online February 5,

[18] [18]

doi:10.1038/s41593-026-02202-6

work page doi:10.1038/s41593-026-02202-6

[19] [19]

The clinical use of structural MRI in Alzheimer disease

Frisoni GB, Fox NC, Jack CR Jr, et al. The clinical use of structural MRI in Alzheimer disease. Nat Rev Neurol. 2010;6:67-77. doi:10.1038/nrneurol.2009.215

work page doi:10.1038/nrneurol.2009.215 2010

[20] [20]

Imaging biomarkers of dementia: recommended visual rating scales with teaching cases

Wahlund LO, Westman E, van Westen D, et al. Imaging biomarkers of dementia: recommended visual rating scales with teaching cases. Insights Imaging. 2017;8:79-90. doi:10.1007/s13244-016-0521-6

work page doi:10.1007/s13244-016-0521-6 2017

[21] [21]

General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis

Lu S, Chen Y, Chen Y, et al. General lightweight framework for vision foundation model supporting multi-task and multi-center medical image analysis. Nat Commun. 2025;16:2097. doi:10.1038/s41467-025-57427-z

work page doi:10.1038/s41467-025-57427-z 2025

[22] [22]

Attention Is All You Need

Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need. Adv Neural Inf Process Syst. 2017;30. doi:10.48550/arXiv.1706.03762

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1706.03762 2017

[23] [23]

The limits of fair medical imaging AI in real-world generalization

Yang Y, Zhang H, Gichoya JW, et al. The limits of fair medical imaging AI in real-world generalization. Nat Med. 2024;30:2838-2848. doi:10.1038/s41591-024- 03113-4 45

work page doi:10.1038/s41591-024- 2024

[24] [24]

Tackling prediction uncertainty in machine learning for healthcare

Chua M, Kim D, Choi J, et al. Tackling prediction uncertainty in machine learning for healthcare. Nat Biomed Eng. 2023;7:711-718. doi:10.1038/s41551-022- 00988-x

work page doi:10.1038/s41551-022- 2023

[25] [25]

Tailored for real-world: a whole- slide image classification system validated on uncurated multisite data emulating the prospective pathology workload

Ianni JD, Soans RE, Sankarapandian S, et al. Tailored for real-world: a whole- slide image classification system validated on uncurated multisite data emulating the prospective pathology workload. Sci Rep. 2020;10:3217. doi:10.1038/s41598-020- 59985-2

work page doi:10.1038/s41598-020- 2020

[26] [26]

Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models

Young AT, Fernandez K, Pfau J, et al. Stress testing reveals gaps in clinic readiness of image-based diagnostic artificial intelligence models. NPJ Digit Med. 2021;4:10. doi:10.1038/s41746-020-00380-6

work page doi:10.1038/s41746-020-00380-6 2021

[27] [27]

Opportunistic screening: Radiology Scientific Expert Panel

Pickhardt PJ, Summers RM, Garrett JW, et al. Opportunistic screening: Radiology Scientific Expert Panel. Radiology. 2023;307. doi:10.1148/radiol.222044

work page doi:10.1148/radiol.222044 2023

[28] [28]

Medial temporal lobe atrophy is underreported and may have important clinical correlates in medical inpatients

Torisson G, van Westen D, Stavenow L, et al. Medial temporal lobe atrophy is underreported and may have important clinical correlates in medical inpatients. BMC Geriatr. 2015;15:65. doi:10.1186/s12877-015-0066-4

work page doi:10.1186/s12877-015-0066-4 2015

[29] [29]

Radiological reporting of brain atrophy in MRI: real-life comparison between narrative reports, semiquantitative scales, and automated software-based volumetry

Bruno F, Fagotti C, Saltarelli G, et al. Radiological reporting of brain atrophy in MRI: real-life comparison between narrative reports, semiquantitative scales, and automated software-based volumetry. Diagnostics (Basel). 2025;15:1246. doi:10.3390/diagnostics15101246

work page doi:10.3390/diagnostics15101246 2025

[30] [30]

Structural imaging findings on non- enhanced computed tomography are severely underreported in the primary care diagnostic work-up of subjective cognitive decline

Håkansson C, Torisson G, Londos E, et al. Structural imaging findings on non- enhanced computed tomography are severely underreported in the primary care diagnostic work-up of subjective cognitive decline. Neuroradiology. 2019;61:397-404. doi:10.1007/s00234-019-02156-6 46

work page doi:10.1007/s00234-019-02156-6 2019

[31] [31]

Automated opportunistic osteoporotic fracture risk assessment using computed tomography scans to aid in FRAX underutilization

Dagan N, Elnekave E, Barda N, et al. Automated opportunistic osteoporotic fracture risk assessment using computed tomography scans to aid in FRAX underutilization. Nat Med. 2020;26:77-82. doi:10.1038/s41591-019-0720-z

work page doi:10.1038/s41591-019-0720-z 2020

[32] [32]

Incidental coronary artery calcium: opportunistic screening of previous nongated chest computed tomography scans to improve statin rates—the NOTIFY -1 project

Sandhu AT, Rodriguez F, Ngo S, et al. Incidental coronary artery calcium: opportunistic screening of previous nongated chest computed tomography scans to improve statin rates—the NOTIFY -1 project. Circulation. 2023;147:703-714. doi:10.1161/CIRCULATIONAHA.122.062746

work page doi:10.1161/circulationaha.122.062746 2023

[33] [33]

Automated CT biomarkers for opportunistic prediction of future cardiovascular events and mortality in an asymptomatic screening population: a retrospective cohort study

Pickhardt PJ, Graffy PM, Zea R, et al. Automated CT biomarkers for opportunistic prediction of future cardiovascular events and mortality in an asymptomatic screening population: a retrospective cohort study. Lancet Digit Health. 2020;2. doi:10.1016/S2589-7500(20)30025-X

work page doi:10.1016/s2589-7500(20)30025-x 2020

[34] [34]

RadImageNet: an open radiologic deep learning research dataset for effective transfer learning

Mei X, Liu Z, Robson PM, et al. RadImageNet: an open radiologic deep learning research dataset for effective transfer learning. Radiol Artif Intell. 2022;4. doi:10.1148/ryai.210315

work page doi:10.1148/ryai.210315 2022

[35] [35]

Jenkinson M, Beckmann CF, Behrens TE, et al. FSL. Neuroimage. 2012;62:782-790. doi:10.1016/j.neuroimage.2011.09.015

work page doi:10.1016/j.neuroimage.2011.09.015 2012

[36] [36]

Non-linear registration, aka spatial normalisation

Andersson JL, Jenkinson M, Smith S. Non-linear registration, aka spatial normalisation. FMRIB Technical Report TR07JA2. FMRIB Analysis Group, University of Oxford; 2007. Accessed [June, 2026]. https://www.fmrib.ox.ac.uk/datasets/techrep/tr07ja2/tr07ja2.pdf

2007

[37] [37]

Assessing and tuning brain decoders: cross-validation, caveats, and guidelines

Varoquaux G, Raamana PR, Engemann DA, et al. Assessing and tuning brain decoders: cross-validation, caveats, and guidelines. Neuroimage. 2017;145:166-179. doi:10.1016/j.neuroimage.2016.10.038 47

work page doi:10.1016/j.neuroimage.2016.10.038 2017

[38] [38]

A guide to cross-validation for artificial intelligence in medical imaging

Bradshaw TJ, Huemann Z, Hu J, Rahmim A. A guide to cross-validation for artificial intelligence in medical imaging. Radiol Artif Intell. 2023;5. doi:10.1148/ryai.220232

work page doi:10.1148/ryai.220232 2023

[39] [39]

Monte Carlo cross-validation for a study with binary outcome and limited sample size

Shan G. Monte Carlo cross-validation for a study with binary outcome and limited sample size. BMC Med Inform Decis Mak. 2022;22:270. doi:10.1186/s12911- 022-02016-z

work page doi:10.1186/s12911- 2022

[40] [40]

Few-shot deployment of pretrained MRI transformers in brain imaging tasks

Li M, Shen G, Farris CW, Zhang X. Few-shot deployment of pretrained MRI transformers in brain imaging tasks. Front Artif Intell. 2026;9:1771088. doi:10.3389/frai.2026.1771088

work page doi:10.3389/frai.2026.1771088 2026

[41] [41]

Decoupled Weight Decay Regularization

Loshchilov I, Hutter F. Decoupled weight decay regularization. In: International Conference on Learning Representations; 2019. doi:10.48550/arXiv.1711.05101

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1711.05101 2019

[42] [42]

Image quality assessment: from error visibility to structural similarity

Wang Z, Bovik AC, Sheikh HR, Simoncelli EP. Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Process. 2004;13:600-

2004

[43] [43]

doi:10.1109/TIP.2003.819861

work page doi:10.1109/tip.2003.819861 2003

[44] [44]

Mean squared error: love it or leave it? A new look at signal fidelity measures

Wang Z, Bovik AC. Mean squared error: love it or leave it? A new look at signal fidelity measures. IEEE Signal Process Mag. 2009;26:98-117. doi:10.1109/MSP.2008.930649

work page doi:10.1109/msp.2008.930649 2009

[45] [45]

Scope of validity of PSNR in image/video quality assessment

Huynh-Thu Q, Ghanbari M. Scope of validity of PSNR in image/video quality assessment. Electron Lett. 2008;44:800-801. doi:10.1049/el:20080522

work page doi:10.1049/el:20080522 2008

[46] [46]

Densely connected convolutional networks

Huang G, Liu Z, van der Maaten L, Weinberger KQ. Densely connected convolutional networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017:4700-4708. doi:10.1109/CVPR.2017.243 48

work page doi:10.1109/cvpr.2017.243 2017

[47] [47]

MedViT: a robust vision transformer for generalized medical image classification

Manzari ON, Ahmadabadi H, Kashiani H, et al. MedViT: a robust vision transformer for generalized medical image classification. Comput Biol Med. 2023;157:106791. doi:10.1016/j.compbiomed.2023.106791

work page doi:10.1016/j.compbiomed.2023.106791 2023

[48] [48]

Deep residual learning for image recognition

He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016:770-778. doi:10.1109/CVPR.2016.90

work page doi:10.1109/cvpr.2016.90 2016

[49] [49]

Can Spatiotemporal 3D CNNs Retrace the History of 2D CNNs and ImageNet?

Hara K, Kataoka H, Satoh Y. Can spatiotemporal 3D CNNs retrace the history of 2D CNNs and ImageNet? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018:6546-6555. doi:10.48550/arXiv.1711.09577

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1711.09577 2018

[50] [50]

Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study

Zech JR, Badgeley MA, Liu M, et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: a cross-sectional study. PLoS Med. 2018;15. doi:10.1371/journal.pmed.1002683

work page doi:10.1371/journal.pmed.1002683 2018

[51] [51]

Second opinion needed: communicating uncertainty in medical machine learning

Kompa B, Snoek J, Beam AL. Second opinion needed: communicating uncertainty in medical machine learning. NPJ Digit Med. 2021;4:4. doi:10.1038/s41746-020-00367-3

work page doi:10.1038/s41746-020-00367-3 2021

[52] [52]

Effects of artificial intelligence implementation on efficiency in medical imaging: a systematic literature review and meta-analysis

Wenderott K, Krups J, Zaruchas F, Weigl M. Effects of artificial intelligence implementation on efficiency in medical imaging: a systematic literature review and meta-analysis. NPJ Digit Med. 2024;7:265. doi:10.1038/s41746-024-01248-9

work page doi:10.1038/s41746-024-01248-9 2024

[53] [53]

Structural magnetic resonance imaging in the practical assessment of dementia: beyond exclusion

Scheltens P, Fox N, Barkhof F, De Carli C. Structural magnetic resonance imaging in the practical assessment of dementia: beyond exclusion. Lancet Neurol. 2002;1:13-21. doi:10.1016/S1474-4422(02)00002-9

work page doi:10.1016/s1474-4422(02)00002-9 2002