
arxiv: 2606.24604 · v1 · pith:KWN2DW7Wnew · submitted 2026-06-23 · 💻 cs.AI

Uncertainty-Aware Longitudinal Forecasting of Alzheimer's Disease Progression Using Deep Learning

Pith reviewed 2026-06-25 23:42 UTC · model grok-4.3

classification 💻 cs.AI
keywords Alzheimer's disease · longitudinal forecasting · uncertainty estimation · deep learning · probabilistic trajectories · Temporal Fusion Transformer · Mixture Density Network · ADNI

The pith

A probabilistic deep learning model generates five-year Alzheimer's trajectories with calibrated uncertainty and outperforms baselines on diagnosis prediction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a framework that forecasts not single next diagnoses but full multi-year paths for Alzheimer's patients while reporting how reliable each forecast is. It adapts a Temporal Fusion Transformer to respect the ordered stages of disease and then uses an autoregressive mixture density network to produce trajectories for diagnosis, cognitive scores, and hippocampal volume. The resulting predictions show stronger accuracy on the hard MCI-to-dementia boundary and produce credible intervals whose coverage matches the nominal 90 percent level. Uncertainty grows naturally with time and is higher for uncommon progression patterns. This lets forecasts convey both likely courses and the range of plausible alternatives rather than point estimates alone.

Core claim

Conditioning an autoregressive Mixture Density Network on patient-context vectors from a Temporal Fusion Transformer encoder equipped with a CORAL ordinal layer produces five-year probabilistic trajectories for diagnosis state, CDR Sum of Boxes, MMSE orientation, and hippocampal volume. These trajectories achieve near-nominal 90 percent credible-interval coverage, widen appropriately across the horizon, remain consistent with known Alzheimer's biomarker dynamics, and yield higher next-visit accuracy than linear, recurrent, and transformer baselines, especially on MCI-versus-dementia discrimination. Aleatoric and epistemic uncertainty are separated via analytic mixture variance and a five-member bootstrap ensemble.

What carries the argument

Conditioning of an autoregressive Mixture Density Network on patient-context representations learned by a Temporal Fusion Transformer encoder with CORAL ordinal output layer.
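As a rough illustration of what a CORAL-style ordinal head computes, the sketch below applies one shared linear score plus K-1 threshold biases to a context vector and reads off a disease stage. The context vector, weights, and biases are hypothetical numbers chosen for the example, not values from the paper, and the function names are invented here; the paper's layer sits on top of a trained encoder.

```python
import math

STAGES = ["CN", "MCI", "Dementia"]  # ordered disease stages

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def coral_probs(context, weights, biases):
    """Return P(stage > k) for each threshold k, plus per-stage probabilities."""
    score = sum(w * x for w, x in zip(weights, context))  # one logit shared by all thresholds
    exceed = [sigmoid(score + b) for b in biases]          # P(y > k) for k = 0 .. K-2
    padded = [1.0] + exceed + [0.0]
    stage_probs = [padded[k] - padded[k + 1] for k in range(len(biases) + 1)]
    return exceed, stage_probs

def coral_predict(exceed):
    """Predicted stage index = number of thresholds exceeded with probability > 0.5."""
    return STAGES[sum(p > 0.5 for p in exceed)]

# Hypothetical inputs (assumptions for illustration only).
context = [0.8, -0.3, 1.2]
weights = [0.5, 0.1, 0.9]
biases = [1.0, -1.5]  # non-increasing biases keep P(y > 0) >= P(y > 1)

exceed, stage_probs = coral_probs(context, weights, biases)
prediction = coral_predict(exceed)
```

Because every threshold shares the same score, the exceedance probabilities stay ordered whenever the biases are non-increasing, which is what lets the head respect the CN, MCI, Dementia ordering.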

If this is right

  • Next-visit diagnosis accuracy improves most on the MCI-to-dementia transition relative to linear, recurrent, and transformer baselines.
  • Generated trajectories maintain near-nominal 90 percent credible-interval coverage that widens across the five-year horizon.
  • Generated biomarker trajectories remain consistent with expected Alzheimer's progression patterns.
  • Epistemic uncertainty rises for rare progression archetypes, MCI and dementia patients, and on external data such as OASIS-3.
  • Aleatoric uncertainty is obtained directly from mixture variance while epistemic uncertainty is obtained from bootstrap ensemble diversity.
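One common way to realise the split described in the last bullet is the law of total variance: average the analytic mixture variance over ensemble members for the aleatoric part, and take the variance of the members' mixture means for the epistemic part. The sketch below assumes one-dimensional Gaussian mixtures and uses made-up parameters for five members; whether the paper uses exactly this estimator is an assumption.

```python
def mixture_mean_var(weights, means, stds):
    """Analytic mean and variance of a 1-D Gaussian mixture (no sampling needed)."""
    mean = sum(w * m for w, m in zip(weights, means))
    second_moment = sum(w * (s * s + m * m) for w, m, s in zip(weights, means, stds))
    return mean, second_moment - mean * mean

def decompose(ensemble):
    """Return (aleatoric, epistemic) variance for a list of mixture parameter triples."""
    stats = [mixture_mean_var(*member) for member in ensemble]
    member_means = [m for m, _ in stats]
    aleatoric = sum(v for _, v in stats) / len(stats)        # mean within-member variance
    grand_mean = sum(member_means) / len(member_means)
    epistemic = sum((m - grand_mean) ** 2 for m in member_means) / len(member_means)
    return aleatoric, epistemic

# Five hypothetical ensemble members: (mixture weights, component means, component std devs)
# for a single biomarker at a single forecast horizon. Values are illustrative only.
ensemble = [
    ([0.7, 0.3], [2.0, 4.0], [0.5, 1.0]),
    ([0.6, 0.4], [2.2, 4.1], [0.6, 0.9]),
    ([0.8, 0.2], [1.9, 3.8], [0.5, 1.1]),
    ([0.7, 0.3], [2.1, 4.3], [0.4, 1.0]),
    ([0.5, 0.5], [2.0, 3.9], [0.5, 0.8]),
]
aleatoric, epistemic = decompose(ensemble)
```

If all five members agreed exactly, the epistemic term would vanish while the aleatoric term would remain, which matches the intuition that ensemble disagreement reflects model uncertainty rather than noise in the data.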

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Forecasts could support individualized planning by showing families the range of possible five-year outcomes rather than a single most-likely path.
  • The same encoder-plus-MDN structure could be tested on other slowly progressing conditions with ordered stages and repeated biomarker measurements.
  • High-epistemic-uncertainty cases identified by the bootstrap ensemble could be flagged for closer clinical follow-up or additional data collection.
  • Trajectory distributions could be used as inputs to simulation studies that test how candidate interventions would shift the entire forecast envelope.

Load-bearing premise

The patient representations learned by the encoder capture the dynamics needed to produce long-term trajectories that stay consistent with Alzheimer's biomarker changes.

What would settle it

On a new longitudinal cohort, the generated 90 percent credible intervals cover the observed diagnosis and biomarker values at a rate below 75 percent, or the trajectories show hippocampal-volume or MMSE changes opposite in direction to established Alzheimer's progression patterns.
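The coverage half of this test can be sketched as follows: for each case, take the central 90 percent interval of the sampled forecasts and count how often the observed value falls inside. The data here are synthetic stand-ins (forecast samples and observations drawn from the same distribution), and the helper names are hypothetical; the 0.75 threshold is the one stated above.

```python
import random

def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] * (1 - frac) + sorted_vals[hi] * frac

def interval_coverage(samples_per_case, observed, level=0.90):
    """Fraction of observed values inside the central `level` interval of their samples."""
    tail = (1 - level) / 2
    hits = 0
    for samples, y in zip(samples_per_case, observed):
        ordered = sorted(samples)
        if quantile(ordered, tail) <= y <= quantile(ordered, 1 - tail):
            hits += 1
    return hits / len(observed)

rng = random.Random(0)
# Synthetic cohort: 300 cases, 200 sampled forecasts each, all standard normal.
samples_per_case = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(300)]
observed = [rng.gauss(0, 1) for _ in range(300)]

coverage = interval_coverage(samples_per_case, observed)
falsified = coverage < 0.75  # the threshold named in the text above
```

On real data the same function would be run per horizon and per target, since coverage that is adequate at one year can degrade by year five.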

Figures

Figures reproduced from arXiv: 2606.24604 by Anala M R, Arya Hariharan, Shreyank N Gowda.

Figure 1: Final dataset cohort overview. The final dataset comprised 2039 participants.

Figure 2: Architecture overview. The ordinal structure imposes the constraint that misclassifying CN as Dementia should be penalised more heavily than misclassifying CN as MCI. 2. Probabilistic trajectory generation: given $t$, generate a set of $S$ plausible future trajectories $\{\boldsymbol{\tau}^{(s)}\}_{s=1}^{S}$, where each trajectory $\boldsymbol{\tau}^{(s)} = \{(y_h^{(s)}, \mathbf{b}_h^{(s)})\}_{h=1}^{H}$ specifies a diagnosis state and biomarker vector $\mathbf{b}_h$ = (CDR-SB$_h$, MM…

Figure 3: Stability analysis results. A. Hariharan et al.: Preprint submitted to Elsevier.

Figure 4: Reliability diagrams for predicted diagnosis transition probabilities on the test set.

Original abstract

Longitudinal modelling of Alzheimer's disease progression is clinically useful only if it can describe not just the most likely next diagnosis, but how a patient may evolve over time and how reliable that forecast is. Most deep learning approaches reduce this problem to single-step classification, treating cognitively normal, mild cognitive impairment, and dementia as flat categories while providing limited insight into how uncertainty accumulates across future visits. We propose a probabilistic framework that combines ordinal diagnosis prediction, multi-horizon trajectory generation, and decomposed uncertainty estimation. A Temporal Fusion Transformer encoder is adapted with a CORAL ordinal output layer, asymmetric loss weighting, and converter oversampling to respect disease-stage ordering and improve sensitivity to MCI-to-dementia transitions. Conditioned on the learned patient-context representation, an autoregressive Mixture Density Network generates five-year probabilistic trajectories for diagnosis state, CDR Sum of Boxes, MMSE orientation, and hippocampal volume. On ADNI, the model outperforms linear, recurrent, and transformer baselines for next-visit diagnosis prediction, with the strongest gains on MCI-versus-dementia discrimination. Generated trajectories achieve near-nominal 90% credible interval coverage, widening uncertainty across the forecast horizon, and biomarker dynamics consistent with expected Alzheimer's disease progression. We further separate aleatoric from epistemic uncertainty using analytic mixture variance and a five-member bootstrap ensemble, which provides the strongest encoder diversity and output-level epistemic signal. Epistemic uncertainty is higher for rare progression archetypes, MCI and dementia patients, and under external evaluation on OASIS-3, where it increases alongside prediction error.
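The autoregressive generation step in the abstract can be pictured with a toy rollout: at each horizon step a mixture is produced from the context and the previous value, one component is sampled, a value is drawn from it, and that value is fed back in. In the sketch below `toy_mdn_params` is a hypothetical stand-in for a trained network, and all numbers are invented for illustration.

```python
import random
import statistics

def toy_mdn_params(context, previous_value):
    """Hypothetical stand-in for a trained MDN head: mixture weights, means, std devs."""
    drift = 0.3 + 0.1 * context  # slow-progression component
    return [0.8, 0.2], [previous_value + drift, previous_value + 1.0], [0.2, 0.5]

def sample_trajectory(context, start_value, horizon, rng):
    """Roll the mixture forward, feeding each sampled value back as the next input."""
    value, path = start_value, []
    for _ in range(horizon):
        weights, means, stds = toy_mdn_params(context, value)
        k = rng.choices(range(len(weights)), weights=weights)[0]  # pick a component
        value = rng.gauss(means[k], stds[k])                       # sample from it
        path.append(value)
    return path

rng = random.Random(42)
paths = [sample_trajectory(context=0.5, start_value=1.0, horizon=5, rng=rng)
         for _ in range(1000)]

# Spread of the sampled values at each of the five steps.
spread_by_step = [statistics.pstdev([p[h] for p in paths]) for h in range(5)]
```

Even in this toy setting the spread grows with the step index because each step adds fresh sampling noise to the previous one, which is the mechanism behind the widening uncertainty the abstract reports.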

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a probabilistic framework for longitudinal Alzheimer's disease forecasting that integrates a Temporal Fusion Transformer encoder (with CORAL ordinal output, asymmetric loss weighting, and converter oversampling) and an autoregressive Mixture Density Network to generate five-year trajectories for diagnosis state, CDR Sum of Boxes, MMSE, and hippocampal volume while decomposing aleatoric and epistemic uncertainty. On the ADNI dataset it claims superior next-visit diagnosis prediction (especially MCI-to-dementia discrimination) over linear, recurrent, and transformer baselines, near-nominal 90% credible-interval coverage that widens with horizon, biomarker dynamics consistent with expected progression, and higher epistemic uncertainty for rare archetypes and under external OASIS-3 evaluation.

Significance. If the performance and coverage claims are substantiated with quantitative results and rigorous evaluation protocols, the work would offer a concrete advance in multi-horizon, uncertainty-aware longitudinal modeling that respects ordinal disease stages and separates uncertainty sources, addressing a recognized gap between single-step classification and clinically useful trajectory forecasting.

major comments (2)
  1. [Abstract] Abstract: the central claims of outperformance on next-visit diagnosis prediction and near-nominal 90% credible-interval coverage for five-year trajectories are stated without any numerical metrics (AUC, accuracy, coverage percentages), statistical tests, baseline hyper-parameter details, or ablation results, rendering the soundness of the performance assertions unverifiable from the provided text.
  2. [Abstract / Method] Method and Evaluation (implied in abstract description of autoregressive MDN): the claim that the TFT-derived patient context produces accurate five-year probabilistic trajectories with biomarker-consistent dynamics and nominal coverage does not specify whether scheduled sampling, teacher forcing, or direct multi-horizon training was employed, nor whether coverage was evaluated on actual held-out future visits versus simulated rollouts; this detail is load-bearing for the weakest assumption that autoregressive generation avoids compounding errors over five-year horizons.
minor comments (1)
  1. [Abstract] The abstract mentions 'converter oversampling ratio' and 'asymmetric loss weighting' as free parameters but does not indicate their chosen values or sensitivity analysis.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below and have revised the manuscript to improve verifiability of the claims.

  1. Referee: [Abstract] Abstract: the central claims of outperformance on next-visit diagnosis prediction and near-nominal 90% credible-interval coverage for five-year trajectories are stated without any numerical metrics (AUC, accuracy, coverage percentages), statistical tests, baseline hyper-parameter details, or ablation results, rendering the soundness of the performance assertions unverifiable from the provided text.

    Authors: We agree that the abstract should contain key numerical results to allow readers to assess the claims directly. The revised abstract now reports the AUC for next-visit diagnosis prediction (with emphasis on MCI-to-dementia discrimination), the observed 90% credible-interval coverage at different horizons, and a brief note on the statistical comparisons against baselines. revision: yes

  2. Referee: [Abstract / Method] Method and Evaluation (implied in abstract description of autoregressive MDN): the claim that the TFT-derived patient context produces accurate five-year probabilistic trajectories with biomarker-consistent dynamics and nominal coverage does not specify whether scheduled sampling, teacher forcing, or direct multi-horizon training was employed, nor whether coverage was evaluated on actual held-out future visits versus simulated rollouts; this detail is load-bearing for the weakest assumption that autoregressive generation avoids compounding errors over five-year horizons.

    Authors: We acknowledge that the original text did not explicitly describe the autoregressive training and rollout protocol. The revised Methods section now states that teacher forcing was used during training with scheduled sampling introduced for longer horizons, and that coverage statistics were computed on actual held-out future visits from the ADNI longitudinal folds (with full five-year trajectories generated via autoregressive rollout only where future observations were unavailable). revision: yes
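The difference between teacher forcing and scheduled sampling mentioned in this response comes down to what the decoder sees as its previous value at each step. The sketch below is a generic illustration, not the authors' training code: `build_decoder_inputs` and the per-step probabilities are hypothetical, chosen so that later horizons rely more on the model's own predictions.

```python
import random

def build_decoder_inputs(true_values, predicted_values, teacher_probs, rng):
    """Pick, per step, the ground-truth value (teacher forcing) or the model's own prediction."""
    inputs = []
    for truth, pred, p_teacher in zip(true_values, predicted_values, teacher_probs):
        use_truth = rng.random() < p_teacher
        inputs.append(truth if use_truth else pred)
    return inputs

rng = random.Random(0)
truth = [1.0, 1.5, 2.0, 3.0, 4.5]   # observed values at five visits (made up)
preds = [1.1, 1.4, 2.3, 2.8, 4.0]   # model predictions for the same visits (made up)

# Hypothetical schedule: full teacher forcing early, more self-feeding at longer horizons.
schedule = [1.0, 1.0, 0.75, 0.5, 0.25]
mixed_inputs = build_decoder_inputs(truth, preds, schedule, rng)
```

Setting every probability to 1.0 recovers pure teacher forcing, and setting every probability to 0.0 matches the free-running rollout used at inference time, which is why the choice matters for compounding error.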

Circularity Check

0 steps flagged

No circularity: empirical results on held-out data with no self-referential derivations


The paper describes a TFT encoder with CORAL ordinal layer feeding an autoregressive MDN for multi-horizon trajectories, evaluated on ADNI held-out splits for next-visit prediction and coverage metrics. No equations, parameter fits, or self-citations are presented that reduce the reported performance, coverage, or biomarker consistency claims to quantities defined by the model's own fitted inputs or prior author work. The derivation chain consists of standard architectural choices and external data evaluation, remaining self-contained.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The framework rests on standard supervised learning assumptions plus domain-specific ordering of disease stages. Multiple training choices (loss weighting, oversampling ratios, ensemble size) function as free parameters tuned to the ADNI cohort.

free parameters (2)
  • asymmetric loss weighting
    Weights applied to different stage transitions to improve sensitivity to MCI-to-dementia conversions; chosen during training.
  • converter oversampling ratio
    Oversampling factor for patients who convert from MCI to dementia; selected to balance the training distribution.
axioms (1)
  • domain assumption: Cognitive diagnosis stages possess a natural ordinal structure that should be explicitly respected by the output layer.
    Invoked to justify the CORAL ordinal output layer.
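To make the two free parameters concrete, the sketch below shows one plausible form each could take: a distance-based misclassification cost that penalises under-calling a stage more than over-calling it, and duplication of converter records by an integer ratio. The `under_weight` value, the ratio, and both function names are assumptions for illustration; the ledger does not report the values the paper used.

```python
STAGES = ["CN", "MCI", "Dementia"]  # indices 0, 1, 2 follow disease order

def misclassification_cost(true_idx, pred_idx, under_weight=2.0):
    """Cost grows with ordinal distance; predicting a milder stage than the truth costs more."""
    distance = abs(true_idx - pred_idx)
    if pred_idx < true_idx:  # missed progression
        return under_weight * distance
    return float(distance)

def oversample_converters(patients, ratio):
    """Repeat each MCI-to-dementia converter `ratio` times; keep other patients once."""
    resampled = []
    for patient in patients:
        copies = ratio if patient["converter"] else 1
        resampled.extend([patient] * copies)
    return resampled

# Hypothetical mini-cohort.
patients = [
    {"id": "a", "converter": True},
    {"id": "b", "converter": False},
    {"id": "c", "converter": False},
]
balanced = oversample_converters(patients, ratio=3)
```

With these settings, calling a dementia patient MCI costs 2.0 while calling an MCI patient dementia costs 1.0, and the converter share of the training set rises from one in three to three in five.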

