arxiv: 2605.00708 · v1 · submitted 2026-05-01 · 💻 cs.LG

Deep Kernel Learning for Stratifying Glaucoma Trajectories

Bruce Rushing , Angela Danquah , Alireza Namazi , Arjun Dirghangi , Heman Shakeri

Pith reviewed 2026-05-09 19:28 UTC · model grok-4.3

keywords: deep kernel learning, glaucoma, patient stratification, electronic health records, Gaussian processes, disease progression, trajectory modeling, clinical decision support

The pith

A deep kernel learning model on EHR data stratifies glaucoma patients into subgroups by learning progression trajectories separate from current severity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a method to identify high-risk glaucoma patients from sparse and irregular electronic health records by modeling their disease trajectories over time. It introduces a deep kernel learning setup that uses Gaussian processes with a transformer feature extractor built on clinical-BERT embeddings. If the approach works, clinicians could spot patients whose condition is actively worsening even when their measured visual acuity looks better than that of patients whose impairment is stable. This matters because current tools often conflate how bad the disease is right now with how fast it is advancing, limiting the ability to direct interventions where they will have the most effect.

Core claim

The central claim is that the deep kernel learning architecture successfully identifies three clinically distinct patient subgroups from multimodal EHR data. Crucially, the model decouples disease progression from current severity, identifying a high-risk group with a worsening trajectory despite having better average visual acuity than a second, stably poor group. This shows the model has learned to identify progression risk rather than simply reflecting the current disease state.
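The distinction between current severity and progression can be illustrated with a small worked example. This is only a sketch under stated assumptions: the two logMAR series below are synthetic, the helper `mean_and_slope` is hypothetical, and an ordinary least-squares slope is a simple stand-in for "progression", not the paper's method.

```python
def mean_and_slope(times, values):
    """Return (mean level, least-squares slope) of values against times."""
    n = len(times)
    t_bar = sum(times) / n
    v_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (v - v_bar) for t, v in zip(times, values))
    return v_bar, sxy / sxx

# Synthetic logMAR series (higher = worse); irregular visit times in years.
worsening = mean_and_slope([0.0, 0.5, 1.5, 3.0], [0.20, 0.25, 0.35, 0.50])
stable_poor = mean_and_slope([0.0, 1.0, 2.5, 3.0], [0.90, 0.90, 0.90, 0.90])

# The worsening patient has the better average acuity but the steeper slope.
print("worsening   mean=%.3f slope=%.3f" % worsening)
print("stable poor mean=%.3f slope=%.3f" % stable_poor)
```

A ranking by mean level and a ranking by slope disagree on these two synthetic patients, which is the kind of disagreement the decoupling claim depends on.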

What carries the argument

The deep kernel learning (DKL) architecture with a Gaussian Process backend whose kernel is defined by a transformer-based feature extractor applied to clinical-BERT embeddings of the multimodal EHR data.
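A minimal sketch of the deep-kernel idea may help: a standard base kernel (here an RBF) is evaluated on learned features g(x) instead of raw inputs. Everything below is illustrative. The one-layer `feature_extractor`, the weight matrix `W`, and the three toy input vectors are hypothetical stand-ins for the paper's transformer and clinical-BERT embeddings, whose details the summary does not give.

```python
import math

def feature_extractor(x, weights):
    """Hypothetical stand-in for the transformer: a single tanh layer."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def deep_rbf_kernel(x1, x2, weights, lengthscale=1.0):
    """RBF kernel evaluated on features g(x) rather than on raw inputs."""
    g1 = feature_extractor(x1, weights)
    g2 = feature_extractor(x2, weights)
    sq_dist = sum((a - b) ** 2 for a, b in zip(g1, g2))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

# Illustrative values only: 3-dim "embeddings" mapped to 2 latent features.
W = [[0.5, -0.2, 0.1],
     [0.3, 0.8, -0.5]]
a = [1.0, 0.0, 2.0]
b = [0.9, 0.1, 2.1]   # close to a
c = [-2.0, 1.5, 0.0]  # far from a

k_ab = deep_rbf_kernel(a, b, W)
k_ac = deep_rbf_kernel(a, c, W)
print(k_ab, k_ac)  # similar inputs receive the higher kernel value
```

In deep kernel learning as usually described, the extractor weights and the kernel hyperparameters are fit jointly through the GP objective; the sketch fixes them by hand to keep the example short.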

If this is right

Clinicians gain a decision-support tool that can flag high-risk patients for targeted interventions even when current visual acuity measurements appear relatively good.
Glaucoma management can shift from reacting to current severity toward anticipating and altering progression trajectories.
The same architecture could be applied to other chronic conditions where EHR data are sparse and irregularly sampled.
Risk stratification becomes feasible without requiring dense, regularly timed clinical measurements.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The decoupling of progression from severity might allow earlier identification of patients who need aggressive treatment before their measured function declines sharply.
Extending the model to incorporate imaging or genetic data could test whether the subgroups remain distinct and clinically useful.
If the three subgroups prove stable across different hospitals or populations, they could serve as a basis for personalized follow-up schedules.

Load-bearing premise

The identified subgroups must reflect genuine, generalizable differences in progression risk rather than artifacts of the specific dataset or modeling choices.

What would settle it

New longitudinal data showing that patients placed in the high-risk subgroup do not actually experience faster disease progression than those in the other subgroups.

Figures

Figures reproduced from arXiv: 2605.00708 by Alireza Namazi, Angela Danquah, Arjun Dirghangi, Bruce Rushing, Heman Shakeri.

**Figure 1.** Deep Kernel Learning Transformer Pipeline for Disease Trajectory Maps. (a) The transformer architecture processes multimodal EHR data through Clinical-BERT embeddings and structured feature extraction. (b) Agglomerative clustering with Ward linkage is applied to latent representations to identify distinct patient trajectories, enabling population-level analysis of clinical patterns and outcomes. Model sel…

**Figure 2.** Latent space visualizations of patient trajectory clustering. (a) Direct latent space visualization demonstrates nonlinear progression patterns along a curved manifold structure with clear separation between three disease progression archetypes. (b) UMAP dimensionality reduction confirms three unique clinical trajectories rather than a continuous spectrum of disease progression. We performed data preproces…

**Figure 3.** Clinical trajectory analysis and model interpretability. (a) Posterior predictive mean trajectories from DKL transformer demonstrate three distinct patient archetypes with characteristic visual acuity patterns. Mean and standard deviation are conditioned on entire cluster. (b) SHAP analysis reveals surgery-related features and specialty codes as primary drivers of model predictions. (NLP, LP, HM, CF). AC…

**Figure 4.** Posterior predictive mean (4(a)) and variance (4(b)) trajectories with respect to DKL transformer latent dimensions, renormalized on 3,821 patients. The Z-axis represents mean in logMAR units where higher indicates worse glaucoma. [Plot text: panel title "3D Trajectories by Cluster (Mean)"; axes Z1, Z2, Mean; legend Cluster 0 (n=821), Cluster 1 (n=1500 sampled), Cluster 2 (n=1500 sa…]

**Figure 5.** Posterior predictive mean (5(a)) and variance (5(b)) trajectories with respect to DKL transformer latent dimensions, renormalized on 3,821 patients. The Z-axis represents mean in logMAR units where higher indicates worse glaucoma. Each trajectory represents a patient and three clusters represent three distinct patient trajectories, with purple indicating worst, blue indicating moderate, and yellow indicati…
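The Figure 1(b) caption mentions agglomerative clustering with Ward linkage on latent representations. The sketch below shows what that merge rule does, using a naive pure-Python implementation. The function `ward_cluster` and the nine 2-D points are hypothetical; the paper's own implementation and latent dimensionality are not specified here.

```python
def ward_cluster(points, n_clusters):
    """Naive agglomerative clustering using Ward's merge criterion."""
    # Each cluster is (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (mi, ci), (mj, cj) = clusters[i], clusters[j]
                ni, nj = len(mi), len(mj)
                d2 = sum((a - b) ** 2 for a, b in zip(ci, cj))
                # Increase in within-cluster sum of squares if i and j merge.
                cost = ni * nj / (ni + nj) * d2
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (mi, ci), (mj, cj) = clusters[i], clusters[j]
        ni, nj = len(mi), len(mj)
        centroid = [(ni * a + nj * b) / (ni + nj) for a, b in zip(ci, cj)]
        clusters[i] = (mi + mj, centroid)
        del clusters[j]
    labels = [0] * len(points)
    for label, (members, _) in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels

# Hypothetical 2-D latent coordinates forming three loose groups.
latent = [(0.10, 0.10), (0.15, 0.12), (0.12, 0.08),
          (0.50, 0.55), (0.52, 0.50), (0.48, 0.53),
          (0.90, 0.20), (0.88, 0.25), (0.92, 0.22)]
labels = ward_cluster(latent, n_clusters=3)
print(labels)
```

This version searches all cluster pairs at every merge, so it is only suitable for toy inputs; for a cohort of thousands of patients a library routine such as scikit-learn's `AgglomerativeClustering(linkage="ward")` would be the usual choice.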

Original abstract

Effectively stratifying patient risk in chronic diseases like glaucoma is a major clinical challenge. Clinicians need tools to identify patients at high risk of progression from sparse and irregularly-sampled electronic health records (EHRs). We propose a novel deep kernel learning (DKL) architecture that leverages a Gaussian Process (GP) backend. The GP's kernel is defined by a transformer-based feature extractor applied to clinical-BERT embeddings to model glaucoma patient trajectories from multimodal EHR data. Our method successfully identifies three clinically distinct patient subgroups. Crucially, the model learns to decouple disease progression from current severity, identifying a high-risk group with a worsening trajectory despite having better average visual acuity than a second, stably poor group. This reveals that the model learns to identify progression risk rather than just the current disease state. This ability to stratify patients based on their risk trajectory progression offers a powerful tool for clinical decision support, enabling targeted interventions for high-risk individuals and improving the management of glaucoma care.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

The DKL setup with clinical-BERT transformer features into a GP is a reasonable way to handle sparse glaucoma EHR trajectories, but the subgroup decoupling claim lacks the temporal hold-out validation needed to rule out reconstruction artifacts.

The paper puts forward a deep kernel learning model that runs a transformer feature extractor over clinical-BERT embeddings and feeds the result into a Gaussian process to model multimodal EHR trajectories in glaucoma. The main claim is that this identifies three patient subgroups and, more importantly, separates progression risk from current disease severity, spotting a high-risk group that worsens despite better average visual acuity than a stable but poor group.

That framing of the clinical problem is clear and the architecture choice makes sense for irregular, sparse longitudinal data where standard RNNs or transformers alone can struggle with uncertainty quantification. The GP backend gives a natural way to handle missing observations and produce trajectory predictions with uncertainty, which is useful in this domain. The combination of BERT embeddings with DKL for this specific ophthalmology task is new enough on its own terms.

What is missing is any quantitative support. The abstract contains no performance numbers, no baseline comparisons, no statistical tests, and no description of how the subgroups were derived or validated. The stress-test concern is on point: without a strict temporal split (train on data before a cutoff, evaluate predicted trajectories after it) or an external cohort, the reported decoupling of progression from severity can be explained by the model simply fitting patterns already present in the training records rather than learning genuine prospective risk. If the full manuscript does not include those hold-out experiments or at least cross-validation that respects time, the headline result stays under-supported.

This work is aimed at researchers who build or apply machine learning to chronic disease management in ophthalmology and similar fields. A reader working on longitudinal EHR modeling would get value from the architecture description and the clinical motivation, even if the results section needs strengthening. It is coherent enough and addresses a real gap, so it deserves a serious referee who can ask for the missing validation steps rather than a desk reject.
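The letter's point about the GP backend handling irregular observations and reporting uncertainty can be made concrete with a small sketch of zero-mean GP regression. The visit times, logMAR values, kernel settings, and function names below are all hypothetical, and the plain RBF kernel over time is a simplification: the paper's kernel acts on learned features.

```python
import math

def rbf(t1, t2, lengthscale=1.0, variance=1.0):
    return variance * math.exp(-0.5 * (t1 - t2) ** 2 / lengthscale ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A (A symmetric positive definite)."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_chol(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(b)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(train_t, train_y, query_t, noise=0.01):
    """Posterior mean and variance of the latent function at one query time."""
    n = len(train_t)
    K = [[rbf(train_t[i], train_t[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    L = cholesky(K)
    k_star = [rbf(query_t, t) for t in train_t]
    alpha = solve_chol(L, train_y)
    v = solve_chol(L, k_star)
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(query_t, query_t) - sum(k * vi for k, vi in zip(k_star, v))
    return mean, var

# Hypothetical patient: four visits at irregular times (years), logMAR values.
visit_years = [0.0, 0.4, 1.7, 3.1]
logmar = [0.20, 0.22, 0.35, 0.50]
near = gp_posterior(visit_years, logmar, query_t=1.5)  # close to a visit
far = gp_posterior(visit_years, logmar, query_t=8.0)   # far beyond the data
print(near, far)
```

No resampling onto a regular grid is needed, and the posterior variance grows back toward the prior variance as the query time moves away from the observed visits. With a zero-mean prior the predicted mean also reverts to zero far from the data, so in practice the targets would typically be centered first.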

Referee Report

2 major / 2 minor

Summary. The paper proposes a deep kernel learning (DKL) architecture that uses a transformer-based feature extractor applied to clinical-BERT embeddings to define the kernel of a Gaussian Process (GP) backend, with the goal of modeling glaucoma patient trajectories from sparse and irregularly sampled multimodal EHR data. It claims to identify three clinically distinct patient subgroups and, crucially, to decouple disease progression from current severity by isolating a high-risk subgroup that exhibits a worsening trajectory despite better average visual acuity than a second, stably poor group.

Significance. If the subgroup identification and decoupling claims hold under rigorous validation, the work could provide a useful approach for risk stratification in glaucoma using real-world EHR data, potentially supporting targeted clinical interventions. The combination of DKL with GP and transformer embeddings on clinical text is a reasonable direction for handling irregular longitudinal data, but the absence of any quantitative metrics, baselines, or validation details in the abstract makes the practical significance difficult to evaluate at present.

major comments (2)

[Abstract] The central claims (that the model identifies three clinically distinct subgroups and decouples progression from severity) are presented without any quantitative results, validation metrics, baseline comparisons, statistical tests, or details on data handling or cohort size. This absence makes it impossible to assess whether the reported subgroups reflect genuine structure rather than model artifacts.
[Abstract / Results] The headline finding of a high-risk group with worsening trajectory despite better average visual acuity requires evidence that the learned kernel and GP posterior separate future progression dynamics from static severity. No temporal hold-out validation (train on records before cutoff T, evaluate predicted trajectories after T) or external cohort test is reported, so post-hoc comparisons of VA or slopes can be explained by reconstruction of training patterns rather than prospective risk stratification.
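The temporal hold-out requested in the second major comment can be sketched as a per-patient split at a cutoff date. This is an illustration only: the record layout, patient identifiers, dates, values, and the helper `temporal_split` are hypothetical, and the cutoff is arbitrary.

```python
from datetime import date

def temporal_split(records, cutoff):
    """Split each patient's visits into before-cutoff (train) and after (test)."""
    train, test = {}, {}
    for patient_id, visits in records.items():
        before = [v for v in visits if v[0] < cutoff]
        after = [v for v in visits if v[0] >= cutoff]
        if before:
            train[patient_id] = before
        # Only evaluate patients for whom the model has seen some history.
        if before and after:
            test[patient_id] = after
    return train, test

# Hypothetical records: patient -> list of (visit date, logMAR visual acuity).
records = {
    "p1": [(date(2019, 3, 1), 0.2), (date(2020, 7, 15), 0.3), (date(2022, 1, 10), 0.5)],
    "p2": [(date(2021, 6, 1), 0.8), (date(2022, 2, 1), 0.8)],
    "p3": [(date(2018, 5, 5), 0.1), (date(2019, 9, 9), 0.1)],
}
cutoff = date(2021, 1, 1)
train, test = temporal_split(records, cutoff)
print(sorted(train), sorted(test))
```

Subgroup labels would then be assigned from the `train` portion alone, and the question is whether the patients labeled high-risk actually worsen in their `test` visits. Patients with no visits before the cutoff are dropped here; whether to treat them as a separate cold-start evaluation is a design choice.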

minor comments (2)

[Abstract] The phrase 'clinically distinct' should be accompanied by explicit clinical metrics or expert review criteria used to label the subgroups.
The manuscript would benefit from a dedicated section detailing the exact architecture (transformer layers, GP kernel parameterization, training objective) and hyperparameter choices.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments, which highlight important aspects of how our claims are presented. We address each major comment below and describe the revisions we will make to improve clarity and rigor.

Referee: [Abstract] The central claims (that the model identifies three clinically distinct subgroups and decouples progression from severity) are presented without any quantitative results, validation metrics, baseline comparisons, statistical tests, or details on data handling or cohort size. This absence makes it impossible to assess whether the reported subgroups reflect genuine structure rather than model artifacts.

Authors: We agree that the abstract, as currently written, does not include sufficient quantitative detail for independent evaluation of the claims. The full manuscript reports the cohort size, data preprocessing steps for the multimodal EHR, and quantitative metrics for the DKL-GP model (including GP marginal likelihood and clustering validity indices) along with statistical comparisons of subgroup trajectories. To address the concern directly, we will revise the abstract to incorporate concise quantitative highlights, such as cohort size, key performance indicators, and significance of trajectory differences, while respecting length constraints.

Revision: yes

Referee: [Abstract / Results] The headline finding of a high-risk group with worsening trajectory despite better average visual acuity requires evidence that the learned kernel and GP posterior separate future progression dynamics from static severity. No temporal hold-out validation (train on records before cutoff T, evaluate predicted trajectories after T) or external cohort test is reported, so post-hoc comparisons of VA or slopes can be explained by reconstruction of training patterns rather than prospective risk stratification.

Authors: We recognize the value of explicit temporal validation to support claims of prospective risk stratification rather than retrospective reconstruction. The current manuscript uses the GP posterior to model full observed trajectories and demonstrates decoupling via the learned kernel parameters, but does not include a dedicated temporal hold-out experiment. We will add such an analysis in the revised manuscript: the model will be trained on records up to a fixed cutoff and evaluated on subsequent observations to confirm that the identified high-risk subgroup exhibits predicted worsening independent of baseline visual acuity. This addition will be summarized in the abstract and detailed in the Results section.

Revision: yes

Circularity Check

0 steps flagged

No circularity: empirical subgroup discovery is an outcome, not a definitional reduction.

The paper defines a DKL-GP architecture with transformer feature extractor on clinical-BERT embeddings, then applies it to sparse EHR trajectories and reports post-hoc discovery of three subgroups with observed decoupling of progression from severity. No equation or self-citation reduces the subgroup labels or the decoupling claim to a fitted parameter by construction; the architecture is standard and the results are presented as data-driven findings rather than tautological outputs. The derivation chain remains independent of the target claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that multimodal EHR data processed via clinical-BERT embeddings contains sufficient signal for a DKL model to learn clinically meaningful progression trajectories separate from current severity.

axioms (1)

Domain assumption: Multimodal EHR data can be meaningfully embedded using clinical-BERT to capture clinical information relevant to glaucoma progression.
Invoked to justify the input processing step of the architecture.

pith-pipeline@v0.9.0 · 5480 in / 1176 out tokens · 30983 ms · 2026-05-09T19:28:56.945198+00:00 · methodology
