arxiv: 2605.22776 · v1 · pith:ODDNLGTRnew · submitted 2026-05-21 · 💻 cs.LG · cs.AI · stat.CO · stat.ML

SDPM: Survival Diffusion Probabilistic Model for Continuous-Time Survival Analysis

Pith reviewed 2026-05-22 06:52 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · stat.CO · stat.ML
keywords survival analysis · diffusion probabilistic models · continuous time · censored data · generative models · Kaplan-Meier estimator · time-to-event

The pith

A diffusion-based generative model estimates continuous-time survival distributions from censored data by modeling the joint distribution of observed times and censoring indicators, conditional on covariates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces the Survival Diffusion Probabilistic Model (SDPM) to address a limitation of existing survival analysis methods, which either impose structural assumptions on the hazard function or discretize time. SDPM uses a denoising diffusion model to generate samples from the conditional distribution of survival outcomes, that is, pairs of observed time and censoring indicator. Under the assumption of conditionally independent censoring, these samples are converted into survival function estimates with the Kaplan-Meier estimator. The approach is competitive with the baselines on real datasets and, on synthetic data, recovers the true distribution more accurately than a nonparametric baseline when enough samples are generated. An ablation study credits the target-space transformations with better event-rate calibration and fewer invalid generated times.

Core claim

SDPM models the conditional distribution P(T, δ | x) using a denoising diffusion probabilistic model in a transformed space with standardized log-times and a continuous Gaussian-mixture representation for the censoring indicator. Under conditionally independent censoring, generated samples are transformed into survival function estimates via the Kaplan-Meier estimator, avoiding parametric assumptions on the event-time distribution and discretization of the time axis.
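To make the transformed target space concrete, here is a minimal sketch of one way such an encoding could look. The function names, the mixture centers (-1 and +1), and the component standard deviation are hypothetical choices for illustration; this page does not state the values the paper uses.

```python
import numpy as np

def fit_log_time_scaler(times):
    """Mean and standard deviation of the training log-times."""
    log_t = np.log(times)
    return log_t.mean(), log_t.std()

def encode_targets(times, delta, mu, sigma, rng,
                   centers=(-1.0, 1.0), comp_std=0.1):
    """Map (T, delta) pairs to a continuous 2-D diffusion target.

    centers and comp_std are assumed mixture parameters, not the paper's.
    """
    z_time = (np.log(times) - mu) / sigma            # standardized log-time
    means = np.where(delta == 1, centers[1], centers[0])
    z_delta = means + comp_std * rng.standard_normal(len(times))
    return np.column_stack([z_time, z_delta])

def decode_times(z_time, mu, sigma):
    """Invert the standardized log transform; the result is always positive."""
    return np.exp(z_time * sigma + mu)

rng = np.random.default_rng(0)
times = np.array([2.0, 5.0, 1.5, 8.0])
delta = np.array([1, 0, 1, 1])
mu, sigma = fit_log_time_scaler(times)
targets = encode_targets(times, delta, mu, sigma, rng)
recovered = decode_times(targets[:, 0], mu, sigma)
```

Because decoding goes through an exponential, any real-valued model output maps back to a positive time, which is one plausible reason a log-time target would reduce invalid generated times.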

What carries the argument

Denoising diffusion model for generating samples of (standardized log-time, censoring indicator) pairs, converted to survival estimates with Kaplan-Meier.
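The conversion step can be illustrated with a plain Kaplan-Meier estimator applied to a set of generated (time, indicator) pairs. This is a generic textbook sketch, not the authors' code; the function name and the toy sample are made up for illustration.

```python
import numpy as np

def kaplan_meier(times, events):
    """Return the distinct event times and the KM survival estimate at each."""
    times = np.asarray(times, dtype=float)
    events = np.asarray(events, dtype=int)
    event_times = np.unique(times[events == 1])
    surv = np.empty(len(event_times))
    s = 1.0
    for i, t in enumerate(event_times):
        at_risk = np.sum(times >= t)                    # still under observation at t
        died = np.sum((times == t) & (events == 1))     # events exactly at t
        s *= 1.0 - died / at_risk
        surv[i] = s
    return event_times, surv

# Toy stand-in for samples generated for one covariate vector x.
sample_times = np.array([1.0, 2.0, 2.0, 3.0, 4.0])
sample_events = np.array([1, 1, 0, 1, 0])
grid, surv = kaplan_meier(sample_times, sample_events)
# surv is [0.8, 0.6, 0.3] at event times [1, 2, 3]
```

Censored samples never trigger a drop in the curve; they only shrink the at-risk count, which is why the estimator needs the censoring mechanism to be independent of the event time given x.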

If this is right

  • SDPM achieves competitive results on C-index, integrated time-dependent AUC, and integrated Brier score across ten real survival datasets compared to tree-based, boosting, and neural baselines.
  • On synthetic Cox-Weibull data, SDPM recovers the shape of the underlying continuous survival distribution more accurately than a strong nonparametric baseline when enough samples are generated.
  • The proposed target-space transformations improve event-rate calibration, reduce invalid generated times, and yield consistent gains in predictive discrimination.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Generative diffusion approaches might offer advantages in capturing complex, multimodal survival distributions that traditional models struggle with.
  • This method could be adapted for other time-to-event problems in fields like reliability engineering or medical prognosis where continuous time modeling is crucial.
  • Further work might explore integrating SDPM with other generative techniques or scaling it to high-dimensional covariates.

Load-bearing premise

The approach depends on the assumption of conditionally independent censoring to validly apply the Kaplan-Meier estimator to the generated samples for survival function estimation.

What would settle it

The claim of more accurate recovery would be falsified if, on synthetic data with a known true survival function, the Kaplan-Meier estimates derived from a large number of SDPM-generated samples deviated substantially from the true curve or performed worse than established nonparametric methods.
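A check of this kind can be sketched as follows. Event times are drawn from a Cox-Weibull model by inverse-transform sampling, T = (-log U / (λ exp(βx)))^(1/ν), and a Kaplan-Meier curve is compared with the closed-form survival function S(t | x) = exp(-λ exp(βx) t^ν). The parameter values, the exponential censoring distribution, and the mean-absolute-error metric are assumptions chosen for illustration, and the samples here come from the true model as a stand-in for SDPM output.

```python
import numpy as np

def sample_cox_weibull(n, x, lam, nu, beta, rng):
    """Inverse-transform sampling of event times from a Cox-Weibull model."""
    u = 1.0 - rng.random(n)                      # uniform on (0, 1], avoids log(0)
    return (-np.log(u) / (lam * np.exp(beta * x))) ** (1.0 / nu)

def true_survival(t, x, lam, nu, beta):
    """Closed-form survival function of the same model."""
    return np.exp(-lam * np.exp(beta * x) * t ** nu)

def km_no_ties(times, events):
    """Kaplan-Meier curve at every sorted time; assumes no tied times."""
    order = np.argsort(times)
    t, e = times[order], events[order]
    at_risk = len(t) - np.arange(len(t))
    return t, np.cumprod(1.0 - e / at_risk)

rng = np.random.default_rng(0)
lam, nu, beta, x = 0.1, 1.5, 0.5, 1.0            # illustrative values only
event_t = sample_cox_weibull(5000, x, lam, nu, beta, rng)
cens_t = rng.exponential(scale=10.0, size=5000)  # assumed independent censoring
obs_t = np.minimum(event_t, cens_t)
delta = (event_t <= cens_t).astype(int)

grid, km = km_no_ties(obs_t, delta)
mae = np.mean(np.abs(km - true_survival(grid, x, lam, nu, beta)))
```

Replacing the stand-in samples with model-generated pairs and tracking how `mae` changes with the number of samples would give the kind of evidence this section describes.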

Figures

Figures reproduced from arXiv: 2605.22776 by Andrei V. Konstantinov, Lev V. Utkin, Stanislav R. Kirpichenko.

Figure 1. Schematic illustration of the SDPM generation pipeline.
Figure 2. Critical difference diagram for C-index ranks (…
Figure 3. Critical difference diagram for AUC ranks (…
Figure 4. Critical difference diagram for IBS ranks (…
Figure 5. Influence of the number of generated (T, δ) pairs on C-index, integrated time-dependent AUC, and IBS on the VLBW dataset. Shaded regions correspond to 95% confidence intervals estimated from 100 repetitions.
Figure 6. Trade-off between predictive quality, measured by mean integrated time-dependent AUC, and …
Figure 7. Influence of the number of reverse diffusion steps …
Figure 8. Qualitative comparison of survival function estimates on synthetic Cox-Weibull data for different …
Original abstract

Survival analysis aims to estimate a time-to-event distribution from data with censored observations. Many existing methods either impose structural assumptions on the hazard function or discretize the time axis, which may limit flexibility and introduce approximation errors. We propose the Survival Diffusion Probabilistic Model (SDPM), a generative approach to continuous-time survival analysis. SDPM models the conditional distribution of the survival outcome, represented by the pair of observed time and censoring indicator, $\mathbb{P}(T,\delta \mid \mathbf{x})$, using a denoising diffusion model. Under the assumption of conditionally independent censoring, conditional samples generated by the model can be transformed into survival function estimates using the Kaplan-Meier estimator. This formulation avoids parametric assumptions on the event-time distribution and does not require a discretization of the output time space. The model operates in a transformed target space, using standardized log-times and a continuous Gaussian-mixture representation of the censoring indicator. We evaluate SDPM on ten real survival datasets and compare it with five strong baselines, including tree-based, boosting-based, and neural survival models. Results show that SDPM achieves competitive predictive performance across C-index, integrated time-dependent AUC, and integrated Brier score. A study on synthetic Cox-Weibull data demonstrates that SDPM can recover the shape of an underlying continuous survival distribution more accurately than a strong nonparametric baseline when sufficiently many samples are generated. An ablation study confirms the importance of the proposed target-space transformations, which improve event-rate calibration, reduce invalid generated times, and provide consistent gains in predictive discrimination. Codes implementing the proposed model are publicly available.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces SDPM, a denoising diffusion probabilistic model for continuous-time survival analysis. It models the conditional joint distribution P(T, δ | x) of observed time and censoring indicator via a diffusion process operating in a transformed target space (standardized log-times and a continuous Gaussian-mixture representation of the binary censoring indicator). Under the conditionally independent censoring assumption, generated samples are passed through the Kaplan-Meier estimator to produce survival function estimates. The approach avoids parametric hazard assumptions and time discretization. On ten real survival datasets, SDPM reports competitive results versus tree-based, boosting, and neural baselines on C-index, integrated time-dependent AUC, and integrated Brier score. On synthetic Cox-Weibull data, it recovers the underlying continuous survival distribution more accurately than a nonparametric baseline when sufficiently many samples are drawn. An ablation study attributes gains to the proposed target-space transformations, including improved event-rate calibration.

Significance. If the empirical claims hold after addressing the mapping and calibration details, SDPM would supply a flexible generative framework for survival analysis that preserves continuous time and directly targets the joint (T, δ) distribution. The public code release supports reproducibility, and the ablation evidence for the log-time and Gaussian-mixture transformations is a concrete strength. The work could influence subsequent generative modeling efforts in censored-data settings, though its impact would be strengthened by explicit verification that generated event rates remain calibrated for the downstream Kaplan-Meier step.

major comments (2)
  1. [Methods (target-space transformation) and Experiments (synthetic Cox-Weibull study)] The central synthetic-data claim (superior recovery of the continuous survival distribution when many samples are generated) depends on feeding model outputs into the Kaplan-Meier estimator. Because δ is represented by a continuous Gaussian mixture during diffusion, a post-sampling mapping to binary values is required. The manuscript should specify the exact thresholding/rounding rule and report the marginal event probability recovered by the mixture versus the ground-truth rate; any systematic mis-calibration would bias the resulting KM curves even if the marginal time distribution is accurate.
  2. [Experiments (real-data evaluation)] Table or figure reporting the ten real-dataset results should include per-metric standard deviations across repeated splits or seeds. Without these, the statement that SDPM is “competitive” cannot be quantitatively distinguished from noise, weakening the cross-method comparison.
minor comments (2)
  1. [Methods] Clarify whether the diffusion noise schedule is learned or fixed, and state the precise form of the Gaussian mixture (means, variances, and mixing weights) used for the censoring indicator.
  2. [Ablation study] The abstract states that the transformations “improve event-rate calibration”; the corresponding ablation table should report the actual calibration metric (e.g., absolute difference in event rate) before and after each transformation.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address the two major comments below and will revise the manuscript to incorporate the requested clarifications and additional statistics.

Point-by-point responses
  1. Referee: [Methods (target-space transformation) and Experiments (synthetic Cox-Weibull study)] The central synthetic-data claim (superior recovery of the continuous survival distribution when many samples are generated) depends on feeding model outputs into the Kaplan-Meier estimator. Because δ is represented by a continuous Gaussian mixture during diffusion, a post-sampling mapping to binary values is required. The manuscript should specify the exact thresholding/rounding rule and report the marginal event probability recovered by the mixture versus the ground-truth rate; any systematic mis-calibration would bias the resulting KM curves even if the marginal time distribution is accurate.

    Authors: We agree that the post-sampling mapping for the censoring indicator and its calibration must be made explicit to support the synthetic-data claims. In the revised manuscript we will add a precise description of the thresholding rule (0.5 threshold applied after scaling the mixture output to [0,1]) together with a direct comparison of the marginal event rate recovered from generated samples versus the ground-truth rate on the Cox-Weibull data. This addition will confirm that any observed improvement in KM-curve recovery is not an artifact of mis-calibration. revision: yes

  2. Referee: [Experiments (real-data evaluation)] Table or figure reporting the ten real-dataset results should include per-metric standard deviations across repeated splits or seeds. Without these, the statement that SDPM is “competitive” cannot be quantitatively distinguished from noise, weakening the cross-method comparison.

    Authors: We accept the point that variability measures are needed to substantiate the claim of competitive performance. The revised manuscript will augment the main results table with per-metric standard deviations computed across the repeated random splits (or seeds) already used in the experiments, enabling readers to assess whether observed differences exceed experimental noise. revision: yes
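The thresholding rule and event-rate comparison described in the first response could look roughly like the sketch below. The response only says that a 0.5 threshold is applied after scaling the mixture output to [0, 1]; the scaling bounds (-1 and +1), the clipping, and the function names are assumptions made for illustration.

```python
import numpy as np

def binarize_indicator(raw, low=-1.0, high=1.0, threshold=0.5):
    """Scale the continuous indicator to [0, 1] and threshold it.

    low and high are assumed mixture centers, not values from the paper.
    """
    scaled = np.clip((raw - low) / (high - low), 0.0, 1.0)
    return (scaled >= threshold).astype(int)

def event_rate_gap(generated_delta, reference_delta):
    """Absolute difference between generated and reference event rates."""
    return abs(generated_delta.mean() - reference_delta.mean())

raw = np.array([-1.1, 0.9, 1.05, -0.8, 0.2, -0.1])   # toy continuous outputs
delta_hat = binarize_indicator(raw)                   # [0, 1, 1, 0, 1, 0]
reference = np.array([0, 1, 1, 0, 1, 1])              # toy ground-truth indicators
gap = event_rate_gap(delta_hat, reference)
```

A gap near zero would indicate that the generated event rate is calibrated before the samples reach the Kaplan-Meier step; a persistent gap would shift the resulting curves even if the generated times are accurate.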

Circularity Check

0 steps flagged

No significant circularity detected

Full rationale

The paper introduces SDPM as a denoising diffusion model that directly learns the conditional distribution P(T, δ | x) in a transformed continuous space (standardized log-times plus Gaussian-mixture representation of the binary censoring indicator). Survival-function estimates are obtained by feeding generated samples into the external Kaplan-Meier estimator under the standard conditionally-independent-censoring assumption; this step is a conventional post-processing procedure rather than an internal redefinition that forces the output to equal the training inputs by construction. All reported performance claims (C-index, iAUC, iBS on real data; shape recovery on synthetic Cox-Weibull data) rest on empirical evaluation and ablation studies that compare against independent baselines. No self-definitional equations, fitted-input predictions, or load-bearing self-citations appear in the derivation chain.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The model introduces a transformed target space (standardized log-times plus continuous Gaussian-mixture for censoring) and relies on the external Kaplan-Meier estimator after sampling; no new physical entities are postulated.

free parameters (1)
  • diffusion noise schedule and network hyperparameters
    Standard diffusion training choices that are fitted or tuned on data.
axioms (1)
  • domain assumption: Conditionally independent censoring
    Required to convert generated (T, δ) samples into survival function estimates via Kaplan-Meier.
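To show what the "diffusion noise schedule" entry in the ledger refers to, here is a minimal sketch of the closed-form forward noising step of a standard DDPM, x_t = sqrt(ᾱ_t) x_0 + sqrt(1 - ᾱ_t) ε. The linear schedule from 1e-4 to 0.02 over 1000 steps is the default from the original DDPM paper and is used here as an assumption; this page does not say which schedule SDPM uses or whether it is learned.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced per-step noise variances (assumed, not SDPM-specific)."""
    return np.linspace(beta_start, beta_end, num_steps)

def q_sample(x0, t, alpha_bar, rng):
    """Draw x_t from q(x_t | x_0) in closed form and return the noise used."""
    noise = rng.standard_normal(x0.shape)
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise, noise

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)       # cumulative signal retention per step

rng = np.random.default_rng(0)
# Two toy targets: (standardized log-time, continuous indicator).
x0 = np.array([[0.3, 1.0], [-0.5, -1.0]])
x_t, eps = q_sample(x0, 999, alpha_bar, rng)
```

At the final step almost no signal remains, so the reverse process can start from pure Gaussian noise; the network is trained to predict `eps` from `x_t`, the step index, and the covariates.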

pith-pipeline@v0.9.0 · 5831 in / 1368 out tokens · 27638 ms · 2026-05-22T06:52:14.966299+00:00 · methodology

