pith. machine review for the scientific record.

arxiv: 2408.02839 · v6 · submitted 2024-08-05 · 📊 stat.ML · cs.LG


Mini-batch Estimation for Deep Cox Models: Statistical Foundations and Practical Guidance

keywords: mb-MPLE, Cox-NN, mini-batch, partial-likelihood, batch, consistent, convergence, deep

The stochastic gradient descent (SGD) algorithm has been widely used to optimize deep Cox neural networks (Cox-NN) by updating model parameters using mini-batches of data. We show that SGD aims to optimize the average of the mini-batch partial-likelihood, which is different from the standard partial-likelihood. This distinction requires establishing new statistical properties for the global optimizer, namely, the mini-batch maximum partial-likelihood estimator (mb-MPLE). We establish that mb-MPLE for Cox-NN is consistent and achieves the optimal minimax convergence rate up to a polylogarithmic factor. For Cox regression with linear covariate effects, we further show that mb-MPLE is $\sqrt{n}$-consistent and asymptotically normal, with asymptotic variance approaching the information lower bound as batch size increases; this is confirmed by simulation studies. Additionally, we offer practical guidance on using SGD, supported by theoretical analysis and numerical evidence. For Cox-NN, we demonstrate that the ratio of the learning rate to the batch size is critical in SGD dynamics, offering insight into hyperparameter tuning. For Cox regression, we characterize the iterative convergence of SGD, ensuring that the global optimizer, mb-MPLE, can be approximated with sufficiently many iterations. Finally, we demonstrate the effectiveness of mb-MPLE in a large-scale real-world application where the standard MPLE is intractable.
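The distinction between the standard partial-likelihood and the average of mini-batch partial-likelihoods can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names `neg_log_partial_likelihood` and `minibatch_objective`, the normalisation by sample count, the assumption of no tied event times, and the use of a single random partition as a stand-in for the average over batches are all assumptions made here for illustration.

```python
import numpy as np

def neg_log_partial_likelihood(risk, time, event):
    """Negative log partial likelihood of a Cox model, divided by sample size.

    risk  : model output for each subject (e.g. x @ beta or a network score)
    time  : observed time for each subject (assumed to have no ties)
    event : 1/True if the event was observed, 0/False if censored
    """
    risk = np.asarray(risk, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    # at_risk[i, j] is True when subject j is still at risk at time[i]
    at_risk = time[None, :] >= time[:, None]
    # log of the risk-set sum, shifted by the max score for numerical stability
    shift = risk.max()
    log_denom = shift + np.log((at_risk * np.exp(risk - shift)).sum(axis=1))
    return -np.sum((risk - log_denom)[event]) / len(risk)

def minibatch_objective(risk, time, event, batch_size, rng):
    """Average of per-batch losses over one random partition of the data.

    Risk sets are formed inside each batch only, which is why this
    quantity differs from the full-data partial likelihood.
    """
    risk, time, event = map(np.asarray, (risk, time, event))
    perm = rng.permutation(len(risk))
    losses = []
    for start in range(0, len(risk) - batch_size + 1, batch_size):
        idx = perm[start:start + batch_size]
        losses.append(neg_log_partial_likelihood(risk[idx], time[idx], event[idx]))
    return float(np.mean(losses))

# Toy data (hypothetical): four uncensored subjects with equal risk scores.
risk = np.zeros(4)
time = np.array([1.0, 2.0, 3.0, 4.0])
event = np.ones(4, dtype=bool)

full = neg_log_partial_likelihood(risk, time, event)
mini = minibatch_objective(risk, time, event, batch_size=2,
                           rng=np.random.default_rng(0))
print(f"full-data loss: {full:.4f}, mini-batch average: {mini:.4f}")
```

On this toy input the two objectives differ because each batch of two subjects sees a risk set of at most two, whereas the full-data risk sets contain up to four subjects. With `batch_size` equal to the sample size, the two functions coincide.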



Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Radiomics-Guided Vision Transformers for Survival Analysis

    physics.med-ph 2026-04 unverdicted novelty 5.0

    A radiomics-guided hybrid Vision Transformer integrates pixel embeddings with interpretable radiomic features in a multimodal Cox model for survival analysis, yielding competitive discrimination and clinically meaning...