Differentially private hypothesis testing in survival analysis
Pith reviewed 2026-05-19 19:09 UTC · model grok-4.3
The pith
Differentially private tests for Cox coefficients and cumulative hazards achieve finite-sample guarantees in survival analysis.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We initiate a finite-sample theory of private hypothesis testing in survival analysis applications. For Cox regression coefficients, we develop private partial-likelihood-ratio and score-type tests, including a private calibration procedure for the rejection threshold. For cumulative hazard functions, we propose a private distributed two-sample test. Across these problems, we prove differential privacy and finite-sample testing guarantees, as well as minimax lower bounds.
What carries the argument
Private partial-likelihood-ratio and score-type tests equipped with a private calibration procedure for rejection thresholds, together with a private distributed two-sample test for cumulative hazard functions.
If this is right
- The cost of privacy is statistically negligible once the privacy budget exceeds a threshold that scales with sample size and censoring rate.
- In high-privacy regimes the testing rate is governed by the privacy noise rather than the usual statistical fluctuation.
- Minimax lower bounds identify the precise regimes where further improvement is impossible.
- Optimal private rates for some semiparametric survival models remain open.
Where Pith is reading between the lines
- The same calibration technique could be adapted to other semiparametric models that rely on estimating equations.
- Practical deployment would require checking how the added privacy noise interacts with heavy censoring or model misspecification.
- The distributed two-sample construction suggests a route for private meta-analysis across hospitals without sharing raw records.
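The multi-site idea in the last bullet can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's construction: each site computes a Nelson-Aalen difference in cumulative hazards at a single time point `t0`, clips it to `[-clip, clip]`, and releases it with Laplace noise. The function names, the single time point, and the clipping constant are all hypothetical choices made for this sketch.

```python
import numpy as np

def nelson_aalen_at(time, event, t0):
    """Nelson-Aalen cumulative hazard estimate at t0 (assumes no tied times)."""
    order = np.argsort(time)
    t_s, d_s = time[order], event[order]
    at_risk = len(time) - np.arange(len(time))  # subjects still at risk at each sorted time
    return float(np.sum((d_s / at_risk)[t_s <= t0]))

def site_release(time, event, group, t0, eps, clip, rng):
    """One site's clipped, noisy difference in cumulative hazards (hypothetical helper)."""
    g1, g0 = group == 1, group == 0
    diff = nelson_aalen_at(time[g1], event[g1], t0) - nelson_aalen_at(time[g0], event[g0], t0)
    clipped = float(np.clip(diff, -clip, clip))
    # A statistic confined to [-clip, clip] changes by at most 2 * clip when one
    # record changes, so Laplace noise with scale 2 * clip / eps gives eps-DP.
    return clipped + rng.laplace(0.0, 2.0 * clip / eps)

def aggregate(releases):
    """Central server averages the noisy site-level releases; raw records never leave a site."""
    return float(np.mean(releases))
```

Clipping the final statistic is a crude way to bound sensitivity and costs accuracy; it is used here only because it makes the privacy accounting easy to verify.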
Load-bearing premise
The underlying survival data follow a standard right-censored model such as the Cox proportional hazards model, and the private calibration procedure for rejection thresholds can be implemented without invalidating the finite-sample guarantees.
What would settle it
A Monte Carlo experiment on simulated right-censored data in which the private test maintains the nominal type-I error rate under the null while its power curve lies close to the non-private curve for moderate privacy budgets.
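A check of this kind might look like the sketch below. It is a toy stand-in, not the paper's procedure: the data generator (one binary covariate, exponential event and censoring times), the Laplace noise added to the standardized Cox score statistic, and the sensitivity bound `sens` are all assumptions made for illustration. In particular, `sens` is not derived here, so the sketch makes no privacy claim, and the threshold relies on a normal approximation rather than a finite-sample argument.

```python
import numpy as np

def simulate_censored(n, rng, beta=0.0, censor_rate=0.5):
    """Exponential event times with hazard exp(beta * x) and independent exponential censoring."""
    x = rng.integers(0, 2, size=n).astype(float)
    event_time = rng.exponential(1.0 / np.exp(beta * x))
    censor_time = rng.exponential(1.0 / censor_rate, size=n)
    time = np.minimum(event_time, censor_time)
    event = (event_time <= censor_time).astype(float)
    return time, event, x

def score_z(time, event, x):
    """Standardized Cox score statistic at beta = 0 for a binary covariate (no ties)."""
    order = np.argsort(-time)            # latest time first, so prefixes are risk sets
    x_s, d_s = x[order], event[order]
    p = np.cumsum(x_s) / np.arange(1, len(x) + 1)  # risk-set mean of x
    u = np.sum(d_s * (x_s - p))
    v = np.sum(d_s * p * (1.0 - p))
    return u / np.sqrt(v) if v > 0 else 0.0

def private_test(z, eps, sens, alpha, rng, n_cal=5000):
    """Reject if the noisy statistic exceeds a simulated quantile of |N(0,1) + Laplace|."""
    b = sens / eps                       # ASSUMED sensitivity bound, not derived
    noisy = z + rng.laplace(0.0, b)
    # The reference draws use only public quantities (eps, sens, alpha).
    ref = np.abs(rng.standard_normal(n_cal) + rng.laplace(0.0, b, size=n_cal))
    return bool(abs(noisy) > np.quantile(ref, 1.0 - alpha))

def type1_rate(n=200, reps=400, eps=1.0, sens=0.5, alpha=0.05, seed=0):
    """Empirical rejection frequency under the null (beta = 0)."""
    rng = np.random.default_rng(seed)
    rejections = 0
    for _ in range(reps):
        time, event, x = simulate_censored(n, rng)
        rejections += private_test(score_z(time, event, x), eps, sens, alpha, rng)
    return rejections / reps
```

Calling `type1_rate()` with several values of `eps`, and repeating the loop with `beta != 0` in `simulate_censored`, would give the type-I error and power curves that the paragraph above describes.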
Original abstract
Survival analysis is widely used in applications involving sensitive individual-level data, yet differentially private hypothesis testing for right-censored data remains largely undeveloped. We initiate a finite-sample theory of private hypothesis testing in survival analysis applications. For Cox regression coefficients, we develop private partial-likelihood-ratio and score-type tests, including a private calibration procedure for the rejection threshold. For cumulative hazard functions, we propose a private distributed two-sample test. Across these problems, we prove differential privacy and finite-sample testing guarantees, as well as minimax lower bounds. Our results identify when privacy is statistically negligible, when it dominates the testing rate, and where optimal private rates for testing in semiparametric survival models remain open. This theoretical analysis is accompanied by numerical experiments on simulated data.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript initiates a finite-sample theory of differentially private hypothesis testing for right-censored survival data. For Cox regression coefficients it constructs private partial-likelihood-ratio and score-type tests together with a private calibration procedure that sets the rejection threshold. For cumulative hazard functions it proposes a private distributed two-sample test. Differential privacy, finite-sample type-I and type-II error bounds, and minimax lower bounds are proved for these procedures; regimes in which privacy is statistically negligible versus dominant are identified, and the theoretical results are accompanied by numerical experiments on simulated data.
Significance. If the finite-sample type-I error control holds, the work supplies the first rigorous private testing theory for semiparametric survival models, a setting that routinely handles sensitive medical data. The explicit identification of privacy-negligible and privacy-dominated regimes, together with matching lower bounds, gives practitioners concrete guidance. The proofs of differential privacy and the finite-sample guarantees constitute the primary technical contribution.
major comments (1)
- [§4.3] §4.3 (private calibration of the rejection threshold for the partial-likelihood-ratio test): the finite-sample type-I error bound is claimed to hold under random right-censoring, yet the proof does not explicitly compose the privacy noise (added to the observed information or to the quantile estimator) with the randomness of the censoring times and the resulting random observed information matrix. Because both the test statistic and its null distribution are functions of the realized censoring pattern, a detailed argument showing that the added privacy mechanism does not inflate the type-I error beyond the stated bound is required to support the central finite-sample guarantee.
minor comments (2)
- [§2.1] §2.1: the notation for the privacy budget (ε,δ) should be stated once at the beginning and used consistently; the current text alternates between (ε,0)-DP and (ε,δ)-DP without explicit justification for the choice in each theorem.
- [Table 1] Table 1: the column headers for the empirical type-I error rates do not indicate the number of Monte-Carlo replications; adding this information would allow readers to assess the precision of the reported frequencies.
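Both minor comments can be made concrete with a short sketch. It uses the textbook noise calibrations (Laplace noise with scale Δ/ε for (ε,0)-DP; Gaussian noise with σ = Δ·sqrt(2 ln(1.25/δ))/ε for (ε,δ)-DP, valid for ε < 1) and the binomial standard error of an empirical rejection frequency. The function names are illustrative, and which mechanism the paper uses in each theorem is not stated in this review.

```python
import math

def laplace_scale(sensitivity, eps):
    """Laplace noise scale giving (eps, 0)-DP for a statistic with L1 sensitivity `sensitivity`."""
    return sensitivity / eps

def gaussian_sigma(sensitivity, eps, delta):
    """Classical Gaussian-mechanism noise level for (eps, delta)-DP; requires 0 < eps < 1."""
    if not 0.0 < eps < 1.0:
        raise ValueError("the classical calibration assumes 0 < eps < 1")
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / eps

def mc_standard_error(rate, replications):
    """Standard error of an empirical rejection frequency over independent replications."""
    return math.sqrt(rate * (1.0 - rate) / replications)
```

Reporting `mc_standard_error` next to each entry of a table of empirical type-I error rates shows whether a deviation from the nominal level exceeds simulation noise.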
Simulated Author's Rebuttal
We thank the referee for their positive summary of the manuscript and for the constructive major comment on the finite-sample type-I error analysis. We address the point raised in §4.3 below.
Point-by-point responses
- Referee: [§4.3] §4.3 (private calibration of the rejection threshold for the partial-likelihood-ratio test): the finite-sample type-I error bound is claimed to hold under random right-censoring, yet the proof does not explicitly compose the privacy noise (added to the observed information or to the quantile estimator) with the randomness of the censoring times and the resulting random observed information matrix. Because both the test statistic and its null distribution are functions of the realized censoring pattern, a detailed argument showing that the added privacy mechanism does not inflate the type-I error beyond the stated bound is required to support the central finite-sample guarantee.
Authors: We appreciate the referee's request for an explicit composition argument. The proof in the manuscript proceeds by conditioning on the realized censoring times and the resulting observed information matrix: the private test statistic and the calibrated quantile are both constructed conditionally on this random matrix, so that the conditional type-I error is bounded by the target level for any fixed censoring pattern. The unconditional bound then follows directly by integrating the conditional bound with respect to the distribution of the censoring times (via the law of total probability). Because the privacy mechanism is applied after the censoring pattern is observed, it does not alter this conditional control. To make the argument fully transparent and to address the referee's concern directly, we will revise the manuscript by inserting a short clarifying paragraph (or subsection) that explicitly states the conditioning step, invokes the law of total probability, and confirms that the privacy noise does not inflate the unconditional type-I error beyond the stated finite-sample bound.
Revision: yes
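The conditioning step in the authors' reply can be written out in one display. The notation is assumed for illustration and is not taken from the paper: C denotes the realized censoring pattern, T_priv the private test statistic, q̂_α the privately calibrated threshold, and α the target level. The display presupposes that the conditional bound holds for almost every censoring pattern, which is exactly the step the referee asks the authors to verify.

```latex
\begin{align*}
\mathbb{P}_{H_0}\bigl(T_{\mathrm{priv}} > \hat q_\alpha\bigr)
  &= \mathbb{E}_{C}\Bigl[\mathbb{P}_{H_0}\bigl(T_{\mathrm{priv}} > \hat q_\alpha \,\big|\, C\bigr)\Bigr] \\
  &\le \mathbb{E}_{C}[\alpha] = \alpha .
\end{align*}
```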
Circularity Check
No circularity; claims rest on independent proofs for private tests in censored models
The paper develops private partial-likelihood-ratio and score-type tests plus a private calibration procedure for rejection thresholds in Cox models, along with a private distributed two-sample test for cumulative hazards. It claims to prove differential privacy, finite-sample type-I error control, and minimax lower bounds directly from the right-censored data model and standard DP mechanisms. No quoted step reduces a derived quantity to a fitted input by construction, invokes a self-citation as the sole justification for a uniqueness or ansatz claim, or renames a known empirical pattern. The derivation chain is therefore self-contained against external benchmarks such as classical Cox partial likelihood and DP composition theorems.
Axiom & Free-Parameter Ledger
free parameters (1)
- privacy budget epsilon
axioms (1)
- domain assumption: The data follow a right-censored survival model with Cox proportional hazards structure.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: We develop private partial-likelihood-ratio and score-type tests, including a private calibration procedure for the rejection threshold... prove differential privacy and finite-sample testing guarantees, as well as minimax lower bounds.
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: the log partial likelihood for the Cox model... gradient... at-risk sets
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.