Sample size and power calculations for causal inference with time-to-event outcomes
Pith reviewed 2026-05-20 22:46 UTC · model grok-4.3
The pith
A new analytical sample size formula for marginal hazard ratios in causal survival studies uses the asymptotic variance of the inverse probability weighted Cox estimator.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By extending the robust sandwich variance to the inverse probability weighted partial likelihood estimator under the marginal structural Cox proportional hazards model, the paper obtains an explicit asymptotic variance formula that supports a closed-form sample size calculation valid at any chosen marginal hazard ratio for both randomized trials and observational studies.
What carries the argument
The asymptotic variance of the inverse probability weighted partial likelihood estimator for the marginal structural Cox proportional hazards model
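How an asymptotic variance turns into a closed-form sample size can be sketched with the generic Wald-test argument. Here τ_0 is the log marginal hazard ratio under the alternative, V is the asymptotic variance of √n(τ̂ − τ_0), α is the two-sided type I error, and β is the type II error. This is the standard normal-approximation argument under the assumption that the same V applies at the design alternative; it is not necessarily the paper's exact expression.

```latex
\sqrt{n}\,(\hat\tau - \tau_0) \xrightarrow{d} N(0, V)
\quad\Longrightarrow\quad
\text{power} \approx \Phi\!\left( \frac{\sqrt{n}\,|\tau_0|}{\sqrt{V}} - z_{1-\alpha/2} \right),
\qquad
n \;\ge\; \frac{\left(z_{1-\alpha/2} + z_{1-\beta}\right)^{2} V}{\tau_0^{2}}.
```

Any correction to V therefore scales the required sample size by the same factor.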
If this is right
- Randomized trial sample size calculations need only treatment proportion, effect size, and event rate as inputs.
- Observational study calculations require one additional overlap coefficient that captures covariate similarity between groups.
- The same baseline variance supports a general inflation adjustment for any choice of propensity score balancing weights.
- The formula corrects systematic misstatements that appear in classic log-rank-based sample size methods when applied to causal estimands.
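For orientation, the classic log-rank-based baseline that the last bullet refers to can be sketched with Schoenfeld's well-known formula, which uses exactly the three randomized-trial inputs named above. This is the textbook baseline, not the paper's corrected formula; the function names are illustrative, not taken from the PSpower package.

```python
import math
from statistics import NormalDist


def schoenfeld_events(hazard_ratio, prop_treated=0.5, alpha=0.05, power=0.80):
    """Required number of events under Schoenfeld's classic approximation."""
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    log_hr = math.log(hazard_ratio)
    return (z_alpha + z_beta) ** 2 / (prop_treated * (1.0 - prop_treated) * log_hr ** 2)


def schoenfeld_sample_size(hazard_ratio, event_rate, prop_treated=0.5,
                           alpha=0.05, power=0.80):
    """Total sample size: required events divided by the overall event rate."""
    events = schoenfeld_events(hazard_ratio, prop_treated, alpha, power)
    return math.ceil(events / event_rate)


if __name__ == "__main__":
    # Illustrative design: HR = 0.7, 1:1 allocation, 50% of subjects observed to fail.
    print(schoenfeld_events(0.7))
    print(schoenfeld_sample_size(0.7, event_rate=0.5))
```

The paper's claim is that a calculation of this kind misstates the requirement when the target is the marginal hazard ratio at an effect size away from the null.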
Where Pith is reading between the lines
- The overlap coefficient could be estimated directly from observed data to guide whether an observational study is feasible before full data collection.
- The variance formula might be adapted to other survival models such as accelerated failure time or additive hazards once their sandwich forms are derived.
- Routine use of the online calculator could reduce the frequency of underpowered observational survival studies by making covariate overlap an explicit design input.
Load-bearing premise
The marginal structural Cox proportional hazards model must be correctly specified and the propensity score model must produce valid weights so that the inverse probability weighted estimator remains consistent for the marginal hazard ratio.
What would settle it
A Monte Carlo simulation in which the empirical power at the sample size given by the derived formula reaches the target level when the marginal structural Cox model and propensity score weights are correctly specified, but falls short when either is misspecified.
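The skeleton of such a check can be sketched for the simplest case: a randomized two-arm design with exponential event times, administrative censoring, and an unweighted log-rank test. The data-generating model, the function names, and the default settings are all illustrative assumptions; the paper's IPW Cox estimator and its sample size formula are not implemented here.

```python
import math
import random
from statistics import NormalDist


def logrank_z(times, events, groups):
    """Standardized two-group log-rank statistic (assumes no tied event times)."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = len(times)
    at_risk_treated = sum(groups)
    score, variance = 0.0, 0.0
    for i in order:
        if events[i]:
            p = at_risk_treated / at_risk   # treated share of the risk set
            score += groups[i] - p          # observed minus expected
            variance += p * (1.0 - p)
        at_risk -= 1
        at_risk_treated -= groups[i]
    return score / math.sqrt(variance) if variance > 0 else 0.0


def empirical_power(n, hazard_ratio, prop_treated=0.5, base_rate=1.0,
                    follow_up=1.0, alpha=0.05, reps=1000, seed=1):
    """Share of simulated trials in which the log-rank test rejects."""
    rng = random.Random(seed)
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(reps):
        groups = [1 if rng.random() < prop_treated else 0 for _ in range(n)]
        latent = [rng.expovariate(base_rate * (hazard_ratio if g else 1.0))
                  for g in groups]
        times = [min(t, follow_up) for t in latent]   # administrative censoring
        events = [t <= follow_up for t in latent]
        if abs(logrank_z(times, events, groups)) > critical:
            rejections += 1
    return rejections / reps


if __name__ == "__main__":
    print(empirical_power(n=400, hazard_ratio=0.7, reps=500))
```

Replacing the log-rank test with a weighted Cox fit, and the randomized assignment with a covariate-dependent one, would turn this into the observational check described above.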
Original abstract
This paper develops power and sample size formulas for causal inference with time-to-event outcomes. The target estimand is the marginal hazard ratio: the coefficient of a marginal structural Cox proportional hazard model with treatment as the only predictor. We extend the robust sandwich variance theory and derive the analytical form of the asymptotic variance for the inverse probability weighted partial likelihood estimator. Building on this, we derive a new analytical sample size formula valid at any prespecified effect size, applicable to both randomized trials and observational studies. For randomized trials, the formula requires only the canonical inputs of treatment proportion, effect size, and event rate. The new formula corrects the mischaracterization of classic log-rank-based formulas. For observational studies, one additional input suffices: an overlap coefficient summarizing covariate similarity between comparison groups. We further develop a variance inflation approach applicable to any propensity score balancing weights, anchored to the corrected baseline variance. We provide an online calculator and an R package 'PSpower' to implement the method.
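To give a sense of what a variance inflation factor for weights looks like, the sketch below computes Kish's approximate design effect, a widely used inflation factor from survey sampling. Whether the paper's inflation approach takes this exact form is an assumption; the function names are illustrative.

```python
def kish_design_effect(weights):
    """Kish's approximate design effect: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size divided by the design effect."""
    return len(weights) / kish_design_effect(weights)


if __name__ == "__main__":
    # Hypothetical balancing weights for one comparison group.
    weights = [1.0, 1.0, 1.0, 3.0]
    print(kish_design_effect(weights))
    print(effective_sample_size(weights))
```

Equal weights give a design effect of 1, and more variable weights (typically a sign of poorer covariate overlap) give a larger inflation.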
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript develops analytical power and sample size formulas for estimating the marginal hazard ratio in time-to-event studies under causal inference. It extends robust sandwich variance theory to obtain the asymptotic variance of the inverse-probability-weighted partial-likelihood estimator in a marginal structural Cox model, then derives a closed-form sample-size expression that applies to both randomized trials (using only treatment proportion, effect size, and event rate) and observational studies (adding a single overlap coefficient). A variance-inflation approach for general propensity-score balancing weights is also presented, together with an R package 'PSpower' and online calculator.
Significance. If the central derivations are correct, the work supplies a practical, analytically grounded tool for study planning that corrects known limitations of log-rank-based formulas and extends them to observational settings with minimal additional inputs. The explicit provision of reproducible software (R package and calculator) is a clear strength that supports immediate usability and verification.
major comments (1)
- [§3.2] §3.2 (Asymptotic variance of the IPW partial-likelihood estimator): The derived variance expression for observational data relies on a marginal overlap coefficient but does not incorporate the influence-function contribution arising from estimation of the propensity-score parameters. Standard semiparametric results for IPW estimators require that the full score for the PS model be included in the sandwich; omitting it produces an understated variance that is load-bearing for the subsequent sample-size formula.
minor comments (2)
- [Abstract] Abstract: The claim that the new formula 'corrects the mischaracterization of classic log-rank-based formulas' would be clearer if a specific prior formula (with equation reference) were cited as an example of the error being fixed.
- [Software] Software section: No numerical validation (e.g., simulation comparing analytic variance to Monte-Carlo variance under estimated PS) is described; adding a small table of such checks would improve credibility without altering the central claim.
Simulated Author's Rebuttal
We thank the referee for their careful reading and constructive feedback on our manuscript. We address the single major comment below and indicate the revisions we will make to improve the presentation of our variance derivation.
Referee: [§3.2] §3.2 (Asymptotic variance of the IPW partial-likelihood estimator): The derived variance expression for observational data relies on a marginal overlap coefficient but does not incorporate the influence-function contribution arising from estimation of the propensity-score parameters. Standard semiparametric results for IPW estimators require that the full score for the PS model be included in the sandwich; omitting it produces an understated variance that is load-bearing for the subsequent sample-size formula.
Authors: We appreciate the referee's observation on the semiparametric influence function. Our derivation in §3.2 obtains the asymptotic variance of the IPW partial-likelihood estimator for the marginal structural Cox model by extending the robust sandwich formula and summarizing the effect of the weights through a single marginal overlap coefficient. This choice yields a closed-form expression that depends only on quantities available at the design stage (treatment proportion, event rate, effect size, and overlap). We acknowledge that the standard efficient influence function for IPW estimators augments the estimating equation with the score of the propensity-score model, and that omitting this term does not produce the fully efficient asymptotic variance. Because the sample-size formula is intended for use before any propensity-score model has been selected, incorporating a specific PS score would require additional parametric assumptions that would undermine the generality and simplicity of the method. We will revise the manuscript to state this modeling assumption explicitly, to note that the reported variance corresponds to the case of known propensity scores, and to discuss that the resulting sample-size recommendation is therefore slightly conservative in practice. We view this as a partial revision that preserves the practical utility of the formula while addressing the referee's concern.
Revision: partial
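The point at issue in this exchange can be made concrete with the standard stacked estimating-equation argument. It is sketched here under the assumption that the propensity score is fit by maximum likelihood in a correctly specified parametric model with parameter γ. The notation is illustrative rather than the paper's: U_i is the i.i.d. contribution to the weighted estimating equation for τ with known weights, and S_i is the propensity-score score.

```latex
\begin{aligned}
0 &= \frac{1}{n}\sum_{i} U_i(\hat\tau,\hat\gamma)
   \approx \frac{1}{n}\sum_{i} U_i(\tau_0,\gamma_0)
   + A\,(\hat\tau-\tau_0) + C^{\top}(\hat\gamma-\gamma_0), \\
\hat\gamma-\gamma_0 &\approx I_\gamma^{-1}\,\frac{1}{n}\sum_{i} S_i,
\qquad
A = E\!\left[\frac{\partial U_i}{\partial \tau}\right],\quad
C = E\!\left[\frac{\partial U_i}{\partial \gamma}\right] = -E\!\left[U_i S_i\right],\quad
I_\gamma = E\!\left[S_i S_i^{\top}\right], \\
V_{\text{est}} &= A^{-2}\,\operatorname{Var}\!\left(U_i + C^{\top} I_\gamma^{-1} S_i\right)
 = A^{-2}\left\{ E\!\left[U_i^{2}\right]
   - E\!\left[U_i S_i\right]^{\top} I_\gamma^{-1}\, E\!\left[U_i S_i\right] \right\}
 \;\le\; A^{-2}\,E\!\left[U_i^{2}\right] = V_{\text{known}}.
\end{aligned}
```

Under these assumptions the known-propensity variance is an upper bound, which is the sense in which a design based on it would be conservative. With a misspecified propensity model the identity for C, and hence the inequality, need not hold.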
Circularity Check
The derivation of the asymptotic variance for the IPW partial likelihood estimator, and of the sample size formula built on it, follows directly from established robust sandwich theory and does not reduce to self-definition or to inputs fitted by construction.
full rationale
The paper states it extends robust sandwich variance theory to obtain an explicit asymptotic variance for the IPW-weighted partial likelihood estimator of the marginal HR, then derives the sample size formula from that variance. No equations or steps in the provided description reduce the target formula to a fitted quantity, a self-citation chain, or an ansatz smuggled in by prior work of the same authors. The central claim remains an analytical derivation from standard semiparametric variance results, applicable to both RCTs and observational studies via an overlap coefficient; this is self-contained against external benchmarks and receives only a minor score for any incidental self-citation that is not load-bearing.
Axiom & Free-Parameter Ledger
free parameters (1)
- overlap coefficient
axioms (2)
- domain assumption: Marginal structural Cox proportional hazards model is correctly specified for the target estimand
- domain assumption: Propensity score model yields weights that produce consistent IPW estimation
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (unclear)
  unclear: Relation between the paper passage and the cited Recognition theorem.
  We extend the robust sandwich variance theory and derive the analytical form of the asymptotic variance for the inverse probability weighted partial likelihood estimator... V = A(τ_0)^{-2} B(τ_0) with ψ_i and η_i defined via weighted risk-set averages s_k(τ,t)
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean, costAlphaLog_fourth_deriv_at_zero (echoes)
  echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Proposition 1: Ṽ_RCT / Ṽ_Freed = 2 cosh(τ_0)[cosh(τ_0) − 1]/τ_0^2 ≥ 1
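The ratio quoted from Proposition 1 is easy to evaluate numerically. The sketch below computes it for a given hazard ratio, using the identity cosh(x) − 1 = 2 sinh²(x/2) to avoid cancellation near τ_0 = 0. The function name is illustrative, and reading the ratio as a multiplicative sample size factor is an assumption based on sample size being proportional to variance.

```python
import math


def proposition1_variance_ratio(hazard_ratio):
    """Evaluate 2*cosh(t)*(cosh(t) - 1) / t**2 at t = log(hazard_ratio)."""
    tau = math.log(hazard_ratio)
    if tau == 0.0:
        return 1.0  # limiting value as the hazard ratio approaches 1
    # cosh(t) - 1 == 2 * sinh(t / 2) ** 2, which is numerically stable for small t
    return 4.0 * math.cosh(tau) * math.sinh(tau / 2.0) ** 2 / tau ** 2


if __name__ == "__main__":
    for hr in (0.9, 0.7, 0.5, 2.0):
        print(hr, proposition1_variance_ratio(hr))
```

The ratio equals 1 at the null, is symmetric in the sign of τ_0, and grows as the hazard ratio moves away from 1.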
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.