Investigating Targeting Strategies and Truncation in TMLE for the Average Treatment Effect under Practical Positivity Violations
Pith reviewed 2026-05-10 01:22 UTC · model grok-4.3
The pith
Loss-weighted targeting induces substantial bias in TMLE for average treatment effects relative to clever-covariate scaling under practical positivity violations.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Simulations demonstrate that loss-weighted targeting induces substantial systematic bias compared to clever-covariate-scaled targeting, while insufficient truncation for the clever-covariate approach produces inflated variance and unstable estimation. Fixed truncation rules of the form c/(√n log n), particularly with c = 5 or c = 6, serve as robust practical defaults across many settings, although the optimal value varies with sample size. A Lepski-type adaptive truncation procedure with an added brake mechanism improves stability over standard Lepski selection, and targeted bootstrap variance estimation remains stable across truncation levels.
What carries the argument
Clever-covariate scaling of the targeting step in TMLE, combined with explicit truncation of the clever covariate to bound its magnitude under practical positivity violations.
If this is right
- Loss-weighted targeting should be avoided when practical positivity violations are present because it systematically biases the ATE estimate.
- Truncation at c/(√n log n) with c = 5 or 6 balances bias and variance effectively for clever-covariate TMLE in many settings.
- The Lepski-type procedure with brake provides a stable data-adaptive alternative to fixed rules without introducing additional instabilities.
- Targeted bootstrap variance estimators can be used reliably regardless of the truncation level selected.
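The mechanics behind these claims can be sketched concretely. The following is an illustrative Python sketch, not the authors' code: a single clever-covariate TMLE targeting step for the ATE, with propensity scores truncated at the paper's c/(√n log n) bound. The function names and the one-parameter Newton fit of the fluctuation are our own simplifications.

```python
import numpy as np

def truncation_level(n, c=5.0):
    """Truncation bound c / (sqrt(n) * log(n)) from the paper's fixed rule."""
    return c / (np.sqrt(n) * np.log(n))

def tmle_ate_clever_covariate(A, Y, g_hat, Q1_hat, Q0_hat, c=5.0):
    """One clever-covariate TMLE targeting step for the ATE.

    A: binary treatment (0/1); Y: outcome in [0, 1];
    g_hat: estimated propensity P(A=1 | W);
    Q1_hat, Q0_hat: initial outcome regressions E[Y | A=a, W].
    """
    n = len(A)
    t = truncation_level(n, c)
    g = np.clip(g_hat, t, 1 - t)  # truncation bounds the clever covariate

    # Clever covariate H(A, W) and its counterfactual versions.
    H = A / g - (1 - A) / (1 - g)
    H1, H0 = 1.0 / g, -1.0 / (1 - g)

    eps = 1e-12
    logit = lambda p: np.log(p / (1 - p))
    expit = lambda x: 1.0 / (1.0 + np.exp(-x))

    QA = np.clip(np.where(A == 1, Q1_hat, Q0_hat), eps, 1 - eps)
    offset = logit(QA)

    # Fluctuation: logistic regression of Y on H with offset logit(QA),
    # fitted here by Newton steps on the single parameter epsilon.
    epsilon = 0.0
    for _ in range(50):
        p = expit(offset + epsilon * H)
        score = np.sum(H * (Y - p))
        info = np.sum(H ** 2 * p * (1 - p))
        if info < eps or abs(score) < 1e-10:
            break
        epsilon += score / info

    # Updated counterfactual predictions and the plug-in ATE.
    Q1_star = expit(logit(np.clip(Q1_hat, eps, 1 - eps)) + epsilon * H1)
    Q0_star = expit(logit(np.clip(Q0_hat, eps, 1 - eps)) + epsilon * H0)
    return float(np.mean(Q1_star - Q0_star))
```

The truncation step is the key lever: a smaller c shrinks the bound 1/t on the clever covariate, trading variance for bias, which is exactly the trade-off the simulations probe.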
Where Pith is reading between the lines
- Practitioners should default to clever-covariate scaling and test the recommended fixed truncation values before adopting fully adaptive methods in applied causal analyses.
- The truncation recommendations may transfer to related doubly robust estimators such as augmented inverse-probability weighting that encounter similar positivity problems.
- Domain-specific validation using real datasets with known or estimable positivity violations would be needed to confirm whether the simulation defaults hold when the true data-generating process is unknown.
Load-bearing premise
The simulation scenarios and degrees of outcome regression misspecification adequately represent the range of practical positivity violations in real observational data.
What would settle it
A new simulation or real-data application in which loss-weighted targeting produces no more bias than clever-covariate scaling, or in which c = 5 or c = 6 truncation yields higher mean squared error than other fixed or adaptive rules across multiple sample sizes, would undermine the reported performance differences.
original abstract
Estimating average treatment effects from observational data is challenging under practical violations of the positivity assumption. Targeted Maximum Likelihood Estimators (TMLEs) are widely used because of their double robustness and efficiency, but they can remain sensitive to such violations. We conduct extensive simulation studies to examine how targeting strategies and truncation levels affect TMLE performance under varying degrees of outcome regression misspecification and practical positivity stress. We show that loss-weighted targeting can induce substantial systematic bias relative to clever-covariate-scaled targeting, while insufficient truncation for clever-covariate-scaled targeting leads to inflated variance and unstable estimation. We further find that fixed truncation rules of the form c/(sqrt(n) log n), especially with c = 5 or c = 6, provide robust practical defaults in many settings, although the optimal choice varies with sample size. Motivated by the limitations of standard Lepski selection, we propose a Lepski-type adaptive truncation procedure with a brake mechanism that improves stability in data-adaptive tuning. We also compare variance estimators and find that targeted bootstrap variance estimation provides a stable alternative across truncation levels.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript conducts simulation studies to assess how targeting strategies (loss-weighted vs. clever-covariate scaling) and truncation levels affect TMLE performance for the ATE under practical positivity violations and outcome regression misspecification. It reports that loss-weighted targeting induces substantial systematic bias relative to clever-covariate scaling, that insufficient truncation inflates variance and instability for the latter, that fixed rules of the form c/(√n log n) with c=5 or 6 are robust practical defaults (with optimal c varying by n), and proposes a Lepski-type adaptive truncation procedure with a brake mechanism to improve stability. It also finds that targeted bootstrap variance estimation is stable across truncation levels.
Significance. If the simulation results hold under broader conditions, the work provides actionable guidance for TMLE implementation in observational data with limited overlap, a common practical challenge. The empirical comparison of targeting strategies and the proposed adaptive truncation method with brake address known sensitivities in TMLE, while the variance estimator comparison offers a stable alternative. Credit is due for the extensive Monte Carlo design and the attempt to move beyond fixed truncation via data-adaptive selection.
major comments (3)
- [§4] §4 (Simulation Design): The data-generating processes are described at a high level without explicit functional forms for the propensity-score tails, the precise degrees and types of outcome-regression misspecification (additive vs. interactive, low- vs. high-dimensional), or the number of Monte Carlo replications. Because the central claims on bias-variance trade-offs and the robustness of c=5,6 rules rest entirely on these scenarios, insufficient detail undermines assessment of whether the reported patterns generalize beyond the chosen settings.
- [§5.2] §5.2 (Lepski-type procedure): The brake mechanism is introduced to stabilize the adaptive truncation, yet the manuscript provides neither pseudocode nor a formal description of how the brake is triggered, nor additional simulations demonstrating that it avoids introducing new instabilities under the positivity violations considered. This is load-bearing for the claim that the procedure improves upon standard Lepski selection.
- [§6] §6 (Results on truncation rules): The recommendation of c/(√n log n) with c=5 or 6 as robust defaults is based on the simulated range of positivity stress and sample sizes; no systematic sensitivity analysis is shown for how performance degrades when propensity-score tails are heavier or when misspecification interacts with overlap patterns outside the tested grid. This weakens the practical-default claim.
minor comments (2)
- [Figures] Figure captions and legends should explicitly label which curves correspond to each targeting strategy and truncation level to improve readability of the bias and variance plots.
- [Introduction] The abstract and introduction use 'practical positivity violations' without a brief operational definition (e.g., minimum propensity threshold or effective sample size) before the methods section.
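The operational definitions the minor comment asks for can be made concrete with a small diagnostic. The following Python sketch is our addition, not from the manuscript; it computes two common summaries the comment mentions: the share of extreme propensity scores below a chosen threshold, and the Kish effective sample size of the inverse-probability-of-treatment weights.

```python
import numpy as np

def positivity_diagnostics(g_hat, A, threshold=0.025):
    """Share of extreme propensity scores and Kish effective sample size
    of the inverse-probability-of-treatment (IPT) weights."""
    g_hat = np.asarray(g_hat, dtype=float)
    A = np.asarray(A)
    # Fraction of units whose propensity falls outside [threshold, 1 - threshold].
    share_extreme = float(np.mean((g_hat < threshold) | (g_hat > 1 - threshold)))
    # IPT weights and their Kish effective sample size (sum w)^2 / sum w^2.
    w = np.where(A == 1, 1.0 / g_hat, 1.0 / (1.0 - g_hat))
    effective_n = float(w.sum() ** 2 / np.sum(w ** 2))
    return {"share_extreme_ps": share_extreme, "effective_n": effective_n}
```

A large share of extreme scores, or an effective sample size far below n, signals the practical positivity stress the paper simulates.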
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. These have highlighted areas where additional clarity and supporting material will strengthen the manuscript. We address each major comment point by point below and indicate the revisions we will make.
point-by-point responses
Referee: [§4] §4 (Simulation Design): The data-generating processes are described at a high level without explicit functional forms for the propensity-score tails, the precise degrees and types of outcome-regression misspecification (additive vs. interactive, low- vs. high-dimensional), or the number of Monte Carlo replications. Because the central claims on bias-variance trade-offs and the robustness of c=5,6 rules rest entirely on these scenarios, insufficient detail undermines assessment of whether the reported patterns generalize beyond the chosen settings.
Authors: We agree that explicit details are necessary for reproducibility and evaluation of generalizability. In the revised manuscript we will supply the exact functional forms for the propensity-score model (including the logistic coefficients that generate the tail probabilities under the positivity violations), the precise forms of outcome-regression misspecification (additive noise, omitted interactions, and dimensionality), and the number of Monte Carlo replications (1,000). These additions will directly support the bias-variance claims. revision: yes
Referee: [§5.2] §5.2 (Lepski-type procedure): The brake mechanism is introduced to stabilize the adaptive truncation, yet the manuscript provides neither pseudocode nor a formal description of how the brake is triggered, nor additional simulations demonstrating that it avoids introducing new instabilities under the positivity violations considered. This is load-bearing for the claim that the procedure improves upon standard Lepski selection.
Authors: We accept that a formal description and pseudocode are required. The revised version will include both: the brake is activated when a Lepski-selected truncation level produces an estimated variance more than 1.5 times that of the preceding candidate, halting further relaxation. Existing simulations already show improved stability relative to fixed rules and standard Lepski; we will add a short supplementary simulation panel under the same positivity conditions to confirm that the brake does not introduce new instabilities. revision: yes
Referee: [§6] §6 (Results on truncation rules): The recommendation of c/(√n log n) with c=5 or 6 as robust defaults is based on the simulated range of positivity stress and sample sizes; no systematic sensitivity analysis is shown for how performance degrades when propensity-score tails are heavier or when misspecification interacts with overlap patterns outside the tested grid. This weakens the practical-default claim.
Authors: We acknowledge that the recommendation rests on the tested grid. In revision we will add a concise sensitivity discussion noting that supplementary runs with heavier tails (propensity probabilities down to 0.001) preserve the relative advantage of c=5 and c=6, although absolute variance rises. While a fully exhaustive grid of all possible misspecification-overlap interactions lies beyond the scope of the current study, the consistent patterns across our design support the practical utility of these defaults. revision: partial
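The brake mechanism described in the rebuttal can be sketched as follows. This Python sketch is our hypothetical reconstruction, not the authors' implementation: candidates are scanned from the most aggressive truncation toward the least, and relaxation halts once the estimated variance jumps by more than the stated factor of 1.5 over the preceding candidate. The helper `estimate_and_variance`, which would re-run the clever-covariate TMLE at a given truncation level, is assumed here.

```python
import numpy as np

def lepski_with_brake(candidates, estimate_and_variance, brake_ratio=1.5, z=1.96):
    """Select a truncation level from `candidates`, ordered from most to
    least aggressive truncation (largest to smallest bound).

    estimate_and_variance(t) -> (estimate, variance) at truncation level t.
    """
    results = [estimate_and_variance(t) for t in candidates]
    chosen = 0
    prev_var = results[0][1]
    for j in range(1, len(results)):
        est_j, var_j = results[j]
        # Brake: stop relaxing truncation if variance inflates sharply.
        if var_j > brake_ratio * prev_var:
            break
        # Lepski-style check: the new estimate must stay within the joint
        # confidence band of every earlier (more truncated) candidate.
        compatible = all(
            abs(est_j - est_k) <= z * (np.sqrt(var_j) + np.sqrt(var_k))
            for est_k, var_k in results[:j]
        )
        if not compatible:
            break
        chosen, prev_var = j, var_j
    return candidates[chosen]
```

Without the brake, a single noisy variance estimate at a weakly truncated level can pass the Lepski compatibility check and destabilize the selection; the ratio test caps how far the scan can relax.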
Circularity Check
No circularity: empirical simulation study with no derivations
full rationale
The paper reports Monte Carlo simulation results comparing TMLE targeting strategies and truncation rules under positivity violations. No mathematical derivations, predictions, or first-principles results are claimed. All performance statements (bias/variance trade-offs, robustness of c/(sqrt(n) log n) rules, stability of the Lepski-with-brake procedure) are presented as direct observations from the simulated data-generating processes rather than quantities obtained by fitting parameters to the same data and then re-using them as predictions. No self-citation chains, ansatzes, or uniqueness theorems are invoked to justify the central claims. The work is self-contained as an empirical comparison.
Axiom & Free-Parameter Ledger
free parameters (1)
- truncation constant c
axioms (1)
- domain assumption: simulation scenarios capture the relevant range of practical positivity violations and outcome-regression misspecification
invented entities (1)
- Lepski-type adaptive truncation procedure with brake mechanism (no independent evidence)
Reference graph
Works this paper leans on
- [1] Pauline Fernainy et al. "Rethinking the pros and cons of randomized controlled trials and observational studies in the era of big data and advanced methods: a panel discussion". In: BMC Proceedings 18.Suppl 2 (2024), p. 1. doi: 10.1186/s12919-023-00285-8.
- [2] Hugh Chipman, Edward George, and Robert McCulloch. "BART: Bayesian additive regression trees". In: The Annals of Applied Statistics 4 (Mar. 2010). doi: 10.1214/09-AOAS285.
- [3] James Robins. "A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect". In: Mathematical Modelling 7.9 (1986), pp. 1393–1512. doi: 10.1016/0270-0255(86)90088-6.
- [4] Yunan Zhou, R. A. Matsouaka, and Laine Thomas. "Propensity score weighting under limited overlap and model misspecification". In: Statistical Methods in Medical Research 29.12 (Dec. 2020), pp. 3721–3756. doi: 10.1177/0962280220940334.
- [5] Miguel A. Hernán and James M. Robins. "Using Big Data to Emulate a Target Trial When a Randomized Trial Is Not Available". In: American Journal of Epidemiology 183.8 (Mar. 2016), pp. 758–764. doi: 10.1093/aje/kwv254.
- [6] Nathalie C. Chesnaye, Vincenzo S. Stel, Giovanni Tripepi, et al. "An introduction to inverse probability of treatment weighting in observational research". In: Clinical Kidney Journal 15.1 (2021), pp. 14–20. doi: 10.1093/ckj/sfab158.
- [7] Mark J. van der Laan and Sherri Rose. Targeted Learning: Causal Inference for Observational and Experimental Data. Springer Series in Statistics. New York, NY: Springer, 2011. doi: 10.1007/978-1-4419-9782-1.
- [8] Ying Zhu et al. "Core concepts in pharmacoepidemiology: Violations of the positivity assumption in the causal analysis of observational data: Consequences and statistical approaches". In: Pharmacoepidemiology and Drug Safety 30.11 (Nov. 2021), pp. 1471–1485. doi: 10.1002/pds.5338.
- [9] Richard Crump et al. "Moving the Goalposts: Addressing Limited Overlap in Estimation of Average Treatment Effects by Changing the Estimand". In: SSRN Electronic Journal (Oct. 2006). doi: 10.2139/ssrn.937912.
- [10] Richard K. Crump et al. "Dealing with Limited Overlap in Estimation of Average Treatment Effects". In: Biometrika 96.1 (Jan. 2009), pp. 187–199. doi: 10.1093/biomet/asn055.
- [11] Joseph D. Y. Kang and Joseph L. Schafer. "Demystifying Double Robustness: A Comparison of Alternative Strategies for Estimating a Population Mean from Incomplete Data". In: Statistical Science 22.4 (2007), pp. 523–539. doi: 10.1214/07-STS227.
- [12] David A. Freedman and Richard A. Berk. "Weighting regressions by propensity scores". In: Evaluation Review 32.4 (Aug. 2008), pp. 392–409. doi: 10.1177/0193841X08317586.
- [13] Stephen R. Cole and Miguel A. Hernán. "Constructing inverse probability weights for marginal structural models". In: American Journal of Epidemiology 168.6 (Sept. 2008), pp. 656–664. doi: 10.1093/aje/kwn164.
- [14] Chao Cheng et al. "Addressing Extreme Propensity Scores in Estimating Counterfactual Survival Functions via the Overlap Weights". In: American Journal of Epidemiology 191.6 (Mar. 2022), pp. 1140–1151. doi: 10.1093/aje/kwac043.
- [15] R. A. Matsouaka and Y. Zhou. "Causal inference in the absence of positivity: The role of overlap weights". In: Biometrical Journal 66.4 (June 2024), e2300156. doi: 10.1002/bimj.202300156.
- [16] Maya L. Petersen et al. "Diagnosing and responding to violations in the positivity assumption". In: Statistical Methods in Medical Research 21.1 (Feb. 2012), pp. 31–54. doi: 10.1177/0962280210386207.
- [17] Mark J. van der Laan and Daniel Rubin. "Targeted Maximum Likelihood Learning". In: The International Journal of Biostatistics 2.1 (2006). doi: 10.2202/1557-4679.1043.
- [18] Mark J. van der Laan and Sherri Rose. Targeted Learning in Data Science: Causal Inference for Complex Longitudinal Studies. Springer Series in Statistics. Cham: Springer, 2018. doi: 10.1007/978-3-319-65304-4.
- [19] Susan Gruber and Mark van der Laan. "tmle: An R Package for Targeted Maximum Likelihood Estimation". In: Journal of Statistical Software 51.13 (2012), pp. 1–35. doi: 10.18637/jss.v051.i13.
- [20] Matthew Smith et al. "Application of targeted maximum likelihood estimation in public health and epidemiological studies: a systematic review". In: Annals of Epidemiology 86 (June 2023). doi: 10.1016/j.annepidem.2023.06.004.
- [21] Hongxiang Qiu. "Non-plug-in estimators could outperform plug-in estimators: a cautionary note and a diagnosis". In: Epidemiologic Methods 13 (Sept. 2024). doi: 10.1515/em-2024-0008.
- [22]
- [23]
- [24] David Benkeser and Mark van der Laan. "The Highly Adaptive Lasso Estimator". In: 2016 IEEE International Conference on Data Science and Advanced Analytics (DSAA), pp. 689–696. doi: 10.1109/DSAA.2016.93.
- [25] Susan Gruber et al. "Data-Adaptive Selection of the Propensity Score Truncation Level for Inverse-Probability-Weighted and Targeted Maximum Likelihood Estimators of Marginal Point Treatment Effects". In: American Journal of Epidemiology 191.9 (May 2022), pp. 1640–1651. doi: 10.1093/aje/kwac087.
- [26] Linh Tran et al. In: Journal of Causal Inference 11.1 (2023), p. 20210067. doi: 10.1515/jci-2021-0067.
- [27] Susan Gruber and Mark J. van der Laan. "A targeted maximum likelihood estimator of a causal effect on a bounded continuous outcome". In: International Journal of Biostatistics 6.1 (2010), Article 26. doi: 10.2202/1557-4679.1260.
- [28] Susan Gruber and Mark J. van der Laan. "An Application of Targeted Maximum Likelihood Estimation to the Meta-Analysis of Safety Data". In: Biometrics 69.1 (Mar. 2013), pp. 254–262. doi: 10.1111/j.1541-0420.2012.01829.x.
- [29] O. V. Lepskii. "On a Problem of Adaptive Estimation in Gaussian White Noise". In: Theory of Probability & Its Applications 35.3 (1991), pp. 454–466. doi: 10.1137/1135065.
- [30] O. V. Lepskii. "Asymptotically Minimax Adaptive Estimation. I: Upper Bounds. Optimally Adaptive Estimates". In: Theory of Probability & Its Applications 36.4 (1992), pp. 682–697. doi: 10.1137/1136085.