
arxiv: 2606.25107 · v1 · pith:3SYEXEMFnew · submitted 2026-06-23 · 📊 stat.ME

Nonparametric Estimation of Optimal Stochastic Just-In-Time Adaptive Interventions for Distal Outcomes

Pith reviewed 2026-06-25 21:44 UTC · model grok-4.3

classification 📊 stat.ME
keywords: nonparametric estimation · just-in-time adaptive interventions · stochastic policies · distal outcomes · regimen-response curve · Gaussian process · optimal policy · data-adaptive tilting

The pith

A nonparametric estimator recovers the regimen-response curve for distal outcomes under stochastic policies in just-in-time interventions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs an estimator of how different stochastic treatment regimens affect long-term, end-of-study outcomes when each person faces many decision points. In this setting, existing methods either become unstable or target short-term effects rather than the distal outcome. The proposed estimator is nonparametrically efficient for the full curve linking policies to distal outcomes, and a tilting step keeps the estimates stable as the number of decision points grows. Weak convergence of the curve estimate to a Gaussian process then supplies simultaneous confidence bands, while separate asymptotic results cover the policy that optimizes the curve. Together these give a complete route to estimation, inference, and optimization when the scientific target is the distal outcome rather than an intermediate one.

Core claim

We develop a nonparametrically efficient estimator of the regimen-response curve for distal outcomes under a class of stochastic policies and introduce a data-adaptive tilting procedure to stabilize estimation in settings with many decision points. We show that the estimated regimen-response curve converges weakly to a Gaussian process, enabling simultaneous confidence bands, and we derive asymptotic theory for the optimizer of the curve, thereby enabling inference for the learned optimal stochastic policy.

What carries the argument

The nonparametrically efficient estimator of the regimen-response curve (the mapping from stochastic policies to expected distal outcomes), together with the data-adaptive tilting procedure that keeps the estimator stable when the number of decision points per person is large.
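
To make the stability problem concrete, here is a minimal sketch of why plain inverse-probability weighting degrades as decision points accumulate. It is not the paper's estimator and does not implement its tilting procedure. Everything in it is an illustrative assumption: the toy micro-randomized design with a known randomization probability of 0.5, the policy class "treat with probability delta at every decision point", the outcome model, and the hypothetical function names `simulate_mrt` and `ipw_curve`.

```python
import numpy as np

def simulate_mrt(n, T, rng, p_rand=0.5):
    """Toy micro-randomized trial (assumed design): binary treatment at each
    of T decision points, randomized with known probability p_rand, and one
    distal outcome driven by the fraction of decision points treated."""
    A = rng.binomial(1, p_rand, size=(n, T))
    Y = 1.0 + 2.0 * A.mean(axis=1) + rng.normal(0.0, 1.0, size=n)
    return A, Y

def ipw_curve(A, Y, deltas, p_rand=0.5):
    """Self-normalized IPW estimate of E[Y] under each policy
    'treat with probability delta at every decision point', together with
    the Kish effective sample size of the weights."""
    n, T = A.shape
    n_treated = A.sum(axis=1)
    est, ess = [], []
    for d in deltas:
        # Log of the product over decision points of pi_delta(A_t) / p_rand(A_t).
        log_w = (n_treated * np.log(d / p_rand)
                 + (T - n_treated) * np.log((1 - d) / (1 - p_rand)))
        w = np.exp(log_w - log_w.max())  # rescaling cancels after normalization
        est.append(np.sum(w * Y) / np.sum(w))
        ess.append(np.sum(w) ** 2 / np.sum(w ** 2))
    return np.array(est), np.array(ess)

rng = np.random.default_rng(0)
deltas = np.array([0.3, 0.5, 0.7])
results = {}
for T in (5, 50):
    A, Y = simulate_mrt(2000, T, rng)
    results[T] = ipw_curve(A, Y, deltas)
    print(T, np.round(results[T][0], 2), np.round(results[T][1]))
```

In this toy setup the true curve is 1 + 2·delta. When delta equals the randomization probability, every weight is one and the effective sample size equals n. Away from that point the weights are products over T ratios, so their spread grows with T and the effective sample size collapses, which is the instability that the paper's tilting step is meant to address.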

If this is right

  • Simultaneous confidence bands become available for the entire regimen-response curve.
  • Asymptotic inference is justified for the stochastic policy that optimizes the curve.
  • Estimation and optimization can target distal end-of-study outcomes rather than proximal or discounted ones.
  • Stable estimates are obtained even when each individual contributes dozens of decision points.
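
The first consequence can be illustrated with a generic recipe for turning a Gaussian-process limit into a simultaneous band: a Gaussian multiplier bootstrap over estimated influence-function values on a grid of policies. This is a sketch of that standard technique, not the paper's construction. The function name `simultaneous_band`, the grid, and the toy data standing in for influence-function values are all assumptions.

```python
import numpy as np

def simultaneous_band(psi_hat, phi, level=0.95, n_boot=2000, seed=0):
    """Sup-t band from a Gaussian multiplier bootstrap.

    psi_hat : (K,) curve estimates on a grid of K policies.
    phi     : (n, K) estimated influence-function values, one row per subject.
    Returns (lower, upper, critical value).
    """
    n, K = phi.shape
    phi = phi - phi.mean(axis=0)                 # center each grid point
    se = phi.std(axis=0, ddof=1) / np.sqrt(n)    # pointwise standard errors
    rng = np.random.default_rng(seed)
    xi = rng.normal(size=(n_boot, n))            # Gaussian multipliers
    # Each row is one draw of the standardized process over the K grid points.
    draws = (xi @ phi) / (n * se)
    crit = np.quantile(np.abs(draws).max(axis=1), level)
    return psi_hat - crit * se, psi_hat + crit * se, crit

# Toy stand-in data (assumption): correlated columns mimic a smooth curve
# whose estimates at nearby policies share subject-level noise.
rng = np.random.default_rng(1)
grid = np.linspace(0.1, 0.9, 9)
n = 500
X = 1.0 + 2.0 * grid + rng.normal(size=(n, 1)) + rng.normal(size=(n, grid.size))
psi_hat = X.mean(axis=0)
phi = X - psi_hat
lower, upper, crit = simultaneous_band(psi_hat, phi)
print(round(float(crit), 2))
```

The critical value exceeds the pointwise 1.96 because the band must cover the whole curve at once, and it stays below a Bonferroni correction when neighbouring grid points are positively correlated.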

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same curve-estimation strategy could be applied to adaptive interventions outside mobile-health settings whenever distal outcomes are the primary scientific target.
  • The Gaussian-process limit opens the possibility of using functional data tools to compare multiple candidate policies on the full curve rather than only at the optimum.
  • The tilting construction may extend to other high-frequency decision settings in which standard inverse-probability weighting becomes unstable.

Load-bearing premise

The data-adaptive tilting procedure stabilizes the estimator without introducing bias or changing the target parameter when the number of decision points per individual is large.

What would settle it

A simulation study with many decision points per individual in which the estimator fails to attain the nonparametric efficiency bound or the curve estimate does not converge weakly to a Gaussian process would falsify the central claims.

Figures

Figures reproduced from arXiv: 2606.25107 by Ashkan Ertefaie, Jack M. Wolf, Nandita Mitra.

Figure 1: Scaled pointwise bias for the regimen response curve.
Figure 2: Pointwise coverage for the regimen response curve.
Figure 3: Pointwise relative efficiency for the regimen response curve versus the oracle
Figure 4: Pointwise relative mean squared error for the regimen response curve for each
Original abstract

Mobile and wearable technologies enable the delivery of just-in-time adaptive interventions (JITAIs) -- interventions that adapt treatment delivery to an individual's rapidly changing internal state and context in real-time, real-world settings. Estimating optimal JITAIs, however, remains challenging because these studies often involve dozens of decision points per individual, and existing methods can produce unstable and irregular estimators with substantial bias and slow convergence rates. Advanced reinforcement learning approaches may be difficult to interpret and often target proximal, discounted outcomes rather than the distal end-of-study outcomes that define long-term success in many behavioral and clinical studies. To address these challenges, we develop a nonparametrically efficient estimator of the regimen-response curve for distal outcomes under a class of stochastic policies and introduce a data-adaptive tilting procedure to stabilize estimation in settings with many decision points. We show that the estimated regimen-response curve converges weakly to a Gaussian process, enabling simultaneous confidence bands, and we derive asymptotic theory for the optimizer of the curve, thereby enabling inference for the learned optimal stochastic policy. These developments provide a unified framework for estimation, inference, and optimization of stochastic JITAIs for distal outcomes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript develops a nonparametrically efficient estimator of the regimen-response curve for distal outcomes under a class of stochastic policies in just-in-time adaptive interventions (JITAIs). It introduces a data-adaptive tilting procedure to stabilize estimation when the number of decision points is large, establishes that the estimated curve converges weakly to a Gaussian process (enabling simultaneous confidence bands), and derives asymptotic theory for the optimizer of the curve to support inference on the learned optimal stochastic policy.

Significance. If the central claims hold, the work supplies a unified semiparametric framework for estimation, inference, and optimization of stochastic JITAIs that target distal rather than proximal outcomes. The nonparametric efficiency result and weak convergence to a Gaussian process are standard once influence functions and Donsker conditions are verified, but their application to high-dimensional decision-point settings with distal outcomes is practically relevant for mobile-health studies. The data-adaptive tilting procedure, if shown to preserve the target parameter asymptotically, would be a useful stabilization device.

minor comments (2)
  1. The abstract and introduction should explicitly state the precise class of stochastic policies under which the regimen-response curve is defined and the form of the tilting weights (e.g., whether they depend on estimated propensity scores or outcome regressions).
  2. Notation for the regimen-response curve, the tilting parameter, and the optimizer should be introduced once and used consistently throughout the manuscript.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our manuscript and for recommending minor revision. The referee's description accurately reflects the paper's contributions on the nonparametric estimator for the regimen-response curve under stochastic policies, the data-adaptive tilting procedure, weak convergence to a Gaussian process, and asymptotic results for the optimizer. No major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity; derivations are self-contained semiparametric results

full rationale

The paper develops a nonparametric estimator for the regimen-response curve under stochastic policies, establishes weak convergence to a Gaussian process, and derives asymptotics for its optimizer. These follow standard semiparametric efficiency arguments once an influence function is identified and Donsker conditions hold. The data-adaptive tilting is presented as a stabilization step whose asymptotic effect is controlled by the same efficiency theory without redefining the target parameter. No equations reduce a claimed prediction or uniqueness result to a fitted input or self-citation by construction. The abstract and claimed results contain independent statistical content that does not collapse to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

Only the abstract is available, so the ledger is necessarily incomplete; standard regularity conditions for nonparametric estimation and weak convergence are presumed but not listed.

axioms (1)
  • (standard math) Standard regularity conditions for nonparametric efficiency and weak convergence of the estimator to a Gaussian process
    Invoked to support the convergence and inference claims.

pith-pipeline@v0.9.1-grok · 5735 in / 1263 out tokens · 20736 ms · 2026-06-25T21:44:05.832235+00:00 · methodology

