Nonparametric Estimation of Optimal Stochastic Just-In-Time Adaptive Interventions for Distal Outcomes

Ashkan Ertefaie; Jack M. Wolf; Nandita Mitra

arxiv: 2606.25107 · v1 · pith:3SYEXEMFnew · submitted 2026-06-23 · 📊 stat.ME

Nonparametric Estimation of Optimal Stochastic Just-In-Time Adaptive Interventions for Distal Outcomes

Jack M. Wolf , Nandita Mitra , Ashkan Ertefaie This is my paper

Pith reviewed 2026-06-25 21:44 UTC · model grok-4.3

classification 📊 stat.ME

keywords nonparametric estimationjust-in-time adaptive interventionsstochastic policiesdistal outcomesregimen-response curveGaussian processoptimal policydata-adaptive tilting

0 comments

The pith

A nonparametric estimator recovers the regimen-response curve for distal outcomes under stochastic policies in just-in-time interventions.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper constructs an estimator for how different stochastic treatment regimens affect long-term end-of-study outcomes when each person faces many decision points. Standard methods become unstable or target only short-term effects instead. The approach achieves nonparametric efficiency for the full curve that links policies to distal outcomes and adds a tilting step that keeps estimates stable as the number of decisions grows. Weak convergence of the curve estimate to a Gaussian process then supplies simultaneous confidence bands, while separate asymptotic results cover the policy that optimizes the curve. This supplies a complete route to estimation, inference, and optimization when the scientific target is the distal outcome rather than an intermediate one.

Core claim

We develop a nonparametrically efficient estimator of the regimen-response curve for distal outcomes under a class of stochastic policies and introduce a data-adaptive tilting procedure to stabilize estimation in settings with many decision points. We show that the estimated regimen-response curve converges weakly to a Gaussian process, enabling simultaneous confidence bands, and we derive asymptotic theory for the optimizer of the curve, thereby enabling inference for the learned optimal stochastic policy.

What carries the argument

The nonparametrically efficient estimator of the regimen-response curve (the mapping from stochastic policies to expected distal outcomes), together with the data-adaptive tilting procedure that keeps the estimator stable when the number of decision points per person is large.

If this is right

Simultaneous confidence bands become available for the entire regimen-response curve.
Asymptotic inference is justified for the stochastic policy that optimizes the curve.
Estimation and optimization can target distal end-of-study outcomes rather than proximal or discounted ones.
Stable estimates are obtained even when each individual contributes dozens of decision points.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same curve-estimation strategy could be applied to adaptive interventions outside mobile-health settings whenever distal outcomes are the primary scientific target.
The Gaussian-process limit opens the possibility of using functional data tools to compare multiple candidate policies on the full curve rather than only at the optimum.
The tilting construction may extend to other high-frequency decision settings in which standard inverse-probability weighting becomes unstable.

Load-bearing premise

The data-adaptive tilting procedure stabilizes the estimator without introducing bias or changing the target parameter when the number of decision points per individual is large.

What would settle it

A simulation study with many decision points per individual in which the estimator fails to attain the nonparametric efficiency bound or the curve estimate does not converge weakly to a Gaussian process would falsify the central claims.

Figures

Figures reproduced from arXiv: 2606.25107 by Ashkan Ertefaie, Jack M. Wolf, Nandita Mitra.

**Figure 2.** Figure 2: Pointwise coverage for the regimen response curve. [PITH_FULL_IMAGE:figures/full_fig_p025_2.png] view at source ↗

**Figure 3.** Figure 3: Pointwise relative efficiency for the regimen response curve versus the oracle [PITH_FULL_IMAGE:figures/full_fig_p027_3.png] view at source ↗

**Figure 4.** Figure 4: Pointwise relative mean squared error for the regimen response curve for each [PITH_FULL_IMAGE:figures/full_fig_p028_4.png] view at source ↗

read the original abstract

Mobile and wearable technologies enable the delivery of just-in-time adaptive interventions (JITAIs) -- interventions that adapt treatment delivery to an individual's rapidly changing internal state and context in real-time, real-world settings. Estimating optimal JITAIs, however, remains challenging because these studies often involve dozens of decision points per individual, and existing methods can produce unstable and irregular estimators with substantial bias and slow convergence rates. Advanced reinforcement learning approaches may be difficult to interpret and often target proximal, discounted outcomes rather than the distal end-of-study outcomes that define long-term success in many behavioral and clinical studies. To address these challenges, we develop a nonparametrically efficient estimator of the regimen-response curve for distal outcomes under a class of stochastic policies and introduce a data-adaptive tilting procedure to stabilize estimation in settings with many decision points. We show that the estimated regimen-response curve converges weakly to a Gaussian process, enabling simultaneous confidence bands, and we derive asymptotic theory for the optimizer of the curve, thereby enabling inference for the learned optimal stochastic policy. These developments provide a unified framework for estimation, inference, and optimization of stochastic JITAIs for distal outcomes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper gives a nonparametric estimator for stochastic JITAIs targeting distal outcomes, with tilting to stabilize long sequences and theory for GP bands plus inference on the optimizer.

read the letter

The main point is a nonparametric efficient estimator of the regimen-response curve under stochastic policies for distal outcomes, paired with data-adaptive tilting to handle dozens of decision points per person.

What is new is the combination of efficiency theory, tilting, and weak convergence to a Gaussian process specifically for end-of-study outcomes rather than proximal or discounted ones. The paper also supplies asymptotics for the optimizer so that inference on the learned policy is possible.

It does a solid job identifying the practical problem: existing methods get unstable or biased with many time points, and most RL-style approaches miss the distal target that matters for long-term behavioral outcomes. The tilting step is a direct attempt to fix that without losing the semiparametric efficiency.

The soft spot is whether the data-adaptive tilting actually preserves the target parameter and the efficiency bound when the number of decision points grows. The abstract presents this as controlled, but the full paper must show that the adaptation does not introduce bias or break the Donsker conditions needed for the Gaussian-process limit. Without seeing the influence-function derivation or simulation checks, it is hard to judge how fragile the procedure is in realistic mobile-health data.

This is for statisticians working on causal methods or policy learning in longitudinal studies with many time points. A reader who already knows semiparametric efficiency theory will see the technical development clearly.

It deserves a serious referee to verify the tilting argument and the convergence conditions.

Referee Report

0 major / 2 minor

Summary. The manuscript develops a nonparametrically efficient estimator of the regimen-response curve for distal outcomes under a class of stochastic policies in just-in-time adaptive interventions (JITAIs). It introduces a data-adaptive tilting procedure to stabilize estimation when the number of decision points is large, establishes that the estimated curve converges weakly to a Gaussian process (enabling simultaneous confidence bands), and derives asymptotic theory for the optimizer of the curve to support inference on the learned optimal stochastic policy.

Significance. If the central claims hold, the work supplies a unified semiparametric framework for estimation, inference, and optimization of stochastic JITAIs that target distal rather than proximal outcomes. The nonparametric efficiency result and weak convergence to a Gaussian process are standard once influence functions and Donsker conditions are verified, but their application to high-dimensional decision-point settings with distal outcomes is practically relevant for mobile-health studies. The data-adaptive tilting procedure, if shown to preserve the target parameter asymptotically, would be a useful stabilization device.

minor comments (2)

The abstract and introduction should explicitly state the precise class of stochastic policies under which the regimen-response curve is defined and the form of the tilting weights (e.g., whether they depend on estimated propensity scores or outcome regressions).
Notation for the regimen-response curve, the tilting parameter, and the optimizer should be introduced once and used consistently; multiple symbols for the same object appear in the abstract.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of our manuscript and for recommending minor revision. The referee's description accurately reflects the paper's contributions on the nonparametric estimator for the regimen-response curve under stochastic policies, the data-adaptive tilting procedure, weak convergence to a Gaussian process, and asymptotic results for the optimizer. No major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity; derivations are self-contained semiparametric results

full rationale

The paper develops a nonparametric estimator for the regimen-response curve under stochastic policies, establishes weak convergence to a Gaussian process, and derives asymptotics for its optimizer. These follow standard semiparametric efficiency arguments once an influence function is identified and Donsker conditions hold. The data-adaptive tilting is presented as a stabilization step whose asymptotic effect is controlled by the same efficiency theory without redefining the target parameter. No equations reduce a claimed prediction or uniqueness result to a fitted input or self-citation by construction. The abstract and claimed results contain independent statistical content that does not collapse to the inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Only the abstract is available, so the ledger is necessarily incomplete; standard regularity conditions for nonparametric estimation and weak convergence are presumed but not listed.

axioms (1)

standard math Standard regularity conditions for nonparametric efficiency and weak convergence of the estimator to a Gaussian process
Invoked to support the convergence and inference claims.

pith-pipeline@v0.9.1-grok · 5735 in / 1263 out tokens · 20736 ms · 2026-06-25T21:44:05.832235+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

77 extracted references · 34 canonical work pages · 1 internal anchor

[1]

and van der Laan, Mark J

Ertefaie, Ashkan and Duttweiler, Luke and Johnson, Brent A. and van der Laan, Mark J. , month = sep, year =. Nonparametric estimation of a covariate-adjusted counterfactual treatment regimen response curve , url =. doi:10.48550/arXiv.2309.16099 , abstract =

work page doi:10.48550/arxiv.2309.16099
[2]

The International Journal of Biostatistics , volume=

Targeted estimation of nuisance parameters to obtain valid statistical inference , author=. The International Journal of Biostatistics , volume=. 2014 , publisher=

2014
[3]

Biometrika , volume=

Doubly robust nonparametric inference on the average treatment effect , author=. Biometrika , volume=. 2017 , publisher=

2017
[4]

Biometrics , author =

Nonparametric inverse-probability-weighted estimators based on the highly adaptive lasso , volume =. Biometrics , author =. 2023 , keywords =. doi:10.1111/biom.13719 , abstract =

work page doi:10.1111/biom.13719 2023
[5]

Contemporary Clinical Trials , author =

The mobile assistance for regulating smoking (. Contemporary Clinical Trials , author =. 2021 , keywords =. doi:10.1016/j.cct.2021.106513 , abstract =

work page doi:10.1016/j.cct.2021.106513 2021
[6]

, year =

Benkeser, David and van der Laan, Mark , month = oct, year =. The highly adaptive lasso estimator , url =. 2016. doi:10.1109/DSAA.2016.93 , abstract =

work page doi:10.1109/dsaa.2016.93 2016
[7]

and Ridder, Geert , year =

Efficient estimation of average treatment effects using the estimated propensity score , volume =. Econometrica , author =. 2003 , pages =. doi:10.1111/1468-0262.00442 , abstract =

work page doi:10.1111/1468-0262.00442 2003
[8]

The International Journal of Biostatistics , author =

A generally efficient targeted minimum loss based estimator based on the highly adaptive lasso , volume =. The International Journal of Biostatistics , author =. doi:10.1515/ijb-2015-0097 , abstract =

work page doi:10.1515/ijb-2015-0097 2015
[9]

Biometrika , author =

Estimating time-varying causal excursion effects in mobile health with binary outcomes , volume =. Biometrika , author =. 2021 , pages =. doi:10.1093/biomet/asaa070 , abstract =

work page doi:10.1093/biomet/asaa070 2021
[10]

Journal of the American Statistical Association 112(518), 859–877 (2017) https://doi.org/10.1080/01621459.2017.1285773 https://doi.org/10.1080/01621459.2017.1285773

Nonparametric causal effects based on incremental propensity score interventions , volume =. Journal of the American Statistical Association , author =. 2019 , keywords =. doi:10.1080/01621459.2017.1422737 , abstract =

work page doi:10.1080/01621459.2017.1422737 2019
[11]

and Wolfson, Julian , month = mar, year =

Barnard, Martha and Huling, Jared D. and Wolfson, Julian , month = mar, year =. A unified framework for causal estimand selection , url =. doi:10.48550/arXiv.2410.12093 , abstract =

work page doi:10.48550/arxiv.2410.12093
[12]

The International Journal of Biostatistics , author =

Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part. The International Journal of Biostatistics , author =. 2010 , file =. doi:10.2202/1557-4679.1200 , abstract =

work page doi:10.2202/1557-4679.1200 2010
[13]

Journal of the American Statistical Association , author =

Marginal mean models for dynamic regimes , volume =. Journal of the American Statistical Association , author =. 2001 , pmid =. doi:10.1198/016214501753382327 , abstract =

work page doi:10.1198/016214501753382327 2001
[14]

Mathematical Modelling , author =

A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect , volume =. Mathematical Modelling , author =. 1986 , pages =. doi:10.1016/0270-0255(86)90088-6 , abstract =

work page doi:10.1016/0270-0255(86)90088-6 1986
[15]

Journal of the American Statistical Association , author =

Estimating dynamic treatment regimes in mobile health using. Journal of the American Statistical Association , author =. 2020 , pages =. doi:10.1080/01621459.2018.1537919 , language =

work page doi:10.1080/01621459.2018.1537919 2020
[16]

Annales de l'I.H.P

Inefficient estimators of the bivariate survival function for three models , volume =. Annales de l'I.H.P. Probabilités et statistiques , author =. 1995 , pages =

1995
[17]

Biometrics , author =

Nonparametric assessment of regimen response curve estimators , volume =. Biometrics , author =. 2025 , pages =. doi:10.1093/biomtc/ujaf066 , abstract =

work page doi:10.1093/biomtc/ujaf066 2025
[18]

Journal of the American Statistical Association 112(518), 859–877 (2017) https://doi.org/10.1080/01621459.2017.1285773 https://doi.org/10.1080/01621459.2017.1285773

Assessing time-varying causal effect moderation in mobile health , volume =. Journal of the American Statistical Association , author =. 2018 , pages =. doi:10.1080/01621459.2017.1305274 , abstract =

work page doi:10.1080/01621459.2017.1305274 2018
[19]

Journal of the American Statistical Association , volume =

Off-policy estimation of long-term average outcomes with applications to mobile health , volume =. Journal of the American Statistical Association , author =. 2021 , pages =. doi:10.1080/01621459.2020.1807993 , abstract =

work page doi:10.1080/01621459.2020.1807993 2021
[20]

Psychological Methods , author =

The microrandomized trial for developing digital interventions:. Psychological Methods , author =. 2022 , keywords =. doi:10.1037/met0000283 , abstract =

work page doi:10.1037/met0000283 2022
[21]

Journal of Statistical Planning and Inference , author =

Nonparametric causal effects based on marginal structural models , volume =. Journal of Statistical Planning and Inference , author =. 2007 , keywords =. doi:10.1016/j.jspi.2005.12.008 , abstract =

work page doi:10.1016/j.jspi.2005.12.008 2007
[22]

, year =

van der Vaart, Aad W. , year =. Asymptotic
[23]

van der Vaart, A. W. and Wellner, Jon A. , year =. Weak
[24]

Journal of Machine Learning Research , author =

Asymptotic inference for multi-stage stationary treatment policy with variable selection , volume =. Journal of Machine Learning Research , author =. 2025 , pages =

2025
[25]

Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning , volume =

Kallus, Nathan and Uehara, Masatoshi , year =. Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning , volume =. Advances in
[26]

More robust doubly robust off-policy evaluation , url =

Farajtabar, Mehrdad and Chow, Yinlam and Ghavamzadeh, Mohammad , month = jul, year =. More robust doubly robust off-policy evaluation , url =. Proceedings of the 35th
[27]

Biometrika , author =

Constructing dynamic treatment regimes over indefinite time horizons , volume =. Biometrika , author =. 2018 , pages =

2018
[28]

Robins, Miguel A

Marginal structural models and causal inference in epidemiology , volume =. Epidemiology (Cambridge, Mass.) , author =. 2000 , keywords =. doi:10.1097/00001648-200009000-00011 , abstract =

work page doi:10.1097/00001648-200009000-00011 2000
[29]

The American Statistician , author =

Model selection, confounder control, and marginal structural models:. The American Statistician , author =. 2004 , pages =

2004
[30]

and Williams, Nicholas and Diaz, Ivan , month = jun, year =

McClean, Alec and Levis, Alexander W. and Williams, Nicholas and Diaz, Ivan , month = jun, year =. Longitudinal weighted and trimmed treatment effects with flip interventions , url =. doi:10.48550/arXiv.2506.09188 , abstract =

work page doi:10.48550/arxiv.2506.09188
[31]

The International Journal of Biostatistics , author =

Causal effect models for realistic individualized treatment and intention to treat rules , volume =. The International Journal of Biostatistics , author =. 2007 , keywords =. doi:10.2202/1557-4679.1022 , abstract =

work page doi:10.2202/1557-4679.1022 2007
[32]

Biometrics , author =

Population intervention causal effects based on stochastic interventions , volume =. Biometrics , author =. 2012 , keywords =. doi:10.1111/j.1541-0420.2011.01685.x , abstract =

work page doi:10.1111/j.1541-0420.2011.01685.x 2012
[33]

Statistics in Medicine , author =

Estimation of the effect of interventions that modify the received treatment , volume =. Statistics in Medicine , author =. 2013 , keywords =. doi:10.1002/sim.5907 , abstract =

work page doi:10.1002/sim.5907 2013
[34]

Econometrica , author =

Policy learning with observational data , volume =. Econometrica , author =. 2021 , keywords =. doi:10.3982/ECTA15732 , abstract =

work page doi:10.3982/ecta15732 2021
[35]

Zubizarreta

Residual weighted learning for estimating individualized treatment rules , volume =. Journal of the American Statistical Association , author =. 2017 , keywords =. doi:10.1080/01621459.2015.1093947 , abstract =

work page doi:10.1080/01621459.2015.1093947 2017
[36]

Statistics in Medicine , author =

Reinforcement learning design for cancer clinical trials , volume =. Statistics in Medicine , author =. 2009 , keywords =. doi:10.1002/sim.3720 , abstract =

work page doi:10.1002/sim.3720 2009
[37]

, month = dec, year =

Manski, Charles F. , month = dec, year =. Identification for
[38]

Zou, Hui , month = dec, year =. The. Journal of the American Statistical Association , publisher =. doi:10.1198/016214506000000735 , abstract =

work page doi:10.1198/016214506000000735
[39]

The International Journal of Biostatistics , author =

Targeted estimation of nuisance parameters to obtain valid statistical inference , volume =. The International Journal of Biostatistics , author =. 2014 , keywords =. doi:10.1515/ijb-2012-0038 , abstract =

work page doi:10.1515/ijb-2012-0038 2014
[40]

Biometrika , author =

Doubly robust nonparametric inference on the average treatment effect , volume =. Biometrika , author =. 2017 , pages =. doi:10.1093/biomet/asx053 , abstract =

work page doi:10.1093/biomet/asx053 2017
[41]

Hekler, Saul Shiffman, Audrey Boruvka, Daniel Almirall, Ambuj Tewari, and Susan A

Klasnja, Predrag and Hekler, Eric B. and Shiffman, Saul and Boruvka, Audrey and Almirall, Daniel and Tewari, Ambuj and Murphy, Susan A. , year =. Microrandomized trials:. Health Psychology , publisher =. doi:10.1037/hea0000305 , abstract =

work page doi:10.1037/hea0000305
[42]

Annals of Behavioral Medicine: A Publication of the Society of Behavioral Medicine , author =

Just-in-time adaptive interventions (. Annals of Behavioral Medicine: A Publication of the Society of Behavioral Medicine , author =. 2017 , pages =. doi:10.1007/s12160-016-9830-8 , abstract =

work page doi:10.1007/s12160-016-9830-8 2017
[43]

Kang, Joseph D. Y. and Schafer, Joseph L. , month = nov, year =. Demystifying double robustness:. Statistical Science , publisher =. doi:10.1214/07-STS227 , abstract =

work page doi:10.1214/07-sts227
[44]

and Bibaut, Aurélien and Luedtke, Alexander R

van der Laan, Mark J. and Bibaut, Aurélien and Luedtke, Alexander R. , editor =. Targeted. 2018 , pages =. doi:10.1007/978-3-319-65304-4_25 , abstract =

work page doi:10.1007/978-3-319-65304-4_25 2018
[45]

Highly Adaptive Principal Component Regression

Wang, Mingxun and Schuler, Alejandro and Laan, Mark van der and Meixide, Carlos García , month = may, year =. Highly adaptive principal component regression , url =. doi:10.48550/arXiv.2602.10613 , abstract =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2602.10613
[46]

and Wager, S

Athey, S. and Wager, S. (2021). Policy learning with observational data. Econometrica , 89(1):133--161

2021
[47]

J., and Gilbert, P

Benkeser, D., Carone, M., van der Laan, M. J., and Gilbert, P. B. (2017). Doubly robust nonparametric inference on the average treatment effect. Biometrika , 104(4):863--880

2017
[48]

and van der Laan, M

Benkeser, D. and van der Laan, M. (2016). The highly adaptive lasso estimator. In 2016 IEEE International Conference on Data Science and Advanced Analytics ( DSAA ) , pages 689--696

2016
[49]

Boruvka, A., Almirall, D., Witkiewitz, K., and Murphy, S. A. (2018). Assessing time-varying causal effect moderation in mobile health. Journal of the American Statistical Association , 113(523):1112--1121

2018
[50]

A., and van der Laan, M

Ertefaie, A., Duttweiler, L., Johnson, B. A., and van der Laan, M. J. (2023a). Nonparametric estimation of a covariate-adjusted counterfactual treatment regimen response curve. arXiv:2309.16099 [math]

arXiv
[51]

S., and van der Laan, M

Ertefaie, A., Hejazi, N. S., and van der Laan, M. J. (2023b). Nonparametric inverse-probability-weighted estimators based on the highly adaptive lasso. Biometrics , 79(2):1029--1041
[52]

and Strawderman, R

Ertefaie, A. and Strawderman, R. L. (2018). Constructing dynamic treatment regimes over indefinite time horizons. Biometrika , 105(4):963--977

2018
[53]

Gao, D., Liu, Y., and Zeng, D. (2025). Asymptotic inference for multi-stage stationary treatment policy with variable selection. Journal of Machine Learning Research , 26(167):1--50

2025
[54]

D., van der Laan, M

Gill, R. D., van der Laan, M. J., and Wellner, J. A. (1995). Inefficient estimators of the bivariate survival function for three models. Annales de l'I.H.P. Probabilités et statistiques , 31(3):545--597

1995
[55]

and Rotnitzky, A

Haneuse, S. and Rotnitzky, A. (2013). Estimation of the effect of interventions that modify the received treatment. Statistics in Medicine , 32(30):5260--5277

2013
[56]

M., Have, T

Joffe, M. M., Have, T. R. T., Feldman, H. I., and Kimmel, S. E. (2004). Model selection, confounder control, and marginal structural models: Review and new applications. The American Statistician , 58(4):272--279

2004
[57]

Kang, J. D. Y. and Schafer, J. L. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science , 22(4):523--539

2007
[58]

Kennedy, E. H. (2019). Nonparametric causal effects based on incremental propensity score interventions. Journal of the American Statistical Association , 114(526):645--656

2019
[59]

B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., and Murphy, S

Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., and Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology , 34(Suppl):1220--1228

2015
[60]

Liao, P., Klasnja, P., and Murphy, S. (2021). Off-policy estimation of long-term average outcomes with applications to mobile health. Journal of the American Statistical Association , 116(533):382--391

2021
[61]

J., Laber, E

Luckett, D. J., Laber, E. B., Kahkoska, A. R., Maahs, D. M., Mayer-Davis, E., and Kosorok, M. R. (2020). Estimating dynamic treatment regimes in mobile health using V -learning. Journal of the American Statistical Association , 115(530):692--706

2020
[62]

W., Williams, N., and Diaz, I

McClean, A., Levis, A. W., Williams, N., and Diaz, I. (2025). Longitudinal weighted and trimmed treatment effects with flip interventions. arXiv:2506.09188 [stat]

arXiv 2025
[63]

A., van der Laan, M

Murphy, S. A., van der Laan, M. J., and Robins, J. M. (2001). Marginal mean models for dynamic regimes. Journal of the American Statistical Association , 96(456):1410--1423

2001
[64]

Muñoz, I. D. and van der Laan, M. (2012). Population intervention causal effects based on stochastic interventions. Biometrics , 68(2):541--549

2012
[65]

N., Lam, C

Nahum-Shani, I., Potter, L. N., Lam, C. Y., Yap, J., Moreno, A., Stoffel, R., Wu, Z., Wan, N., Dempsey, W., Kumar, S., Ertin, E., Murphy, S. A., Rehg, J. M., and Wetter, D. W. (2021). The mobile assistance for regulating smoking ( MARS ) micro-randomized trial design protocol. Contemporary Clinical Trials , 110:106513

2021
[66]

N., Spring, B

Nahum-Shani, I., Smith, S. N., Spring, B. J., Collins, L. M., Witkiewitz, K., Tewari, A., and Murphy, S. A. (2017). Just-in-time adaptive interventions ( JITAIs ) in mobile health: Key components and design principles for ongoing health behavior support. Annals of Behavioral Medicine: A Publication of the Society of Behavioral Medicine , 52(6):446--462

2017
[67]

Orellana, L., Rotnitzky, A., and Robins, J. M. (2010). Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part I : Main content. The International Journal of Biostatistics , 6(2)

2010
[68]

T., Baer, B

Pham, C. T., Baer, B. R., and Ertefaie, A. (2025). Nonparametric assessment of regimen response curve estimators. Biometrics , 81(2):ujaf066

2025
[69]

E., Collins, L

Qian, T., Walton, A. E., Collins, L. M., Klasnja, P., Lanza, S. T., Nahum-Shani, I., Rabbi, M., Russell, M. A., Walton, M. A., Yoo, H., and Murphy, S. A. (2022). The microrandomized trial for developing digital interventions: Experimental design and data analysis considerations. Psychological Methods , 27(5):874--894

2022
[70]

Qian, T., Yoo, H., Klasnja, P., Almirall, D., and Murphy, S. A. (2021). Estimating time-varying causal excursion effects in mobile health with binary outcomes. Biometrika , 108(3):507--527

2021
[71]

Robins, J. (1986). A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical Modelling , 7(9):1393--1512

1986
[72]

M., Hernán, M

Robins, J. M., Hernán, M. A., and Brumback, B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiology (Cambridge, Mass.) , 11(5):550--560

2000
[73]

J., Bibaut, A., and Luedtke, A

van der Laan, M. J., Bibaut, A., and Luedtke, A. R. (2018). CV - TMLE for Nonpathwise Differentiable Target Parameters . In van der Laan, M. J. and Rose, S., editors, Targeted Learning in Data Science : Causal Inference for Complex Longitudinal Studies , pages 455--481. Springer International Publishing, Cham

2018
[74]

van der Laan, M. J. and Petersen, M. L. (2007). Causal effect models for realistic individualized treatment and intention to treat rules. The International Journal of Biostatistics , 3(1):Article 3

2007
[75]

van der Vaart, A. W. (2007). Asymptotic Statistics . Number 3 in Cambridge series on statistical and probabilistic mathematics. Cambridge University Press, Cambridge, first paperback edition, 8th printing edition

2007
[76]

R., and Zeng, D

Zhao, Y., Kosorok, M. R., and Zeng, D. (2009). Reinforcement learning design for cancer clinical trials. Statistics in Medicine , 28(26):3294--3315

2009
[77]

Zhou, X., Mayer-Hamblett, N., Khan, U., and Kosorok, M. R. (2017). Residual weighted learning for estimating individualized treatment rules. Journal of the American Statistical Association , 112(517):169--187

2017

[1] [1]

and van der Laan, Mark J

Ertefaie, Ashkan and Duttweiler, Luke and Johnson, Brent A. and van der Laan, Mark J. , month = sep, year =. Nonparametric estimation of a covariate-adjusted counterfactual treatment regimen response curve , url =. doi:10.48550/arXiv.2309.16099 , abstract =

work page doi:10.48550/arxiv.2309.16099

[2] [2]

The International Journal of Biostatistics , volume=

Targeted estimation of nuisance parameters to obtain valid statistical inference , author=. The International Journal of Biostatistics , volume=. 2014 , publisher=

2014

[3] [3]

Biometrika , volume=

Doubly robust nonparametric inference on the average treatment effect , author=. Biometrika , volume=. 2017 , publisher=

2017

[4] [4]

Biometrics , author =

Nonparametric inverse-probability-weighted estimators based on the highly adaptive lasso , volume =. Biometrics , author =. 2023 , keywords =. doi:10.1111/biom.13719 , abstract =

work page doi:10.1111/biom.13719 2023

[5] [5]

Contemporary Clinical Trials , author =

The mobile assistance for regulating smoking (. Contemporary Clinical Trials , author =. 2021 , keywords =. doi:10.1016/j.cct.2021.106513 , abstract =

work page doi:10.1016/j.cct.2021.106513 2021

[6] [6]

, year =

Benkeser, David and van der Laan, Mark , month = oct, year =. The highly adaptive lasso estimator , url =. 2016. doi:10.1109/DSAA.2016.93 , abstract =

work page doi:10.1109/dsaa.2016.93 2016

[7] [7]

and Ridder, Geert , year =

Efficient estimation of average treatment effects using the estimated propensity score , volume =. Econometrica , author =. 2003 , pages =. doi:10.1111/1468-0262.00442 , abstract =

work page doi:10.1111/1468-0262.00442 2003

[8] [8]

The International Journal of Biostatistics , author =

A generally efficient targeted minimum loss based estimator based on the highly adaptive lasso , volume =. The International Journal of Biostatistics , author =. doi:10.1515/ijb-2015-0097 , abstract =

work page doi:10.1515/ijb-2015-0097 2015

[9] [9]

Biometrika , author =

Estimating time-varying causal excursion effects in mobile health with binary outcomes , volume =. Biometrika , author =. 2021 , pages =. doi:10.1093/biomet/asaa070 , abstract =

work page doi:10.1093/biomet/asaa070 2021

[10] [10]

Journal of the American Statistical Association 112(518), 859–877 (2017) https://doi.org/10.1080/01621459.2017.1285773 https://doi.org/10.1080/01621459.2017.1285773

Nonparametric causal effects based on incremental propensity score interventions , volume =. Journal of the American Statistical Association , author =. 2019 , keywords =. doi:10.1080/01621459.2017.1422737 , abstract =

work page doi:10.1080/01621459.2017.1422737 2019

[11] [11]

and Wolfson, Julian , month = mar, year =

Barnard, Martha and Huling, Jared D. and Wolfson, Julian , month = mar, year =. A unified framework for causal estimand selection , url =. doi:10.48550/arXiv.2410.12093 , abstract =

work page doi:10.48550/arxiv.2410.12093

[12] [12]

The International Journal of Biostatistics , author =

Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part. The International Journal of Biostatistics , author =. 2010 , file =. doi:10.2202/1557-4679.1200 , abstract =

work page doi:10.2202/1557-4679.1200 2010

[13] [13]

Journal of the American Statistical Association , author =

Marginal mean models for dynamic regimes , volume =. Journal of the American Statistical Association , author =. 2001 , pmid =. doi:10.1198/016214501753382327 , abstract =

work page doi:10.1198/016214501753382327 2001

[14] [14]

Mathematical Modelling , author =

A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect , volume =. Mathematical Modelling , author =. 1986 , pages =. doi:10.1016/0270-0255(86)90088-6 , abstract =

work page doi:10.1016/0270-0255(86)90088-6 1986

[15] [15]

Journal of the American Statistical Association , author =

Estimating dynamic treatment regimes in mobile health using. Journal of the American Statistical Association , author =. 2020 , pages =. doi:10.1080/01621459.2018.1537919 , language =

work page doi:10.1080/01621459.2018.1537919 2020

[16] [16]

Annales de l'I.H.P

Inefficient estimators of the bivariate survival function for three models , volume =. Annales de l'I.H.P. Probabilités et statistiques , author =. 1995 , pages =

1995

[17] [17]

Biometrics , author =

Nonparametric assessment of regimen response curve estimators , volume =. Biometrics , author =. 2025 , pages =. doi:10.1093/biomtc/ujaf066 , abstract =

work page doi:10.1093/biomtc/ujaf066 2025

[18] [18]

Journal of the American Statistical Association 112(518), 859–877 (2017) https://doi.org/10.1080/01621459.2017.1285773 https://doi.org/10.1080/01621459.2017.1285773

Assessing time-varying causal effect moderation in mobile health , volume =. Journal of the American Statistical Association , author =. 2018 , pages =. doi:10.1080/01621459.2017.1305274 , abstract =

work page doi:10.1080/01621459.2017.1305274 2018

[19] [19]

Journal of the American Statistical Association , volume =

Off-policy estimation of long-term average outcomes with applications to mobile health , volume =. Journal of the American Statistical Association , author =. 2021 , pages =. doi:10.1080/01621459.2020.1807993 , abstract =

work page doi:10.1080/01621459.2020.1807993 2021

[20] [20]

Psychological Methods , author =

The microrandomized trial for developing digital interventions:. Psychological Methods , author =. 2022 , keywords =. doi:10.1037/met0000283 , abstract =

work page doi:10.1037/met0000283 2022

[21] [21]

Journal of Statistical Planning and Inference , author =

Nonparametric causal effects based on marginal structural models , volume =. Journal of Statistical Planning and Inference , author =. 2007 , keywords =. doi:10.1016/j.jspi.2005.12.008 , abstract =

work page doi:10.1016/j.jspi.2005.12.008 2007

[22] [22]

, year =

van der Vaart, Aad W. , year =. Asymptotic

[23] [23]

van der Vaart, A. W. and Wellner, Jon A. , year =. Weak

[24] [24]

Journal of Machine Learning Research , author =

Asymptotic inference for multi-stage stationary treatment policy with variable selection , volume =. Journal of Machine Learning Research , author =. 2025 , pages =

2025

[25] [25]

Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning , volume =

Kallus, Nathan and Uehara, Masatoshi , year =. Intrinsically efficient, stable, and bounded off-policy evaluation for reinforcement learning , volume =. Advances in

[26] [26]

More robust doubly robust off-policy evaluation , url =

Farajtabar, Mehrdad and Chow, Yinlam and Ghavamzadeh, Mohammad , month = jul, year =. More robust doubly robust off-policy evaluation , url =. Proceedings of the 35th

[27] [27]

Biometrika , author =

Constructing dynamic treatment regimes over indefinite time horizons , volume =. Biometrika , author =. 2018 , pages =

2018

[28] [28]

Robins, Miguel A

Marginal structural models and causal inference in epidemiology , volume =. Epidemiology (Cambridge, Mass.) , author =. 2000 , keywords =. doi:10.1097/00001648-200009000-00011 , abstract =

work page doi:10.1097/00001648-200009000-00011 2000

[29] [29]

The American Statistician , author =

Model selection, confounder control, and marginal structural models:. The American Statistician , author =. 2004 , pages =

2004

[30] [30]

and Williams, Nicholas and Diaz, Ivan , month = jun, year =

McClean, Alec and Levis, Alexander W. and Williams, Nicholas and Diaz, Ivan , month = jun, year =. Longitudinal weighted and trimmed treatment effects with flip interventions , url =. doi:10.48550/arXiv.2506.09188 , abstract =

work page doi:10.48550/arxiv.2506.09188

[31] [31]

The International Journal of Biostatistics , author =

Causal effect models for realistic individualized treatment and intention to treat rules , volume =. The International Journal of Biostatistics , author =. 2007 , keywords =. doi:10.2202/1557-4679.1022 , abstract =

work page doi:10.2202/1557-4679.1022 2007

[32] [32]

Biometrics , author =

Population intervention causal effects based on stochastic interventions , volume =. Biometrics , author =. 2012 , keywords =. doi:10.1111/j.1541-0420.2011.01685.x , abstract =

work page doi:10.1111/j.1541-0420.2011.01685.x 2012

[33] [33]

Statistics in Medicine , author =

Estimation of the effect of interventions that modify the received treatment , volume =. Statistics in Medicine , author =. 2013 , keywords =. doi:10.1002/sim.5907 , abstract =

work page doi:10.1002/sim.5907 2013

[34] [34]

Econometrica , author =

Policy learning with observational data , volume =. Econometrica , author =. 2021 , keywords =. doi:10.3982/ECTA15732 , abstract =

work page doi:10.3982/ecta15732 2021

[35] [35]

Zubizarreta

Residual weighted learning for estimating individualized treatment rules , volume =. Journal of the American Statistical Association , author =. 2017 , keywords =. doi:10.1080/01621459.2015.1093947 , abstract =

work page doi:10.1080/01621459.2015.1093947 2017

[36] [36]

Statistics in Medicine , author =

Reinforcement learning design for cancer clinical trials , volume =. Statistics in Medicine , author =. 2009 , keywords =. doi:10.1002/sim.3720 , abstract =

work page doi:10.1002/sim.3720 2009

[37] [37]

, month = dec, year =

Manski, Charles F. , month = dec, year =. Identification for

[38] [38]

Zou, Hui , month = dec, year =. The. Journal of the American Statistical Association , publisher =. doi:10.1198/016214506000000735 , abstract =

work page doi:10.1198/016214506000000735

[39] [39]

The International Journal of Biostatistics , author =

Targeted estimation of nuisance parameters to obtain valid statistical inference , volume =. The International Journal of Biostatistics , author =. 2014 , keywords =. doi:10.1515/ijb-2012-0038 , abstract =

work page doi:10.1515/ijb-2012-0038 2014

[40] [40]

Biometrika , author =

Doubly robust nonparametric inference on the average treatment effect , volume =. Biometrika , author =. 2017 , pages =. doi:10.1093/biomet/asx053 , abstract =

work page doi:10.1093/biomet/asx053 2017

[41] [41]

Hekler, Saul Shiffman, Audrey Boruvka, Daniel Almirall, Ambuj Tewari, and Susan A

Klasnja, Predrag and Hekler, Eric B. and Shiffman, Saul and Boruvka, Audrey and Almirall, Daniel and Tewari, Ambuj and Murphy, Susan A. , year =. Microrandomized trials:. Health Psychology , publisher =. doi:10.1037/hea0000305 , abstract =

work page doi:10.1037/hea0000305

[42] [42]

Annals of Behavioral Medicine: A Publication of the Society of Behavioral Medicine , author =

Just-in-time adaptive interventions (. Annals of Behavioral Medicine: A Publication of the Society of Behavioral Medicine , author =. 2017 , pages =. doi:10.1007/s12160-016-9830-8 , abstract =

work page doi:10.1007/s12160-016-9830-8 2017

[43] [43]

Kang, Joseph D. Y. and Schafer, Joseph L. , month = nov, year =. Demystifying double robustness:. Statistical Science , publisher =. doi:10.1214/07-STS227 , abstract =

work page doi:10.1214/07-sts227

[44] [44]

and Bibaut, Aurélien and Luedtke, Alexander R

van der Laan, Mark J. and Bibaut, Aurélien and Luedtke, Alexander R. , editor =. Targeted. 2018 , pages =. doi:10.1007/978-3-319-65304-4_25 , abstract =

work page doi:10.1007/978-3-319-65304-4_25 2018

[45] [45]

Highly Adaptive Principal Component Regression

Wang, Mingxun and Schuler, Alejandro and Laan, Mark van der and Meixide, Carlos García , month = may, year =. Highly adaptive principal component regression , url =. doi:10.48550/arXiv.2602.10613 , abstract =

work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.2602.10613

[46] [46]

and Wager, S

Athey, S. and Wager, S. (2021). Policy learning with observational data. Econometrica , 89(1):133--161

2021

[47] [47]

J., and Gilbert, P

Benkeser, D., Carone, M., van der Laan, M. J., and Gilbert, P. B. (2017). Doubly robust nonparametric inference on the average treatment effect. Biometrika , 104(4):863--880

2017

[48] [48]

and van der Laan, M

Benkeser, D. and van der Laan, M. (2016). The highly adaptive lasso estimator. In 2016 IEEE International Conference on Data Science and Advanced Analytics ( DSAA ) , pages 689--696

2016

[49] [49]

Boruvka, A., Almirall, D., Witkiewitz, K., and Murphy, S. A. (2018). Assessing time-varying causal effect moderation in mobile health. Journal of the American Statistical Association , 113(523):1112--1121

2018

[50] [50]

A., and van der Laan, M

Ertefaie, A., Duttweiler, L., Johnson, B. A., and van der Laan, M. J. (2023a). Nonparametric estimation of a covariate-adjusted counterfactual treatment regimen response curve. arXiv:2309.16099 [math]

arXiv

[51] [51]

S., and van der Laan, M

Ertefaie, A., Hejazi, N. S., and van der Laan, M. J. (2023b). Nonparametric inverse-probability-weighted estimators based on the highly adaptive lasso. Biometrics , 79(2):1029--1041

[52] [52]

and Strawderman, R

Ertefaie, A. and Strawderman, R. L. (2018). Constructing dynamic treatment regimes over indefinite time horizons. Biometrika , 105(4):963--977

2018

[53] [53]

Gao, D., Liu, Y., and Zeng, D. (2025). Asymptotic inference for multi-stage stationary treatment policy with variable selection. Journal of Machine Learning Research , 26(167):1--50

2025

[54] [54]

D., van der Laan, M

Gill, R. D., van der Laan, M. J., and Wellner, J. A. (1995). Inefficient estimators of the bivariate survival function for three models. Annales de l'I.H.P. Probabilités et statistiques , 31(3):545--597

1995

[55] [55]

and Rotnitzky, A

Haneuse, S. and Rotnitzky, A. (2013). Estimation of the effect of interventions that modify the received treatment. Statistics in Medicine , 32(30):5260--5277

2013

[56] [56]

M., Have, T

Joffe, M. M., Have, T. R. T., Feldman, H. I., and Kimmel, S. E. (2004). Model selection, confounder control, and marginal structural models: Review and new applications. The American Statistician , 58(4):272--279

2004

[57] [57]

Kang, J. D. Y. and Schafer, J. L. (2007). Demystifying double robustness: A comparison of alternative strategies for estimating a population mean from incomplete data. Statistical Science , 22(4):523--539

2007

[58] [58]

Kennedy, E. H. (2019). Nonparametric causal effects based on incremental propensity score interventions. Journal of the American Statistical Association , 114(526):645--656

2019

[59] [59]

B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., and Murphy, S

Klasnja, P., Hekler, E. B., Shiffman, S., Boruvka, A., Almirall, D., Tewari, A., and Murphy, S. A. (2015). Microrandomized trials: An experimental design for developing just-in-time adaptive interventions. Health Psychology , 34(Suppl):1220--1228

2015

[60] [60]

Liao, P., Klasnja, P., and Murphy, S. (2021). Off-policy estimation of long-term average outcomes with applications to mobile health. Journal of the American Statistical Association , 116(533):382--391

2021

[61] [61]

J., Laber, E

Luckett, D. J., Laber, E. B., Kahkoska, A. R., Maahs, D. M., Mayer-Davis, E., and Kosorok, M. R. (2020). Estimating dynamic treatment regimes in mobile health using V -learning. Journal of the American Statistical Association , 115(530):692--706

2020

[62] [62]

W., Williams, N., and Diaz, I

McClean, A., Levis, A. W., Williams, N., and Diaz, I. (2025). Longitudinal weighted and trimmed treatment effects with flip interventions. arXiv:2506.09188 [stat]

arXiv 2025

[63] [63]

A., van der Laan, M

Murphy, S. A., van der Laan, M. J., and Robins, J. M. (2001). Marginal mean models for dynamic regimes. Journal of the American Statistical Association , 96(456):1410--1423

2001

[64] [64]

Muñoz, I. D. and van der Laan, M. (2012). Population intervention causal effects based on stochastic interventions. Biometrics , 68(2):541--549

2012

[65] [65]

N., Lam, C

Nahum-Shani, I., Potter, L. N., Lam, C. Y., Yap, J., Moreno, A., Stoffel, R., Wu, Z., Wan, N., Dempsey, W., Kumar, S., Ertin, E., Murphy, S. A., Rehg, J. M., and Wetter, D. W. (2021). The mobile assistance for regulating smoking ( MARS ) micro-randomized trial design protocol. Contemporary Clinical Trials , 110:106513

2021

[66] [66]

N., Spring, B

Nahum-Shani, I., Smith, S. N., Spring, B. J., Collins, L. M., Witkiewitz, K., Tewari, A., and Murphy, S. A. (2017). Just-in-time adaptive interventions ( JITAIs ) in mobile health: Key components and design principles for ongoing health behavior support. Annals of Behavioral Medicine: A Publication of the Society of Behavioral Medicine , 52(6):446--462

2017

[67] [67]

Orellana, L., Rotnitzky, A., and Robins, J. M. (2010). Dynamic regime marginal structural mean models for estimation of optimal dynamic treatment regimes, part I : Main content. The International Journal of Biostatistics , 6(2)

2010

[68] [68]

T., Baer, B

Pham, C. T., Baer, B. R., and Ertefaie, A. (2025). Nonparametric assessment of regimen response curve estimators. Biometrics , 81(2):ujaf066

2025

[69] [69]

E., Collins, L

Qian, T., Walton, A. E., Collins, L. M., Klasnja, P., Lanza, S. T., Nahum-Shani, I., Rabbi, M., Russell, M. A., Walton, M. A., Yoo, H., and Murphy, S. A. (2022). The microrandomized trial for developing digital interventions: Experimental design and data analysis considerations. Psychological Methods , 27(5):874--894

2022

[70] [70]

Qian, T., Yoo, H., Klasnja, P., Almirall, D., and Murphy, S. A. (2021). Estimating time-varying causal excursion effects in mobile health with binary outcomes. Biometrika , 108(3):507--527

2021

[71] [71]

Robins, J. (1986). A new approach to causal inference in mortality studies with a sustained exposure period—application to control of the healthy worker survivor effect. Mathematical Modelling , 7(9):1393--1512

1986

[72] [72]

M., Hernán, M

Robins, J. M., Hernán, M. A., and Brumback, B. (2000). Marginal structural models and causal inference in epidemiology. Epidemiology (Cambridge, Mass.) , 11(5):550--560

2000

[73] [73]

J., Bibaut, A., and Luedtke, A

van der Laan, M. J., Bibaut, A., and Luedtke, A. R. (2018). CV - TMLE for Nonpathwise Differentiable Target Parameters . In van der Laan, M. J. and Rose, S., editors, Targeted Learning in Data Science : Causal Inference for Complex Longitudinal Studies , pages 455--481. Springer International Publishing, Cham

2018

[74] [74]

van der Laan, M. J. and Petersen, M. L. (2007). Causal effect models for realistic individualized treatment and intention to treat rules. The International Journal of Biostatistics , 3(1):Article 3

2007

[75] [75]

van der Vaart, A. W. (2007). Asymptotic Statistics . Number 3 in Cambridge series on statistical and probabilistic mathematics. Cambridge University Press, Cambridge, first paperback edition, 8th printing edition

2007

[76] [76]

R., and Zeng, D

Zhao, Y., Kosorok, M. R., and Zeng, D. (2009). Reinforcement learning design for cancer clinical trials. Statistics in Medicine , 28(26):3294--3315

2009

[77] [77]

Zhou, X., Mayer-Hamblett, N., Khan, U., and Kosorok, M. R. (2017). Residual weighted learning for estimating individualized treatment rules. Journal of the American Statistical Association , 112(517):169--187

2017