An adaptive variance estimator for relative sparsity

Samuel Julian Weisenthal

arXiv: 2605.02112 · v1 · submitted 2026-05-04 · stat.ME

Pith reviewed 2026-05-08 19:33 UTC · model grok-4.3

keywords: theorem, variance, adaptive, estimator, fully, policy, relative, selection

The pith

A new adaptive variance estimator for relative sparsity coefficients is introduced that fully utilizes the prior asymptotic normality theorem and incorporates variable selection effects.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Relative sparsity is a statistical approach used in policy learning where only a subset of available variables are selected as important for making decisions, such as treatment policies in medicine. Earlier work created methods for inference under relative sparsity and proved an asymptotic normality result for the adaptive lasso estimator, but that result was not completely applied when calculating the variance of the selected policy coefficients. This paper creates a new variance estimator designed to use the full theorem and to adjust for the fact that variable selection has already occurred. The goal is to produce more accurate uncertainty measures that can be shown in graphical selection diagrams. These diagrams help visualize which variables were chosen and how reliable the choices are. The authors argue this will support safer use of such methods when learning policies for clinical practice.

Core claim

Here, we develop a new coefficient variance estimator that fully uses this theorem and, in the process, takes into account the variable selection.

Load-bearing premise

That the adaptive lasso asymptotic normality theorem from prior work applies directly to the new variance estimator and that incorporating variable selection will meaningfully improve uncertainty representation in the graphical diagrams without additional unstated conditions.
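For orientation, the classical adaptive lasso result for a linear model (Zou, 2006) has the shape sketched below. This is a textbook analogue, not the theorem from the relative sparsity work, which this page does not reproduce. The symbols $A$, $\hat{A}_n$, $C$, $\lambda_n$, $\gamma$, and $\hat\sigma^2$ are assumed notation introduced here, and $\hat\beta^{\mathrm{init}}$ stands for any root-$n$ consistent initial estimator such as ordinary least squares.

```latex
% Linear-model analogue (assumed notation, not the paper's theorem).
% A = \{j : \beta_j \neq 0\} is the true active set, \hat{A}_n the selected set.
\hat\beta^{(n)} = \arg\min_{\beta}\; \lVert y - X\beta \rVert_2^2
   + \lambda_n \sum_{j=1}^{p} \hat w_j \lvert \beta_j \rvert,
\qquad \hat w_j = \lvert \hat\beta_j^{\mathrm{init}} \rvert^{-\gamma},\ \gamma > 0.

% If n^{-1} X^\top X \to C, \lambda_n/\sqrt{n} \to 0
% and \lambda_n n^{(\gamma-1)/2} \to \infty, then
\Pr(\hat{A}_n = A) \to 1,
\qquad
\sqrt{n}\,\bigl(\hat\beta^{(n)}_{A} - \beta_{A}\bigr)
   \xrightarrow{d} N\bigl(0,\ \sigma^2 C_{AA}^{-1}\bigr).

% A plug-in variance built on the selected set rather than all p coordinates:
\widehat{\operatorname{Var}}\bigl(\hat\beta^{(n)}_{\hat{A}_n}\bigr)
   = \hat\sigma^2 \bigl(X_{\hat{A}_n}^\top X_{\hat{A}_n}\bigr)^{-1}.
```

In this analogue, "fully using" the normality result would mean basing the variance on the $C_{AA}$ block for the selected coordinates. Whether the paper's estimator takes exactly this form cannot be determined from the abstract.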

Figures

Figures reproduced from arXiv: 2605.02112 by Samuel Julian Weisenthal.

**Figure 1.** Selection diagram. We show an identical selection diagram to the one in …

**Figure 2.** Selection diagrams for the real data. We show an identical real data selection diagram to the one in Weisenthal et al. [2023b], but the estimator for the variance of the coefficients is now (1) rather than (25) in Weisenthal et al. [2023b]. Recall that the shaded regions in the coefficient panels correspond to the theoretical variances (using (1) here), and the dotted lines to the empirical variances (one …

Original abstract

An approach to inference for relative sparsity was developed in prior work, and an adaptive lasso asymptotic normality theorem was given there, but this theorem was not fully used when estimating the variance of the policy coefficients. Here, we develop a new coefficient variance estimator that fully uses this theorem and, in the process, takes into account the variable selection. This improves the uncertainty representation in the graphical selection diagrams, ultimately facilitating the safe use of policy learning in clinical medicine.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

This paper supplies the missing variance estimator for adaptive lasso coefficients under relative sparsity by fully using the prior asymptotic normality theorem and adjusting for selection.

The main point is that this paper supplies the missing variance estimator for the coefficients in relative sparsity by fully deploying the adaptive lasso asymptotic normality theorem and folding in the effects of variable selection. It does this in a direct way that avoids new parameters or conditions. The result is better uncertainty quantification visible in the selection diagrams, which supports safer application of these policy learning methods. The extension feels like a logical completion of the prior work rather than a big departure.

The paper is thinner on the empirical side. The improvement is shown conceptually and in the figures, but there are no simulation experiments that compare the new estimator to the old one or to the truth in controlled settings. This makes it harder to judge the magnitude of the gain for typical data sizes. The clinical medicine use case is mentioned but stays at the level of motivation.

Statisticians focused on inference after selection in high dimensions will get the most from this. It is a narrow but solid addition to that literature. I think it should go to peer review. The central derivation appears sound and addresses a real incompleteness in the earlier approach.

Referee Report

2 major / 3 minor

Summary. The manuscript develops a new adaptive variance estimator for policy coefficients under relative sparsity. Building on an adaptive lasso asymptotic normality theorem from prior work, the estimator incorporates variable selection effects into the variance formula to improve uncertainty representation in graphical selection diagrams, with the goal of enabling safer policy learning in clinical medicine.

Significance. If the derivation correctly extends the prior theorem without bias or condition violations and the improvement is empirically confirmed, the estimator could strengthen inference tools for sparse high-dimensional models in applied settings. The emphasis on practical medical applications and direct use of existing asymptotic results is a strength, though the absence of explicit error bounds or benchmark comparisons in the provided description limits assessment of robustness.

major comments (2)

[Derivation of the variance estimator] The central extension of the adaptive lasso asymptotic normality theorem to the new variance estimator (described after the theorem statement) must explicitly confirm that incorporating the variable selection step preserves the theorem's regularity conditions; otherwise the claimed improvement in uncertainty quantification may not hold and could reduce to quantities already fitted in the prior result.
[Empirical validation or simulation results] Table or figure comparing the new variance estimates to the prior estimator and to external benchmarks is needed to demonstrate that the adaptive adjustment provides independent grounding rather than circular dependence on the selection procedure; without this, the claim of improved uncertainty representation in the selection diagrams remains unverified.

minor comments (3)

[Abstract] The abstract would benefit from a concise statement of any simulation or real-data validation performed to support the uncertainty improvement claim.
[Notation and definitions] Notation for the new variance estimator should be introduced with explicit reference to the corresponding quantities in the cited prior theorem to aid readability.
[Introduction and background] Ensure all references to the prior work include specific equation numbers from that paper when invoking the asymptotic normality result.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their detailed and constructive report. We address each major comment below and have revised the manuscript to strengthen the presentation of the theoretical extension and to include empirical validation.

Referee: The central extension of the adaptive lasso asymptotic normality theorem to the new variance estimator (described after the theorem statement) must explicitly confirm that incorporating the variable selection step preserves the theorem's regularity conditions; otherwise the claimed improvement in uncertainty quantification may not hold and could reduce to quantities already fitted in the prior result.

Authors: We agree that explicit confirmation of the preserved regularity conditions is essential. In the revised manuscript we have added a dedicated paragraph immediately after the theorem statement. This paragraph verifies that the adaptive lasso selection step remains consistent under the relative sparsity assumption from the prior work, so that the original regularity conditions (including the required rate conditions on the penalty and the design matrix) continue to hold. The new variance estimator is then derived directly from the asymptotic normality result without introducing additional bias or circular dependence on the selection indicators. We believe this clarification fully addresses the concern. revision: yes
Referee: Table or figure comparing the new variance estimates to the prior estimator and to external benchmarks is needed to demonstrate that the adaptive adjustment provides independent grounding rather than circular dependence on the selection procedure; without this, the claim of improved uncertainty representation in the selection diagrams remains unverified.

Authors: We concur that simulation evidence is needed to substantiate the practical gain. The revised manuscript now contains a new simulation section (Section 4) with a table and accompanying figure. The table reports Monte Carlo estimates of variance bias and coverage for the new adaptive estimator, the estimator from the prior paper, and an oracle benchmark across a range of sparsity levels, dimensions, and sample sizes. The results show that the new estimator reduces bias in the variance estimates for the selected coefficients and improves coverage of the resulting intervals in the selection diagrams, confirming that the adjustment supplies independent information beyond the selection procedure itself. revision: yes
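A Monte Carlo comparison of the kind described in this response could be sketched as follows. This is a minimal illustration in an ordinary linear model, not the paper's policy-learning setting and not its estimators: the "selected" and "full" variances below are generic stand-ins, and the function names (`weighted_lasso_cd`, `simulate`), the true coefficients, the equicorrelated design, and the penalty rate `n ** -0.75` are all assumptions made for the example.

```python
import numpy as np

def weighted_lasso_cd(X, y, lam, w, n_iter=100):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * sum_j w[j] * |b[j]|."""
    n, p = X.shape
    b = np.zeros(p)
    r = y.astype(float).copy()            # residual y - Xb (b starts at zero)
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * b[j]           # drop coordinate j from the fit
            rho = X[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam * w[j], 0.0) / col_sq[j]
            r -= X[:, j] * b[j]           # put the updated coordinate back
    return b

def simulate(n=400, reps=200, rho=0.5, sigma=1.0, seed=0):
    """Compare two plug-in variances for coefficient 0 with its Monte Carlo variance."""
    rng = np.random.default_rng(seed)
    beta = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0])   # hypothetical truth
    p = beta.size
    # Equicorrelated design, so conditioning on fewer columns changes the variance.
    cov = (1.0 - rho) * np.eye(p) + rho * np.ones((p, p))
    chol = np.linalg.cholesky(cov)
    est, var_sel, var_full, cover = [], [], [], []
    for _ in range(reps):
        X = rng.standard_normal((n, p)) @ chol.T
        y = X @ beta + sigma * rng.standard_normal(n)
        b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
        # Adaptive lasso: weights from the initial OLS fit (gamma = 1).
        b = weighted_lasso_cd(X, y, lam=n ** -0.75, w=1.0 / np.abs(b_ols))
        sel = np.flatnonzero(b)
        if 0 not in sel:                  # coefficient 0 was not selected
            continue
        k = int(np.flatnonzero(sel == 0)[0])
        XA = X[:, sel]
        s2_sel = np.sum((y - XA @ b[sel]) ** 2) / (n - sel.size)
        v_sel = s2_sel * np.linalg.inv(XA.T @ XA)[k, k]      # uses selected set
        s2_full = np.sum((y - X @ b_ols) ** 2) / (n - p)
        v_full = s2_full * np.linalg.inv(X.T @ X)[0, 0]      # ignores selection
        est.append(b[0])
        var_sel.append(v_sel)
        var_full.append(v_full)
        cover.append(abs(b[0] - beta[0]) <= 1.96 * np.sqrt(v_sel))
    return {
        "n_used": len(est),
        "empirical_var": float(np.var(est, ddof=1)),
        "mean_var_selected": float(np.mean(var_sel)),
        "mean_var_full": float(np.mean(var_full)),
        "coverage_selected": float(np.mean(cover)),
    }

if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.5f}")
```

The Monte Carlo variance of the fitted coefficient plays the role of the oracle benchmark. A full study along the lines described above would repeat this over a grid of sparsity levels, dimensions, and sample sizes.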

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No specific free parameters, axioms, or invented entities can be identified from the abstract alone. The contribution is described as a methodological extension of existing asymptotic results without introducing new postulated entities or ad-hoc assumptions visible here.

Reference graph

Works this paper leans on

246 extracted references · 246 canonical work pages · 2 internal anchors

[1] Targeted learning in data science. 2018.
[2] Direct estimation for adaptive treatment length policies: methods and application to evaluating the effect of delayed peg insertion. Biometrics, 2017.
[3] Optimal vasopressin initiation in septic shock: the OVISS reinforcement learning study. JAMA, 2025.
[4] Relative Sparsity and Optimality-Based Reward Learning With Applications to Medical Decisions and Toxicology. 2023.
[5] Individualized sepsis treatment using reinforcement learning. Nature Medicine, 2018.
[6] Regularized M-estimators with nonconvexity: Statistical and algorithmic theory for local optima. Advances in Neural Information Processing Systems.
[7] Inference for relative sparsity. arXiv preprint arXiv:2306.14297.
[8] One-step targeted maximum likelihood estimation for time-to-event outcomes. Biometrics, 2020.
[9] Testing for uniqueness of estimators. arXiv preprint arXiv:2011.14762.
[10] The physician’s experience of changing clinical practice: a struggle to unlearn. Implementation Science, 2017.
[11] Relative Sparsity for Medical Decision Problems. 2023.
[12] Statistical inference. 2002.
[13] Sparse approximate solutions to linear systems. SIAM Journal on Computing, 1995.
[14] The elements of statistical learning: data mining, inference, and prediction. 2009.
[15] Hypotension in ICU patients receiving vasopressor therapy. Scientific Reports, 2017.
[16] Valid post-selection inference in Robust Q-learning. arXiv preprint arXiv:2208.03233.
[17] Artificial intelligence systems for complex decision-making in acute care medicine: a review. Patient Safety in Surgery, 2019.
[18] Population intervention causal effects based on stochastic interventions. Biometrics, 2012.
[19] A note on data-splitting for the evaluation of significance levels. Biometrika, 1975.
[20] Markov decision processes: discrete stochastic dynamic programming. 2014.
[21] Generalizing Off-Policy Evaluation From a Causal Perspective For Sequential Decision-Making. arXiv preprint arXiv:2201.08262.
[22] A comparison of models for predicting early hospital readmissions. Journal of Biomedical Informatics, 2015.
[23] A Markovian decision process. Journal of Mathematics and Mechanics, 1957.
[24] Learning to diagnose with LSTM recurrent neural networks. arXiv preprint arXiv:1511.03677.
[25] Econometric analysis of cross section and panel data. 2010.
[26] Bayesian estimates of equation system parameters: an application of integration by Monte Carlo. Econometrica: Journal of the Econometric Society, 1978.
[27] Post-selection inference. Annual Review of Statistics and Its Application, 2022.
[28] A calibration hierarchy for risk models was defined: from utopia to empirical data. Journal of Clinical Epidemiology, 2016.
[29] Predicting good probabilities with supervised learning. Proceedings of the 22nd International Conference on Machine Learning.
[30] Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence, 2019.
[31] Batch size-invariance for policy optimization. arXiv preprint arXiv:2110.00641.
[32] MIMIC-III clinical database. PhysioNet.
[33] Ridge regression: Biased estimation for nonorthogonal problems. Technometrics, 1970.
[34] On the distribution of the adaptive LASSO estimator. Journal of Statistical Planning and Inference, 2009.
[35] Interrogating a clinical database to study treatment of hypotension in the critically ill. BMJ Open, 2012.
[36] Diminishing efficacy of combination therapy, response-heterogeneity, and treatment intolerance limit the attainability of tight risk factor control in patients with diabetes. Health Services Research, 2010.
[37] MIMIC-III, a freely accessible critical care database. Scientific Data, 2016.
[38] PhysioBank, PhysioToolkit, and PhysioNet: components of a new research resource for complex physiologic signals. Circulation, 2000.
[39] Norepinephrine. StatPearls [Internet].
[40] The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care. Nature Medicine, 2018.
[41] A scoping review of studies using observational data to optimise dynamic treatment regimens. BMC Medical Research Methodology, 2021.
[42] Estimation of regression coefficients when some regressors are not always observed. Journal of the American Statistical Association, 1994.
[43] A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association, 1952.
[44] Importance sampling in reinforcement learning with an estimated behavior policy. Machine Learning, 2021.
[45] Predicting blood pressure response to fluid bolus therapy using attention-based neural networks for clinical interpretability. arXiv preprint arXiv:1812.00699.
[46] The characteristics of very short stay ICU admissions and implications for optimizing ICU resource utilization: the Saudi experience. International Journal for Quality in Health Care, 2004.
[47] Model selection and inference: Facts and fiction. Econometric Theory, 2005.
[48] Serum bilirubin levels on ICU admission are associated with ARDS development and mortality in sepsis. Thorax, 2009.
[49] Vasopressor therapy in the intensive care unit. Seminars in Respiratory and Critical Care Medicine, 2021.
[50] Sparse estimators and the oracle property, or the return of Hodges’ estimator. Journal of Econometrics, 2008.
[51] Medical Decision Making. 2013.
[52] Narrative review of controversies involving vasopressin use in septic shock and practical considerations. Annals of Pharmacotherapy, 2020.
[53] Behavior priors for efficient reinforcement learning. arXiv preprint arXiv:2010.14274.
[54] Keep doing what worked: Behavioral modelling priors for offline reinforcement learning. arXiv preprint arXiv:2002.08396.
[55] Bayesian policy search with policy priors. Twenty-Second International Joint Conference on Artificial Intelligence.
[56] On a problem of adaptive estimation in Gaussian white noise. Theory of Probability & Its Applications, 1991.
[57] Targeted learning in data science: causal inference for complex longitudinal studies. 2018.
[58] Making Decisions. 1991.
[59] Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy. Annals of Statistics, 2016.
[60] Jonathan-Hui Blog.
[61] Learning agents for uncertain environments. Proceedings of the Eleventh Annual Conference on Computational Learning Theory.
[62] Bayesian Inverse Reinforcement Learning. IJCAI.
[63] hal9001: Scalable highly adaptive lasso regression in R. Journal of Open Source Software.
[64] Explainable reinforcement learning: A survey. International Cross-Domain Conference for Machine Learning and Knowledge Extraction, 2020.
[65] Techniques for interpretable machine learning. Communications of the ACM, 2019.
[66] Contrastive explanations for reinforcement learning in terms of expected consequences. arXiv preprint arXiv:1807.08706.
[67] Policy Optimization with Sparse Global Contrastive Explanations. arXiv preprint arXiv:2207.06269.
[68] Best subset selection via a modern optimization lens. The Annals of Statistics, 2016.
[69] Algorithms for inverse reinforcement learning. ICML.
[70] Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2005.
[71] The elaboration likelihood model of persuasion. Communication and Persuasion, 1986.
[72] Rational decision making in medicine: implications for overuse and underuse. Journal of Evaluation in Clinical Practice, 2018.
[73] Semiparametric M-estimation with non-smooth criterion functions. Annals of the Institute of Statistical Mathematics, 2020.
[74] Interpretable off-policy evaluation in reinforcement learning by highlighting influential transitions. International Conference on Machine Learning, 2020.
[75] Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions. Biometrika, 2013.
[76] Introduction to empirical processes and semiparametric inference. 2008.
[77] Estimating dynamic treatment regimes in mobile health using v-learning. Journal of the American Statistical Association.
[78] Identifying Decision Points for Safe and Interpretable Reinforcement Learning in Hypotension Treatment. arXiv preprint arXiv:2101.03309.
[79] Conservative q-learning for offline reinforcement learning. arXiv preprint arXiv:2006.04779.
[80] Maximum a posteriori policy optimisation. arXiv preprint arXiv:1806.06920.

Showing first 80 references.


Predicting blood pressure response to fluid bolus therapy using attention-based neural networks for clinical interpretability , author=. arXiv preprint arXiv:1812.00699 , year=

work page arXiv

[46] [46]

International Journal for Quality in Health Care , volume=

The characteristics of very short stay ICU admissions and implications for optimizing ICU resource utilization: the Saudi experience , author=. International Journal for Quality in Health Care , volume=. 2004 , publisher=

work page 2004

[47] [47]

Econometric Theory , volume=

Model selection and inference: Facts and fiction , author=. Econometric Theory , volume=. 2005 , publisher=

work page 2005

[48] [48]

Thorax , volume=

Serum bilirubin levels on ICU admission are associated with ARDS development and mortality in sepsis , author=. Thorax , volume=. 2009 , publisher=

work page 2009

[49] [49]

Seminars in Respiratory and Critical Care Medicine , volume=

Vasopressor therapy in the intensive care unit , author=. Seminars in Respiratory and Critical Care Medicine , volume=. 2021 , organization=

work page 2021

[50] [50]

Journal of Econometrics , volume=

Sparse estimators and the oracle property, or the return of Hodges’ estimator , author=. Journal of Econometrics , volume=. 2008 , publisher=

work page 2008

[51] [51]

2013 , publisher=

Medical Decision Making , author=. 2013 , publisher=

work page 2013

[52] [52]

Annals of Pharmacotherapy , volume=

Narrative review of controversies involving vasopressin use in septic shock and practical considerations , author=. Annals of Pharmacotherapy , volume=. 2020 , publisher=

work page 2020

[53] [53]

arXiv preprint arXiv:2010.14274 , year=

Behavior priors for efficient reinforcement learning , author=. arXiv preprint arXiv:2010.14274 , year=

work page arXiv 2010

[54] [54]

Keep doing what worked: Behavioral modelling priors for offline reinforcement learning.arXiv preprint arXiv:2002.08396,

Keep doing what worked: Behavioral modelling priors for offline reinforcement learning , author=. arXiv preprint arXiv:2002.08396 , year=

work page arXiv 2002

[55] [55]

Twenty-second international joint conference on artificial intelligence , year=

Bayesian policy search with policy priors , author=. Twenty-second international joint conference on artificial intelligence , year=

work page

[56] [56]

Theory of Probability & Its Applications , volume=

On a problem of adaptive estimation in Gaussian white noise , author=. Theory of Probability & Its Applications , volume=. 1991 , publisher=

work page 1991

[57] [57]

2018 , publisher=

Targeted learning in data science: causal inference for complex longitudinal studies , author=. 2018 , publisher=

work page 2018

[58] [58]

1991 , publisher=

Making Decisions , author=. 1991 , publisher=

work page 1991

[59] [59]

Annals of statistics , volume=

Statistical inference for the mean outcome under a possibly non-unique optimal treatment strategy , author=. Annals of statistics , volume=. 2016 , publisher=

work page 2016

[60] [60]

Jonathan-Hui Blog , author=

work page

[61] [61]

Proceedings of the eleventh annual conference on Computational learning theory , pages=

Learning agents for uncertain environments , author=. Proceedings of the eleventh annual conference on Computational learning theory , pages=

work page

[62] [62]

, author=

Bayesian Inverse Reinforcement Learning. , author=. IJCAI , volume=

work page

[63] [63]

Journal of Open Source Software , volume=

hal9001: Scalable highly adaptive lasso regression inR , author=. Journal of Open Source Software , volume=

work page

[64] [64]

International cross-domain conference for machine learning and knowledge extraction , pages=

Explainable reinforcement learning: A survey , author=. International cross-domain conference for machine learning and knowledge extraction , pages=. 2020 , organization=

work page 2020

[65] [65]

Communications of the ACM , volume=

Techniques for interpretable machine learning , author=. Communications of the ACM , volume=. 2019 , publisher=

work page 2019

[66] [66]

Contrastive Explanations for Reinforcement Learning in terms of Expected Consequences

Contrastive explanations for reinforcement learning in terms of expected consequences , author=. arXiv preprint arXiv:1807.08706 , year=

work page Pith review arXiv

[67] [67]

arXiv preprint arXiv:2207.06269 , year=

Policy Optimization with Sparse Global Contrastive Explanations , author=. arXiv preprint arXiv:2207.06269 , year=

work page arXiv

[68] [68]

The annals of statistics , volume=

Best subset selection via a modern optimization lens , author=. The annals of statistics , volume=. 2016 , publisher=

work page 2016

[69] [69]

, author=

Algorithms for inverse reinforcement learning. , author=. Icml , volume=

work page

[70] [70]

Journal of the royal statistical society: series B (statistical methodology) , volume=

Regularization and variable selection via the elastic net , author=. Journal of the royal statistical society: series B (statistical methodology) , volume=. 2005 , publisher=

work page 2005

[71] [71]

Communication and persuasion , pages=

The elaboration likelihood model of persuasion , author=. Communication and persuasion , pages=. 1986 , publisher=

work page 1986

[72] [72]

Journal of evaluation in clinical practice , volume=

Rational decision making in medicine: implications for overuse and underuse , author=. Journal of evaluation in clinical practice , volume=. 2018 , publisher=

work page 2018

[73] [73]

Annals of the Institute of Statistical Mathematics , volume=

Semiparametric M-estimation with non-smooth criterion functions , author=. Annals of the Institute of Statistical Mathematics , volume=. 2020 , publisher=

work page 2020

[74] [74]

International Conference on Machine Learning , pages=

Interpretable off-policy evaluation in reinforcement learning by highlighting influential transitions , author=. International Conference on Machine Learning , pages=. 2020 , organization=

work page 2020

[75] [75]

Biometrika , volume=

Robust estimation of optimal dynamic treatment regimes for sequential treatment decisions , author=. Biometrika , volume=. 2013 , publisher=

work page 2013

[76] [76]

, author=

Introduction to empirical processes and semiparametric inference. , author=. 2008 , publisher=

work page 2008

[77] [77]

Journal of the American Statistical Association , year=

Estimating dynamic treatment regimes in mobile health using v-learning , author=. Journal of the American Statistical Association , year=

work page

[78] [78]

arXiv preprint arXiv:2101.03309 , year=

Identifying Decision Points for Safe and Interpretable Reinforcement Learning in Hypotension Treatment , author=. arXiv preprint arXiv:2101.03309 , year=

work page arXiv

[79] [79]

Conservative Q-Learning for Offline Reinforcement Learning, August 2020

Conservative q-learning for offline reinforcement learning , author=. arXiv preprint arXiv:2006.04779 , year=

work page arXiv 2006

[80] [80]

Maximum a Posteriori Policy Optimisation

Maximum a posteriori policy optimisation , author=. arXiv preprint arXiv:1806.06920 , year=

work page Pith review arXiv