Cross-Fitting and Fast Remainder Rates for Semiparametric Estimation
9 Pith papers cite this work. Polarity classification is still indexing.
abstract
There are many interesting and widely used estimators of a functional with finite semiparametric variance bound that depend on nonparametric estimators of nuisance functions. We use cross-fitting (i.e., sample splitting) to construct novel estimators with fast remainder rates. We give cross-fit doubly robust estimators that use separate subsamples to estimate different nuisance functions. We obtain general, precise results for regression spline estimation of average linear functionals of conditional expectations with a finite semiparametric variance bound. We show that a cross-fit doubly robust spline regression estimator of the expected conditional covariance is semiparametric efficient under minimal conditions. Cross-fit doubly robust estimators of other average linear functionals of a conditional expectation are shown to have the fastest known remainder rates for the Haar basis or under certain smoothness conditions. Surprisingly, the cross-fit plug-in estimator also has nearly the fastest known remainder rate, but its remainder converges to zero more slowly than that of the cross-fit doubly robust estimator. As specific examples we consider the expected conditional covariance, mean with randomly missing data, and a weighted average derivative.
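The idea of using separate subsamples for different nuisance functions can be illustrated for the expected conditional covariance, E[(A - E[A|X])(Y - E[Y|X])]. The following is a minimal sketch under stated assumptions, not the paper's estimator: it substitutes piecewise-constant bin means for regression splines, uses an illustrative three-fold rotation, and all function names (`fit_bin_means`, `crossfit_dr_cond_cov`) and the simulated data are hypothetical.

```python
import numpy as np


def fit_bin_means(x, t, n_bins):
    """Piecewise-constant regression of t on x in [0, 1): one mean per bin.

    Assumption: this simple basis stands in for the regression splines
    discussed in the abstract.
    """
    idx = np.minimum((x * n_bins).astype(int), n_bins - 1)
    sums = np.bincount(idx, weights=t, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    # Empty bins fall back to the overall mean of t.
    means = np.where(counts > 0, sums / np.maximum(counts, 1), t.mean())

    def predict(x_new):
        new_idx = np.minimum((x_new * n_bins).astype(int), n_bins - 1)
        return means[new_idx]

    return predict


def crossfit_dr_cond_cov(x, a, y, n_bins=10, seed=0):
    """Cross-fit doubly robust estimate of E[Cov(A, Y | X)].

    Illustrative fold allocation: in each rotation, one fold fits E[A|X],
    a second fold fits E[Y|X], and the third averages the residual products.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(x)), 3)
    estimates = []
    for k in range(3):
        eval_idx = folds[k]
        a_idx = folds[(k + 1) % 3]
        y_idx = folds[(k + 2) % 3]
        a_hat = fit_bin_means(x[a_idx], a[a_idx], n_bins)
        y_hat = fit_bin_means(x[y_idx], y[y_idx], n_bins)
        resid_prod = (a[eval_idx] - a_hat(x[eval_idx])) * (
            y[eval_idx] - y_hat(x[eval_idx])
        )
        estimates.append(resid_prod.mean())
    # Average over the three rotations so every observation is used once
    # for evaluation.
    return float(np.mean(estimates))


# Hypothetical simulated data in which Cov(A, Y | X) = 0.5 for every X.
rng = np.random.default_rng(123)
n = 6000
x = rng.uniform(0.0, 1.0, size=n)
u = rng.normal(size=n)
v = rng.normal(size=n)
a = np.sin(2 * np.pi * x) + u
y = x**2 + 0.5 * u + v

theta_hat = crossfit_dr_cond_cov(x, a, y)
print(f"estimated expected conditional covariance: {theta_hat:.3f}")
```

Because the two nuisance fits come from different subsamples, their estimation errors are independent given the evaluation fold, which is the property the abstract credits for the fast remainder rates; this sketch makes no attempt to reproduce those rates.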
verdicts
UNVERDICTED 9
citing papers explorer
- Conditional independence testing with a single realization of a multivariate nonstationary nonlinear time series
  A new framework enables conditional independence testing for single realizations of nonstationary nonlinear multivariate time series using time-varying nonlinear regression, local long-run covariance estimation, and distribution-uniform Gaussian approximation.
- Causal K-Means Clustering
  Causal k-Means Clustering applies k-means to estimated counterfactual functions via plug-in and double machine learning bias-corrected estimators to identify subgroups with heterogeneous treatment effects and achieves root-n rates.
- Sinkhorn Treatment Effects: A Causal Optimal Transport Measure
  The Sinkhorn treatment effect is a new entropic optimal transport measure of divergence between counterfactual distributions that admits first- and second-order pathwise differentiability, debiased estimators, and asymptotically valid tests for distributional treatment effects.
- In-Sample Evaluation of Subgroups Identified by Generic Machine Learning
  A conditional adaptive perturbation approach enables valid in-sample inference for machine learning-identified subgroups with nonregular boundaries via triple robustness.
- Prognostic Value of Lung Ultrasound Biomarkers for Readmission Risk in Congestive Heart Failure: A Pilot Data-Driven Analysis
  Pilot study uses pretrained video encoder features from lung ultrasound to predict 30-day CHF readmission, finding lower-lung views and temporal differences most informative with top MLP F1 of 0.80.
- Improving Variance Estimation for Covariate Adjustment with Binary Outcomes
  The IF-LOO variance estimator for covariate-adjusted treatment effects with binary outcomes provides appropriate type I error control in simulations, especially for rare events or small samples, with a closed-form implementation.
- UD-DML: Uniform Design Subsampling for Double Machine Learning over Massive Data
  UD-DML creates balanced representative subsamples via uniform design in PCA space for efficient double machine learning estimation of average treatment effects on large datasets.
- A Semi-Supervised Kernel Two-Sample Test
  A semi-supervised kernel two-sample test integrates unlabeled covariate data to achieve asymptotic normality under the null, higher power than standard kernel tests, and consistency against fixed and local alternatives.
- crossfit: A Graph-Based Cross-Fitting Engine in R
  crossfit is an R package that supplies a general-purpose cross-fitting engine driven by user-specified DAGs of nuisance models with configurable fold allocations and reproducibility features.