pith. machine review for the scientific record.

arxiv: 2605.06386 · v1 · submitted 2026-05-07 · 💰 econ.EM · cs.LG · math.ST · stat.ME · stat.ML · stat.TH


Covariate Balancing and Riesz Regression Should Be Guided by the Neyman Orthogonal Score in Debiased Machine Learning

Masahiro Kato

Pith reviewed 2026-05-08 03:36 UTC · model grok-4.3

classification 💰 econ.EM · cs.LG · math.ST · stat.ME · stat.ML · stat.TH

keywords debiased machine learning · Neyman orthogonality · covariate balancing · Riesz regression · average treatment effect · treatment effect heterogeneity · double robustness

The pith

Balancing in debiased machine learning must follow the Neyman orthogonal score, not just covariates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This position paper contends that balancing functions in debiased machine learning should be chosen based on the Neyman orthogonal score instead of being limited to functions of covariates. A sympathetic reader would care because correct balancing ensures the double robustness property holds in finite samples by making the relevant regression errors orthogonal. The paper shows that covariate balancing works when the score error depends only on covariates, as in ATT estimation, but for ATE with heterogeneous effects the error includes treatment-specific parts from the full regressor X=(D,Z). Therefore, it advocates Riesz regression using basis functions of the entire X as the general method, viewing covariate balancing as the appropriate special case for covariate-only errors.

Core claim

The paper claims that in debiased machine learning, regressor balancing implemented by Riesz regression with basis functions of X should serve as the general balancing principle, because covariate balancing leaves the treatment-specific component of the score error unbalanced under treatment effect heterogeneity where the outcome regression is a function of the full regressor X=(D,Z). Covariate balancing is presented as the special case suited to targets where the score-relevant regression error is a function of covariates alone.

What carries the argument

The Neyman orthogonal score, which identifies the precise regression error components that must be balanced to achieve debiasing in DML.
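For the ATE, the orthogonal score in question takes the standard augmented (AIPW) form from the DML literature, written here in the abstract's notation with propensity score $e(Z)$ and outcome regression $\mu$:

```latex
\psi(W;\theta,\eta) \;=\; \mu(1,Z) - \mu(0,Z)
  \;+\; \alpha_0(X)\bigl(Y - \mu(D,Z)\bigr) \;-\; \theta,
\qquad
\alpha_0(X) \;=\; \frac{D}{e(Z)} - \frac{1-D}{1-e(Z)}.
```

The correction term multiplies the regression error $Y - \mu(D,Z)$ by the Riesz representer $\alpha_0(X)$, and both are functions of the full regressor $X=(D,Z)$; this is the structural fact the paper's balancing recommendation is read off from.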

If this is right

  • Balancing common functions of Z alone can leave treatment-specific errors unbalanced for ATE under heterogeneity.
  • Riesz regression on basis functions of full X=(D,Z) provides the general balancing.
  • For ATT counterfactual means, covariate balancing remains the natural finite-dimensional approximation.
  • This framework unifies different balancing approaches under the orthogonal score.
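The second bullet can be made concrete with a minimal sketch of linear Riesz regression for the ATE representer. The data-generating process, the propensity $e(Z)=0.3+0.4Z$, and the basis below are our own toy choices, not the paper's; the closed-form estimator follows the standard automatic-DML recipe of minimizing the empirical Riesz loss over a linear class.

```python
import numpy as np

def riesz_regression_ate(D, Z, basis):
    """Linear Riesz regression for the ATE representer (sketch).

    Minimizes the empirical Riesz loss E_n[a(X)^2 - 2(a(1,Z) - a(0,Z))]
    over a(X) = rho^T b(X); the closed-form solution is rho = G^{-1} M with
    G = E_n[b(X) b(X)^T] and M = E_n[b(1,Z) - b(0,Z)].
    """
    b_obs = basis(D, Z)                  # basis at the observed X = (D, Z)
    b1 = basis(np.ones_like(D), Z)       # basis at (1, Z)
    b0 = basis(np.zeros_like(D), Z)      # basis at (0, Z)
    G = b_obs.T @ b_obs / len(D)
    M = (b1 - b0).mean(axis=0)
    rho = np.linalg.solve(G, M)
    return b_obs @ rho                   # fitted representer values a_hat(X_i)

rng = np.random.default_rng(0)
n = 20000
Z = rng.uniform(size=n)
e = 0.3 + 0.4 * Z                        # assumed true propensity score
D = (rng.uniform(size=n) < e).astype(float)

# Basis of the FULL regressor X = (D, Z): arm-specific functions of Z.
def basis(D, Z):
    return np.column_stack([D, D * Z, 1 - D, (1 - D) * Z])

a_hat = riesz_regression_ate(D, Z, basis)
# With this arm-specific basis the fit should closely track the true
# representer D/e(Z) - (1-D)/(1-e(Z)), without ever inverting an
# estimated propensity score.
a_true = D / e - (1 - D) / (1 - e)
print(np.corrcoef(a_hat, a_true)[0, 1])
```

Because the basis contains $D$ and $1-D$ (which sum to the constant), the first-order conditions force the fitted weights to average exactly to zero, the population mean of the true ATE representer.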

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Practitioners estimating ATE with ML methods should consider deriving balancing weights from the full score rather than defaulting to covariate-only methods.
  • The position implies that in settings with high-dimensional or complex regressors, basis selection for Riesz regression becomes critical for practical implementation.
  • Extensions could include adapting this to other causal estimands or to non-binary treatments.

Load-bearing premise

That the score error generally contains treatment-specific components when the outcome regression depends on both treatment and covariates under heterogeneity.
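A minimal way to state this premise (a sketch in the abstract's notation, not the paper's own derivation): for binary $D$, any outcome-regression error splits uniquely into a covariate part and a treatment-specific part,

```latex
\hat{\mu}(D,Z) - \mu_0(D,Z) \;=\; g(Z) + D\,h(Z),
\qquad
h(Z) \;=\; \bigl(\hat{\mu}-\mu_0\bigr)(1,Z) - \bigl(\hat{\mu}-\mu_0\bigr)(0,Z).
```

Under treatment effect heterogeneity $h \not\equiv 0$ in general. Balancing conditions neutralize exactly those error components lying in the span of the balanced functions; a span built from functions of $Z$ alone contains $g(Z)$ but not $D\,h(Z)$, so the bias contribution $\mathbb{E}[\hat{\alpha}(X)\,D\,h(Z)]$ is left unconstrained unless $D$-interacted basis functions are balanced as well.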

What would settle it

A dataset or simulation with heterogeneous treatment effects where ATE estimates using only covariate balancing show higher bias or variance than those using full regressor balancing via Riesz regression.
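One shape such a check could take is sketched below, under our own assumptions: a heterogeneous DGP with true ATE $=2$, the outcome model deliberately omitted ($\hat{\mu}\equiv 0$) so that all debiasing must come from the weights, and a signed common-magnitude-in-$Z$ weight class as a crude stand-in for covariate-only balancing (our construction for illustration, not the paper's).

```python
import numpy as np

rng = np.random.default_rng(1)

def riesz_ate_weights(D, Z, basis):
    """Riesz regression for the ATE representer over a linear basis:
    minimize E_n[a(X)^2 - 2(a(1,Z) - a(0,Z))]; closed form rho = G^{-1} M."""
    b = basis(D, Z)
    M = (basis(np.ones_like(D), Z) - basis(np.zeros_like(D), Z)).mean(axis=0)
    rho = np.linalg.solve(b.T @ b / len(D), M)
    return b @ rho

# Full-regressor balancing: arm-specific functions of Z, i.e. functions of X.
full_X = lambda D, Z: np.column_stack([D, D * Z, 1 - D, (1 - D) * Z])
# Crude stand-in for covariate-only balancing (our assumption): weights with
# a common magnitude in Z, signed by treatment arm.
cov_only = lambda D, Z: np.column_stack([2 * D - 1, (2 * D - 1) * Z])

def one_run(n=4000):
    Z = rng.uniform(size=n)
    D = (rng.uniform(size=n) < 0.3 + 0.4 * Z).astype(float)
    Y = 1 + Z + D * (1 + 2 * Z) + rng.normal(size=n)   # heterogeneous; ATE = 2
    # Weighting-only estimators: outcome model deliberately omitted.
    return [np.mean(riesz_ate_weights(D, Z, basis) * Y)
            for basis in (full_X, cov_only)]

est = np.array([one_run() for _ in range(200)])
bias = est.mean(axis=0) - 2.0
print(f"bias, full-X basis: {bias[0]:+.3f}   covariate-only: {bias[1]:+.3f}")
```

In this toy design the true outcome regression lies in the span of the full-$X$ basis, so the first-order conditions of the Riesz fit debias the weighting estimator exactly in-sample, while the covariate-only class cannot absorb the $D$-specific error component and a visible bias survives.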

Figures

Figures reproduced from arXiv: 2605.06386 by Masahiro Kato.

Figure 1: Regularization sensitivity with squared loss.
Figure 2: Cross fitting comparison with squared loss.
Figure 3: Distribution of estimation errors in the cross fitting comparison.
Figure 4: Distribution of estimation errors in the simulation study.
Original abstract

This position paper argues that, in debiased machine learning, balancing functions should be derived from the Neyman orthogonal score, not chosen only as functions of covariates. Covariate balancing is effective when the regression error entering the score can be represented by functions of covariates alone, and it is the natural finite-dimensional approximation for targets such as ATT counterfactual means. For ATE estimation under treatment effect heterogeneity, however, the score error generally contains treatment-specific components because the outcome regression is a function of the full regressor $X=(D,Z)$. In that case, balancing common functions of $Z$ can leave the treatment-specific component unbalanced. We therefore advocate regressor balancing, implemented by Riesz regression with basis functions of $X$, as the general balancing principle for DML. The position is not that covariate balancing is invalid, but that covariate balancing should be understood as the special case that is appropriate when the score-relevant regression error is a function of covariates alone.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. This position paper argues that balancing functions in debiased machine learning should be derived from the Neyman orthogonal score rather than chosen solely as functions of covariates. Covariate balancing suffices when the score-relevant regression error depends only on covariates (a special case appropriate for targets such as ATT counterfactual means), but for ATE estimation under treatment effect heterogeneity the outcome regression μ(D, Z) introduces treatment-specific components into the score error; balancing only functions of Z can therefore leave residuals unbalanced. The paper advocates regressor balancing implemented via Riesz regression with basis functions of the full regressor X = (D, Z) as the general principle, while clarifying that covariate balancing remains valid under the stated condition.

Significance. If the distinction holds, the manuscript supplies a principled criterion for selecting balancing methods inside DML pipelines, directly tying the choice to the structure of the efficient influence function. This clarification could reduce misspecification risk in heterogeneous settings without introducing new free parameters or circular constructions, and it strengthens the link between orthogonal scores and practical balancing implementations.

major comments (1)
  1. Abstract: the central assertion that 'the score error generally contains treatment-specific components because the outcome regression is a function of the full regressor X=(D,Z)' is load-bearing for the recommendation of regressor balancing, yet the manuscript provides no explicit expansion of the EIF or a minimal analytic example showing the nonzero residual bias term that arises when only Z-functions are balanced.
minor comments (2)
  1. The position would be strengthened by a short self-contained derivation (perhaps in a new subsection) that starts from the standard Neyman score for the heterogeneous ATE and isolates the treatment-specific error component.
  2. A small Monte Carlo illustration comparing covariate-only versus full-X balancing under known heterogeneity would quantify the practical difference alluded to in the abstract and make the argument more accessible to practitioners.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the careful reading and for identifying a point where the manuscript's central claim would benefit from greater explicitness. We agree that the absence of a direct EIF expansion and minimal example weakens the presentation of the key distinction, and we will address this directly in revision.

Point-by-point responses
  1. Referee: [—] Abstract: the central assertion that 'the score error generally contains treatment-specific components because the outcome regression is a function of the full regressor X=(D,Z)' is load-bearing for the recommendation of regressor balancing, yet the manuscript provides no explicit expansion of the EIF or a minimal analytic example showing the nonzero residual bias term that arises when only Z-functions are balanced.

    Authors: We accept the observation. The manuscript derives the general recommendation from the structure of the Neyman orthogonal score and notes that the outcome regression μ(D,Z) introduces treatment-specific components into the score error for heterogeneous ATE, but it does not supply the explicit EIF expansion or a low-dimensional analytic counter-example. In the revised version we will add a short derivation of the EIF for the ATE under treatment-effect heterogeneity that isolates the nonzero residual term arising from balancing only functions of Z, together with a minimal analytic example (e.g., a two-point support design with linear conditional expectations) that quantifies the resulting bias. This material will be placed in the main text or a brief appendix and will not alter the paper's core position or length. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain

Full rationale

The paper's recommendation to guide balancing by the Neyman orthogonal score, treating covariate balancing as the special case appropriate when the score error depends only on covariates, follows directly from the established structure of the efficient influence function for the ATE under heterogeneity. The argument that μ(D,Z) introduces treatment-specific components that Z-only balancing leaves unbalanced is a logical consequence of the EIF's definition in the standard DML literature, not a self-referential reduction, a fitted parameter renamed as a prediction, or a load-bearing self-citation. The position paper applies prior theory without any internal construction that equates output to input by definition.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the standard properties of Neyman orthogonal scores in DML and the decomposition of regression errors under treatment heterogeneity.

axioms (2)
  • standard math The Neyman orthogonal score provides double robustness for DML estimators.
    This is a foundational property assumed from the debiased machine learning literature.
  • domain assumption Under treatment effect heterogeneity the outcome regression depends on the full regressor X = (D, Z).
    Invoked to argue that score errors contain treatment-specific components.

pith-pipeline@v0.9.0 · 5480 in / 1357 out tokens · 84426 ms · 2026-05-08T03:36:53.339149+00:00 · methodology

