Integrating Diagnostic Checks into Estimation
Pith reviewed 2026-05-10 06:44 UTC · model grok-4.3
The pith
Residualizing an estimator against its diagnostic check statistics removes selective reporting distortions, cuts variance when the model is correct, and controls worst-case bias under local misspecification.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The residualized estimator is formed by regressing the baseline estimator on the full vector of diagnostic check statistics and keeping the residual. When the baseline model is correctly specified, residualization leaves the estimand unchanged and reduces variance, strictly so whenever the baseline estimator is correlated with the checks. When the baseline model is locally misspecified, the same linear adjustment minimizes worst-case bias within the class of linear adjustments to the estimator. The procedure also eliminates the inferential distortions that would otherwise arise when results are reported selectively, conditional on passing the diagnostic checks.
What carries the argument
The residualized estimator obtained by projecting the baseline estimator orthogonal to the vector of diagnostic check statistics.
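The projection described above can be sketched numerically. This is a minimal illustration, not the paper's implementation: it assumes the joint covariance of the baseline estimator and two diagnostic statistics is known (in practice it would be estimated), and the names `residualize`, `sigma`, and `theta` as well as the covariance values are hypothetical.

```python
import numpy as np

def residualize(theta_hat, d, cov_d_theta, cov_dd):
    """Return theta_hat minus its linear projection on the diagnostics d."""
    # Projection coefficient: lam = Cov(D, D)^{-1} Cov(D, theta_hat)
    lam = np.linalg.solve(cov_dd, cov_d_theta)
    return theta_hat - d @ lam, lam

rng = np.random.default_rng(0)
theta = 1.0  # hypothetical true parameter

# Assumed joint covariance of (theta_hat, D1, D2); values are illustrative only.
sigma = np.array([[1.0, 0.4, 0.3],
                  [0.4, 1.0, 0.2],
                  [0.3, 0.2, 1.0]])

# Correct specification: the diagnostics have mean zero.
draws = rng.multivariate_normal([theta, 0.0, 0.0], sigma, size=100_000)
theta_hat, d = draws[:, 0], draws[:, 1:]

theta_res, lam = residualize(theta_hat, d, sigma[1:, 0], sigma[1:, 1:])

# Variance left after removing the component explained by the diagnostics.
var_theory = sigma[0, 0] - sigma[0, 1:] @ lam

print("mean baseline / residualized:", theta_hat.mean(), theta_res.mean())
print("var  baseline / residualized:", theta_hat.var(), theta_res.var())
print("theoretical residualized variance:", var_theory)
```

Under these assumptions both estimators are centered on the same parameter, while the residualized draws have the smaller variance and are uncorrelated with each diagnostic.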
If this is right
- Inference based on the residualized estimator remains valid even when researchers decide which results to report after seeing the diagnostic checks.
- When the baseline specification is correct, the residualized estimator is more precise than the original without any change in the parameter being estimated.
- Under bounded local misspecification the residualized estimator achieves the lowest possible worst-case bias among all linear adjustments to the baseline estimator.
- In the Kaur et al. (2024) RCT reanalysis, the procedure increased the magnitude of the point estimate and reduced the standard error by an amount equivalent to an approximately 10 percent increase in sample size.
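The sample-size equivalence in the last item can be unpacked with a short calculation. It assumes the usual scaling in which the standard error is proportional to $1/\sqrt{n}$; the symbols $\hat\theta$ (baseline estimator), $\hat\theta^{\mathrm{res}}$ (residualized estimator), and $R^2$ (the share of the baseline estimator's variance explained by the diagnostics) are notation introduced here for illustration, not figures reported in the paper.

$$
\frac{\operatorname{Var}(\hat\theta^{\mathrm{res}})}{\operatorname{Var}(\hat\theta)} = \frac{1}{1.10} \approx 0.909,
\qquad
\frac{\mathrm{SE}(\hat\theta^{\mathrm{res}})}{\mathrm{SE}(\hat\theta)} = \frac{1}{\sqrt{1.10}} \approx 0.953,
\qquad
R^2 = 1 - \frac{1}{1.10} \approx 0.091 .
$$

Read this way, a gain equivalent to 10 percent more observations corresponds to a standard error roughly 4.7 percent smaller, with the diagnostics explaining about 9 percent of the baseline sampling variance.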
Where Pith is reading between the lines
- The approach could be extended to settings where researchers choose among multiple candidate specifications by using the same residualization step to protect against post-selection inference.
- Because the adjustment is linear and data-driven, it may be possible to derive analytic standard errors that account for the estimation of the residualization coefficients themselves.
- In designs with many diagnostic checks the method implicitly performs a form of dimension reduction that could be useful for high-dimensional balance or validity testing.
Load-bearing premise
Misspecification must be local and bounded so that a linear adjustment to the estimator suffices to control worst-case bias, and the diagnostic statistics themselves must remain uncontaminated by the selection or reporting process they are intended to correct.
What would settle it
Simulate data under a known data-generating process, apply selective reporting that conditions on diagnostic checks passing, and check whether the residualized estimator recovers nominal coverage rates and lower mean-squared error than the unadjusted estimator.
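A minimal version of that simulation might look as follows. It is a sketch under strong assumptions: the baseline estimator and a single diagnostic statistic are drawn as exactly jointly normal with unit variances and a known correlation `rho`, the reporting rule is "report only if the diagnostic is below `cutoff` in absolute value", and the projection coefficient is treated as known. The function name and all parameter values are hypothetical.

```python
import numpy as np

def selective_reporting_sim(rho=0.6, cutoff=1.0, n_draws=200_000, seed=0):
    """Compare coverage and MSE of baseline vs residualized estimators
    among draws that pass a two-sided diagnostic check."""
    rng = np.random.default_rng(seed)
    theta = 0.0  # hypothetical true parameter
    cov = np.array([[1.0, rho], [rho, 1.0]])
    draws = rng.multivariate_normal([theta, 0.0], cov, size=n_draws)
    theta_hat, d = draws[:, 0], draws[:, 1]

    # With unit variances the projection coefficient equals rho.
    theta_res = theta_hat - rho * d

    # Unconditional standard errors a researcher would normally report.
    se_base, se_res = 1.0, np.sqrt(1.0 - rho**2)

    # Selective reporting: keep only draws whose diagnostic check passes.
    passed = np.abs(d) < cutoff
    z = 1.959963984540054  # two-sided 95% normal critical value

    return {
        "coverage_base": np.mean(np.abs(theta_hat[passed] - theta) <= z * se_base),
        "coverage_res": np.mean(np.abs(theta_res[passed] - theta) <= z * se_res),
        "mse_base": np.mean((theta_hat[passed] - theta) ** 2),
        "mse_res": np.mean((theta_res[passed] - theta) ** 2),
    }

results = selective_reporting_sim()
for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

In this stylized setup the residualized estimator is independent of the diagnostic, so conditioning on a pass should leave its coverage near the nominal 95 percent, while the baseline interval is miscalibrated among reported results. A fuller test of the paper's claim would also estimate the projection coefficient from data and vary the selection rule.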
Original abstract
Empirical researchers often use diagnostic checks to assess the plausibility of their modeling assumptions, such as testing for covariate balance in RCTs, pre-trends in event studies, or instrument validity in IV designs. While these checks are traditionally treated as external hurdles to estimation, we argue they should be integrated into the estimation process itself. In particular, we propose residualizing one's baseline estimator against the vector of diagnostic check statistics to remove the component of baseline sampling variation explained by the diagnostic checks. This residualized estimator offers researchers a "free lunch," delivering three properties simultaneously: (i) eliminating inference distortions from check-based selective reporting; (ii) reducing variance without changing the estimand when the baseline model is correctly specified; and (iii) minimizing worst-case bias under bounded local misspecification within the class of linear adjustments. We apply our method to the RCT in Kaur et al. (2024) and find that, even in a setting where all balance checks pass comfortably, residualization increases the magnitude of the baseline point estimate and reduces its standard error, equivalent to approximately a 10% increase in sample size.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes integrating diagnostic checks (e.g., balance tests in RCTs, pre-trends in event studies) into estimation by residualizing the baseline estimator against the vector of diagnostic statistics. This yields a residualized estimator claimed to simultaneously (i) eliminate inference distortions arising from check-based selective reporting, (ii) reduce variance without changing the estimand under correct specification of the baseline model, and (iii) minimize worst-case bias under bounded local misspecification within the class of linear adjustments. An application to the Kaur et al. (2024) RCT reports that residualization increases the point-estimate magnitude and reduces its standard error, equivalent to a roughly 10 percent effective-sample-size gain even though all balance checks pass.
Significance. If the three properties can be formally established, the approach would offer a practically valuable way to convert diagnostic checks from external validation steps into an integral part of estimation, potentially improving efficiency and robustness in common empirical designs. The reported numerical improvement in a real RCT illustrates possible gains even in well-specified settings.
Major comments (3)
- [Abstract] Abstract and method description: the central claim that residualization eliminates inference distortions from selective reporting (property i) assumes the diagnostic vector is fixed and exogenous to data-dependent reporting decisions. No derivation or formal argument is supplied showing that the orthogonality condition survives when researchers choose which checks to compute and report after inspecting results; this selection step is load-bearing for property (i) and must be addressed explicitly.
- [Method] Method section: the three properties are asserted without the explicit definition of the residualized estimator, the orthogonality conditions, or the steps establishing variance reduction (ii) and minimax bias (iii). For example, the claim that the adjustment is parameter-free and does not alter the estimand under correct specification requires an equation-level derivation; its absence prevents verification that the construction is not circular.
- [Application] Application section: the 10 percent effective-sample-size gain is reported for the Kaur et al. (2024) RCT, yet neither the underlying data, code, nor the precise residualization steps are provided. Without these, it is impossible to confirm that the reported improvement is robust or that it arises from the claimed mechanism rather than from the particular sample.
Minor comments (2)
- [Abstract] Clarify whether the diagnostic vector must be pre-registered or can be chosen from a fixed menu; the current verbal description leaves this ambiguous.
- [Abstract] The phrase 'free lunch' should be qualified to indicate the assumptions (local misspecification, pre-specified diagnostics) under which the three properties hold simultaneously.
Simulated Author's Rebuttal
We thank the referee for their detailed and constructive comments on our manuscript. We address each of the major comments below and outline the revisions we will make.
Point-by-point responses
- Referee: [Abstract] Abstract and method description: the central claim that residualization eliminates inference distortions from selective reporting (property i) assumes the diagnostic vector is fixed and exogenous to data-dependent reporting decisions. No derivation or formal argument is supplied showing that the orthogonality condition survives when researchers choose which checks to compute and report after inspecting results; this selection step is load-bearing for property (i) and must be addressed explicitly.
  Authors: We appreciate the referee pointing out the need for a formal treatment of data-dependent selection of diagnostic checks. The manuscript currently states the property at a conceptual level. In the revision, we will add a dedicated subsection deriving that the residualized estimator eliminates inference distortions even when the set of reported checks is chosen after data inspection. The argument will show that by residualizing against the actually reported diagnostics, the estimator is made orthogonal to the selection rule, thereby removing the bias in inference that arises from selective reporting. We will provide the necessary mathematical steps to establish this.
  Revision: yes
- Referee: [Method] Method section: the three properties are asserted without the explicit definition of the residualized estimator, the orthogonality conditions, or the steps establishing variance reduction (ii) and minimax bias (iii). For example, the claim that the adjustment is parameter-free and does not alter the estimand under correct specification requires an equation-level derivation; its absence prevents verification that the construction is not circular.
  Authors: We agree that explicit definitions and derivations are essential for rigor. The revised manuscript will begin the method section with the formal definition of the residualized estimator: let $\hat\theta$ be the baseline estimator and $D$ the vector of diagnostic statistics; the residualized estimator is $\hat\theta^{\mathrm{res}} = \hat\theta - \hat\lambda' D$, where $\hat\lambda$ is the coefficient from regressing $\hat\theta$ on $D$. We will then derive the three properties step by step. For (ii), under correct specification, $E[D] = 0$ and the adjustment is mean-zero, preserving the estimand while reducing variance by the explained component. For (iii), we will show it achieves the minimax bias among linear adjustments under local misspecification bounded by a constant. These derivations will be presented with all intermediate equations to avoid any appearance of circularity.
  Revision: yes
- Referee: [Application] Application section: the 10 percent effective-sample-size gain is reported for the Kaur et al. (2024) RCT, yet neither the underlying data, code, nor the precise residualization steps are provided. Without these, it is impossible to confirm that the reported improvement is robust or that it arises from the claimed mechanism rather than from the particular sample.
  Authors: We acknowledge this limitation in the current draft. Upon revision, we will make available a complete replication package including the code used to implement the residualization on the Kaur et al. (2024) data, the exact steps followed, and either the processed data or clear instructions for obtaining the original data. This will enable independent verification of the approximately 10% effective sample size gain and confirm that it results from the residualization procedure.
  Revision: yes
Circularity Check
No circularity: residualization properties follow from linear projection without reducing to input tautology
Full rationale
The paper defines the residualized estimator explicitly as the baseline estimator minus its linear projection onto the vector of diagnostic check statistics. The three claimed properties are presented as consequences of this construction: orthogonality to the diagnostics removes dependence on selection rules that are functions of those diagnostics; under correct specification the projection term has zero expectation so the estimand is unchanged while variance falls; and within the linear-adjustment class the projection minimizes worst-case bias under local misspecification. None of these steps substitutes a fitted parameter for a prediction, renames a known result, or relies on a self-citation whose content is itself unverified. The derivation remains self-contained once the diagnostics are treated as a fixed, pre-specified vector; any practical difficulty arising from data-dependent choice of which checks to include is a question of implementation scope rather than an internal reduction of the claimed results to their own inputs.
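The first two steps of this rationale can be written out as a short worked derivation. It is a sketch under assumptions not spelled out in the text above: the baseline estimator $\hat\theta$ and the diagnostic vector $D$ are treated as jointly (asymptotically) normal, the projection coefficient is held at its population value $\lambda^{*}$, and the symbols $\sigma_\theta^2$, $\Sigma_{\theta D}$, $\Sigma_{D\theta}$, and $\Sigma_{DD}$ are notation introduced here.

$$
\begin{aligned}
\lambda^{*} &= \Sigma_{DD}^{-1}\Sigma_{D\theta}, \qquad \hat\theta^{\mathrm{res}} = \hat\theta - \lambda^{*\prime} D, \\
\operatorname{Cov}(\hat\theta^{\mathrm{res}}, D) &= \Sigma_{\theta D} - \lambda^{*\prime}\Sigma_{DD} = \Sigma_{\theta D} - \Sigma_{\theta D}\Sigma_{DD}^{-1}\Sigma_{DD} = 0, \\
E[\hat\theta^{\mathrm{res}}] &= E[\hat\theta] - \lambda^{*\prime} E[D] = E[\hat\theta] \quad \text{when } E[D] = 0, \\
\operatorname{Var}(\hat\theta^{\mathrm{res}}) &= \sigma_\theta^2 - \Sigma_{\theta D}\Sigma_{DD}^{-1}\Sigma_{D\theta} \le \sigma_\theta^2 .
\end{aligned}
$$

Under joint normality, zero covariance implies independence, so conditioning on any event determined by $D$ alone (such as "all checks pass") leaves the distribution of $\hat\theta^{\mathrm{res}}$ unchanged. The variance inequality is strict whenever $\Sigma_{D\theta} \neq 0$ and $\Sigma_{DD} \succ 0$.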
Axiom & Free-Parameter Ledger
Axioms (2)
- Standard math: Linear projection theorems and OLS algebra hold for the estimator and the vector of diagnostic statistics.
- Domain assumption: Misspecification, if present, is local and bounded, so that worst-case bias can be minimized within the class of linear adjustments.
Reference graph
Works this paper leans on
- [1] Adusumilli, K. (2026). You’ve got to be efficient: Ambiguity, misspecification and variational preferences. arXiv preprint arXiv:2604.05327. Andrews, I., Chen, J., and Tecchio, O. (2025). The purpose of an estimator is what it does: Misspecification, estimands, and over-identification. arXiv preprint arXiv:2508.13076. Andrews, I., Gentzkow, M., and Shapiro...
- [2] Optimal inference after model selection. arXiv preprint arXiv:1410.2597, 2014.
  Springer. Bilinski, A. and Hatfield, L. A. (2026). Nothing to see here? a non-inferiority approach to parallel trends. Statistics in Medicine, 45(3-5):e70296. Bonhomme, S. and Weidner, M. (2022). Minimizing sensitivity to model misspecification. Quantitative Economics, 13(3):907–954. Borusyak, K., Jaravel, X., and Spiess, J. (2024). Revisiting event-study ...
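- [3] Next, write $\hat c_L = \hat c_S - \hat\beta_L'\hat\gamma$ and treat $\beta_L$ as fixed at the probability limit of $\hat\beta_L$ (this suffices for first-order variance comparisons). Then

$$
\begin{aligned}
\operatorname{Var}(\hat c_L) &= \operatorname{Var}(\hat c_S - \beta_L'\hat\gamma) \\
&= \sigma_S^2 - 2\beta_L'\Sigma_{\gamma c_S} + \beta_L'\Sigma_{\gamma\gamma}\beta_L \\
&= \underbrace{\left(\sigma_S^2 - \Sigma_{c_S\gamma}\Sigma_{\gamma\gamma}^{-1}\Sigma_{\gamma c_S}\right)}_{\operatorname{Var}(\hat c_R)} + (\beta_L - \beta_R)'\Sigma_{\gamma\gamma}(\beta_L - \beta_R) \\
&\ge \operatorname{Var}(\hat c_R),
\end{aligned}
$$

  where $\beta_R = \Sigma_{\gamma\gamma}^{-1}\Sigma_{\gamma c_S}$, and the last inequality uses $\Sigma_{\gamma\gamma} \succ 0$. Equality holds iff $\beta_L = \beta_R$. ■ Proof of...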