Jackknife Instrumental Variable Inference
Pith reviewed 2026-05-10 09:25 UTC · model grok-4.3
The pith
Jackknife-based tests for IV models with many weak instruments reach chi-square limits after a modification to the objective function.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors introduce jackknife instrumental-variable test statistics whose limiting distributions under the null, in the presence of many potentially weak instruments and heteroskedasticity, are combinations of chi-square random variables; by modifying the objective function these limits become ordinary chi-square distributions, delivering usable critical values for inference.
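To make the distinction concrete: a "combination of chi-squares" is usually a weighted sum of independent chi-square variables, whose quantiles depend on weights that must be estimated, whereas an ordinary chi-square limit has tabulated critical values. The display below is only an illustrative sketch. The symbols $T$ (original statistic), $\tilde{T}$ (statistic from the modified objective function), $\lambda_j$ (weights), and $q$ (number of restrictions) are notation assumed here, not taken from the paper.

```latex
T \xrightarrow{d} \sum_{j=1}^{q} \lambda_j \, \chi^2_{1,j},
\qquad
\tilde{T} \xrightarrow{d} \chi^2_{q},
\qquad
\text{where the } \chi^2_{1,j} \text{ are independent and } \lambda_j \ge 0 .
```

In this notation the two limits coincide when every $\lambda_j = 1$, which is one way to read what the modification is meant to achieve.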
What carries the argument
Jackknife instrumental-variable test statistics obtained by deleting one observation at a time and adjusting the objective function to produce chi-square limits.
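One common way the leave-one-out idea appears in the many-instruments literature is as a quadratic form in the instrument projection matrix with the own-observation (i = j) terms dropped. The sketch below assumes that form. The function name `jackknife_quadratic_form` and the simulated inputs are hypothetical, and the paper's exact statistic and its modified objective function are not reproduced here.

```python
import numpy as np

def jackknife_quadratic_form(e, Z):
    """Return sum over i != j of P_ij * e_i * e_j, with P = Z (Z'Z)^{-1} Z'.

    Assumed illustrative form; not the paper's statistic.
    """
    # Projection matrix onto the column space of the instruments.
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    full = e @ P @ e                   # quadratic form including i == j terms
    own = np.sum(np.diag(P) * e**2)    # contribution of the i == j terms alone
    return full - own

# Hypothetical inputs: n observations, k instruments, residuals e under the null.
rng = np.random.default_rng(0)
n, k = 200, 20
Z = rng.normal(size=(n, k))
e = rng.normal(size=n)
q = jackknife_quadratic_form(e, Z)
print(q)
```

Dropping the diagonal removes the terms in e_i squared, whose expectation would otherwise grow with the number of instruments under heteroskedasticity.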
Load-bearing premise
The regularity conditions that deliver the stated limiting distributions for the jackknife statistics hold when the number of instruments grows with the sample size and heteroskedasticity is present.
What would settle it
A simulation experiment would falsify the claimed size control if, under the null, the empirical rejection rate of the proposed tests deviated substantially from the nominal level when the number of instruments is large and heteroskedasticity follows a pattern allowed by the maintained assumptions.
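A size check of this kind can be sketched as a small Monte Carlo. The statistic below is a stand-in, not the paper's: it is an assumed standardized leave-own-out quadratic form, taken to be approximately standard normal under the null, so the one-sided critical value 1.645 replaces the paper's chi-square critical values. The design (n = 200, k = 40, error variance depending on the first instrument) and all function names are hypothetical.

```python
import numpy as np

def jackknife_ar(e, P_off, k):
    """Standardized leave-own-out quadratic form (assumed form, not the paper's)."""
    num = e @ P_off @ e / np.sqrt(k)
    # Variance estimate built from squared off-diagonal projection weights.
    var = 2.0 / k * ((e**2) @ (P_off**2) @ (e**2))
    return num / np.sqrt(var)

def null_rejection_rate(n=200, k=40, reps=500, crit=1.645, seed=0):
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, k))
    P = Z @ np.linalg.solve(Z.T @ Z, Z.T)
    P_off = P - np.diag(np.diag(P))    # drop own-observation terms
    rejections = 0
    for _ in range(reps):
        # Under H0 the residual y - x * beta0 equals the structural error u,
        # so only u needs to be simulated to study size for this statistic.
        scale = np.sqrt(0.5 + 0.5 * Z[:, 0] ** 2)   # heteroskedasticity in Z
        u = scale * rng.normal(size=n)
        rejections += int(jackknife_ar(u, P_off, k) > crit)
    return rejections / reps

rate = null_rejection_rate()
print(f"empirical rejection rate at nominal 5%: {rate:.3f}")
```

A rejection rate far from 0.05 across a grid of k/n ratios and heteroskedasticity patterns would be the kind of evidence described above; a single design such as this one is only a starting point.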
Original abstract
This paper introduces a class of jackknife-based test statistics for linear regression models with endogeneity and heteroskedasticity in the presence of many potentially weak instrumental variables. The tests may be used when considering hypotheses on the full parameter vector or hypotheses defined as linear restrictions. We show that in the limit and under the null the proposed statistics are distributed as a combination of chi-squares, but by modifying the objective function we derive more familiar chi-square limits. An extensive simulation study shows the competitive finite-sample properties of the proposed tests, in particular against Anderson-Rubin-type statistics. Finally, we provide an empirical illustration that applies the proposed tests to study the effect of alcohol consumption on body mass index, using genetic variants from the UK Biobank as instrumental variables.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. This paper introduces a class of jackknife-based test statistics for linear IV regression models with endogeneity and heteroskedasticity, suitable for hypotheses on the full parameter vector or on linear restrictions. Under the null, and allowing for many potentially weak instruments, the statistics converge in distribution to a combination of chi-squares; a modification to the objective function yields standard chi-square limits. An extensive simulation study demonstrates competitive finite-sample size and power relative to Anderson-Rubin-type tests, and the methods are illustrated empirically by estimating the effect of alcohol consumption on BMI using genetic variants as instruments in UK Biobank data.
Significance. If the asymptotic derivations hold, the paper makes a useful contribution to the literature on robust inference in IV settings with many weak instruments and heteroskedasticity by extending jackknife techniques to deliver tests with tractable limiting distributions and good finite-sample behavior. The simulation comparisons to Anderson-Rubin benchmarks and the genetic-IV application provide practical value for empirical researchers facing similar identification challenges. The approach's emphasis on handling heteroskedasticity alongside many instruments is a strength, as these features are prevalent in modern microeconometric applications.
major comments (2)
- [Asymptotic theory] Asymptotic theory section: the claim that modifying the objective function produces standard chi-square limits under the null requires an explicit verification that this modification preserves the test's validity (i.e., does not alter the null distribution or introduce inconsistencies under local alternatives); the current description leaves unclear whether the modification is data-dependent or fixed in a way that affects the jackknife's robustness properties.
- [Simulation study] Simulation study: the data-generating processes, instrument counts, and heteroskedasticity specifications used to demonstrate competitive performance against Anderson-Rubin statistics should be detailed with respect to the many-weak-instrument regime (e.g., number of instruments relative to sample size and concentration parameters); without these, it is difficult to confirm that the reported size control and power gains generalize beyond the specific designs examined.
minor comments (2)
- [Abstract and Introduction] The abstract and introduction could more precisely define the jackknife statistics (e.g., the exact form of the leave-one-out estimator or weighting) to aid readers before the technical sections.
- [Empirical illustration] In the empirical illustration, reporting the effective number of instruments, first-stage strength diagnostics, and any data exclusion rules would strengthen the connection between the theoretical results and the UK Biobank application.
Simulated Author's Rebuttal
We thank the referee for the positive assessment, constructive comments, and recommendation for minor revision. We address each major comment below.
- Referee: [Asymptotic theory] Asymptotic theory section: the claim that modifying the objective function produces standard chi-square limits under the null requires an explicit verification that this modification preserves the test's validity (i.e., does not alter the null distribution or introduce inconsistencies under local alternatives); the current description leaves unclear whether the modification is data-dependent or fixed in a way that affects the jackknife's robustness properties.
  Authors: We agree that additional explicit verification would strengthen the presentation. The modification is a fixed, non-data-dependent adjustment to the objective function. In the revised manuscript we will insert a new proposition (or remark) in the asymptotic theory section that formally verifies the modified statistic retains the same null limiting distribution (standard chi-square) and that the adjustment is of lower order under local alternatives, thereby preserving consistency and power properties. Because the adjustment does not depend on the data or on the heteroskedasticity structure, it leaves the jackknife's robustness properties unchanged.
  Revision: yes
- Referee: [Simulation study] Simulation study: the data-generating processes, instrument counts, and heteroskedasticity specifications used to demonstrate competitive performance against Anderson-Rubin statistics should be detailed with respect to the many-weak-instrument regime (e.g., number of instruments relative to sample size and concentration parameters); without these, it is difficult to confirm that the reported size control and power gains generalize beyond the specific designs examined.
  Authors: We appreciate the referee's request for greater transparency on the many-weak-instrument aspects of the designs. In the revised version we will expand the simulation section with explicit statements of the instrument-to-sample-size ratios (k/n), the concentration-parameter values, and the precise heteroskedasticity specifications used in each Monte Carlo experiment. These additions will be presented in a new table or subsection so that readers can directly assess how the reported size and power results relate to the many-weak-instrument regime.
  Revision: yes
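The design quantities named in this exchange can be reported with a few lines per Monte Carlo design. The sketch below assumes a single endogenous regressor with first stage x = Z π + v and the textbook homoskedastic concentration parameter μ² = π'Z'Zπ / σ_v²; the function name `design_summary`, the coefficient values, and the extra ratio μ²/√k are illustrative choices, not details from the paper.

```python
import numpy as np

def design_summary(Z, pi, sigma_v2):
    """Summarize instrument count and strength for one simulation design.

    Assumes one endogenous regressor and a first-stage error with
    constant variance sigma_v2 (a simplification under heteroskedasticity).
    """
    n, k = Z.shape
    signal = Z @ pi                       # first-stage fitted signal
    mu2 = float(signal @ signal / sigma_v2)
    return {
        "k_over_n": k / n,                # instrument-to-sample-size ratio
        "mu2": mu2,                       # concentration parameter
        "mu2_over_sqrt_k": mu2 / np.sqrt(k),
    }

# Hypothetical design: 40 instruments, 200 observations, small coefficients.
rng = np.random.default_rng(1)
Z = rng.normal(size=(200, 40))
pi = np.full(40, 0.05)
print(design_summary(Z, pi, sigma_v2=1.0))
```

Reporting μ² relative to √k alongside k/n is one conventional way to indicate how weak the instruments are in a many-instrument design.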
Circularity Check
No significant circularity in derivation chain
The paper derives limiting distributions for its jackknife IV statistics (combination of chi-squares under the null, or standard chi-square after objective-function modification) from standard many-weak-instrument asymptotics under heteroskedasticity. These steps rely on external regularity conditions and conventional IV limit theory rather than self-definitions, fitted parameters renamed as predictions, or load-bearing self-citations. The simulation comparisons to Anderson-Rubin statistics supply independent finite-sample evidence, and the overall argument remains self-contained against external benchmarks without reducing any central claim to its own inputs by construction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Standard linear IV assumptions including instrument exogeneity, relevance, and heteroskedasticity of errors
- domain assumption: Regularity conditions for jackknife statistics to converge to chi-square limits with many weak instruments