Closed-form estimation and inference for panels with attrition and refreshment samples
Pith reviewed 2026-05-23 19:10 UTC · model grok-4.3
The pith
An alternative identifying assumption permits closed-form consistent estimation for panels with attrition via empirical CDF transformation using refreshment samples.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the proposed alternative identifying assumption, the estimator obtained by a transformation of the empirical cumulative distribution function is consistent and asymptotically normal and requires neither tuning parameters nor optimization in the first step.
What carries the argument
The transformation of the empirical cumulative distribution function, justified by the alternative identifying assumption that restores identification while permitting nontrivial attrition.
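The building block named above, the empirical cumulative distribution function, can be sketched in a few lines. This is only an illustration of the empirical CDF itself: the paper's specific transformation is not reproduced here, and the function name `ecdf` and the `incomes` data are hypothetical.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return F_hat, where F_hat(x) is the share of observations <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    if n == 0:
        raise ValueError("sample must be non-empty")

    def f_hat(x):
        # bisect_right gives the number of sorted values that are <= x.
        return bisect_right(ordered, x) / n

    return f_hat


# Hypothetical income observations, for illustration only.
incomes = [31.0, 45.5, 28.0, 52.0, 45.5]
f_hat = ecdf(incomes)
print(f_hat(45.5))
```

Because the function is a sorted lookup with no fitted quantities, evaluating it involves no tuning parameters and no optimization, which is the property the closed-form claim rests on.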
If this is right
- The estimator is consistent and asymptotically normal under the maintained assumption.
- Estimation avoids tuning parameters and numerical optimization in the initial step.
- Finite-sample performance is reliable according to the reported Monte Carlo experiments.
- The procedure can be applied directly to empirical panel data such as income observations from refreshment-augmented surveys.
Where Pith is reading between the lines
- The closed-form nature may reduce barriers to routine use of refreshment samples in applied longitudinal analysis.
- Because the first step is non-iterative, the method could be combined with standard second-step estimators for additional parameters without compounding computational cost.
- The same CDF transformation idea might be examined for related missing-data problems such as item nonresponse in cross-sections.
Load-bearing premise
The alternative identifying assumption restores full identification while still allowing nontrivial attrition mechanisms.
What would settle it
Apply the estimator to a generated panel dataset in which the alternative identifying assumption is deliberately violated and check whether the estimates fail to converge to the true parameters.
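A testbed for such a check might look like the sketch below. It generates a two-period panel in which dropout depends on the unobserved second-period outcome, plus an independent refreshment sample. The data-generating process, the function name `simulate_panel`, and all parameter values are assumptions made for illustration; the paper's estimator and its identifying assumption are not implemented here, so the sketch only shows how naive complete-case estimates drift while the refreshment sample recovers the second-period marginal.

```python
import math
import random
import statistics


def simulate_panel(n, n_refresh, seed=0):
    """Simulate period-2 outcomes for panel stayers and a refreshment sample."""
    rng = random.Random(seed)
    noise_sd = math.sqrt(0.75)  # chosen so that Var(y2) = 0.25 + 0.75 = 1

    stayers_y2 = []
    for _ in range(n):
        y1 = rng.gauss(0.0, 1.0)
        y2 = 0.5 * y1 + rng.gauss(0.0, noise_sd)
        # Nonignorable attrition: units with a high y2 are more likely to drop out.
        p_stay = 1.0 / (1.0 + math.exp(y2))
        if rng.random() < p_stay:
            stayers_y2.append(y2)

    # The refreshment sample is a fresh draw from the same period-2 population.
    refreshment_y2 = [
        0.5 * rng.gauss(0.0, 1.0) + rng.gauss(0.0, noise_sd)
        for _ in range(n_refresh)
    ]
    return stayers_y2, refreshment_y2


stayers, refresh = simulate_panel(n=20_000, n_refresh=20_000)
complete_case_mean = statistics.fmean(stayers)
refreshment_mean = statistics.fmean(refresh)
print("complete-case mean of y2:", complete_case_mean)
print("refreshment mean of y2:  ", refreshment_mean)
```

In this design the true period-2 mean is zero, so the gap between the two printed means is the attrition bias that any candidate estimator would be asked to remove.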
Original abstract
It has long been established that, if a panel dataset suffers from attrition, auxiliary (refreshment) sampling restores full identification under additional assumptions that still allow for nontrivial attrition mechanisms. Such identification results rely on implausible assumptions about the attrition process or lead to theoretically and computationally challenging estimation procedures. We propose an alternative identifying assumption that, despite its nonparametric nature, suggests a simple estimation algorithm based on a transformation of the empirical cumulative distribution function of the data. This estimation procedure requires neither tuning parameters nor optimization in the first step, i.e., it has a closed form. We prove that our estimator is consistent and asymptotically normal and demonstrate its good performance in simulations. We provide an empirical illustration with income data from the Understanding America Study.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an alternative nonparametric identifying assumption for panel data subject to attrition when refreshment samples are available. This assumption restores identification while permitting nontrivial attrition and yields a closed-form estimator obtained by a direct transformation of the empirical CDF; the authors establish consistency and asymptotic normality, report simulation evidence of good finite-sample performance, and illustrate the method with income data from the Understanding America Study.
Significance. If the identifying assumption is maintained and the consistency proof holds, the closed-form estimator constitutes a computationally attractive alternative to existing procedures that require optimization or tuning parameters. The absence of first-step optimization and tuning parameters, together with the explicit asymptotic normality result, would be a practical contribution for applied work in econometrics.
Major comments (2)
- [§3] §3, Assumption 3: the paper invokes a new nonparametric restriction on the joint distribution of (Y,T,R) that is claimed to be distinct from prior attrition literature; however, the text does not provide a formal proof that this restriction is strictly weaker than the assumptions in the cited refreshment-sample papers while still delivering point identification of the target parameters.
- [Theorem 1] Theorem 1 (consistency): the derivation relies on the empirical CDF converging uniformly to the population CDF under the new assumption, but the argument does not explicitly address whether the refreshment-sample size must grow at the same rate as the main panel or whether additional regularity conditions on the support of the outcome are required.
Minor comments (2)
- [Table 1] Table 1: the simulation design reports bias and RMSE but does not include coverage probabilities for the asymptotic confidence intervals whose validity is claimed in Theorem 2.
- [§5] The empirical illustration in §5 would benefit from a brief comparison of point estimates and standard errors obtained under the new assumption versus a standard complete-case analysis.
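The coverage check requested in the Table 1 comment is a standard Monte Carlo exercise. The sketch below shows its shape for a plain sample mean with a normal-approximation interval; it is a stand-in, not the paper's estimator, and the function name `coverage`, the sample sizes, and the Gaussian design are all assumptions chosen for illustration.

```python
import math
import random
import statistics


def coverage(n_obs, n_reps, true_mean=0.0, z=1.96, seed=0):
    """Share of replications whose nominal 95% interval contains true_mean."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_reps):
        draw = [rng.gauss(true_mean, 1.0) for _ in range(n_obs)]
        estimate = statistics.fmean(draw)
        std_err = statistics.stdev(draw) / math.sqrt(n_obs)
        # Count the replication if the interval covers the true parameter.
        if estimate - z * std_err <= true_mean <= estimate + z * std_err:
            hits += 1
    return hits / n_reps


cov = coverage(n_obs=200, n_reps=2000)
print("empirical coverage:", cov)
```

Reporting this number next to bias and RMSE would show directly whether the asymptotic intervals behave as claimed at the simulated sample sizes.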
Simulated Author's Rebuttal
We thank the referee for the insightful comments, which have helped us improve the clarity of our manuscript. We address each major comment below and will revise the paper accordingly.
Point-by-point responses
- Referee: §3, Assumption 3: the paper invokes a new nonparametric restriction on the joint distribution of (Y,T,R) that is claimed to be distinct from prior attrition literature; however, the text does not provide a formal proof that this restriction is strictly weaker than the assumptions in the cited refreshment-sample papers while still delivering point identification of the target parameters.
  Authors: We agree that a more formal comparison would strengthen the paper. Assumption 3 is designed to be a distinct nonparametric condition that enables closed-form estimation while maintaining point identification. In the revised manuscript, we will include an additional proposition that formally relates Assumption 3 to the assumptions in the cited refreshment-sample papers, demonstrating that it is strictly weaker in relevant cases and still yields point identification of the parameters of interest. This will be added to Section 3.
  Revision: yes
- Referee: Theorem 1 (consistency): the derivation relies on the empirical CDF converging uniformly to the population CDF under the new assumption, but the argument does not explicitly address whether the refreshment-sample size must grow at the same rate as the main panel or whether additional regularity conditions on the support of the outcome are required.
  Authors: Thank you for pointing this out. The proof of Theorem 1 relies on the Glivenko-Cantelli theorem for uniform convergence, which holds under standard conditions. To make the asymptotic framework explicit, we will revise the statement of Theorem 1 to specify that the refreshment sample size grows at the same rate as the main panel (i.e., n_r / n -> c > 0) and add a regularity condition requiring that the support of Y be compact or that Y satisfy appropriate moment conditions. These clarifications will be incorporated into the revised version without altering the main results.
  Revision: yes
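The uniform convergence invoked in the Theorem 1 response can be seen numerically. The sketch below computes the sup-distance between the empirical CDF and the true CDF for Uniform(0,1) draws, once for a main sample of size n and once for a refreshment sample of size n_r = c * n. The uniform design, the value c = 0.5, and the function name `sup_distance_uniform` are assumptions for illustration and do not come from the paper.

```python
import random


def sup_distance_uniform(n, rng):
    """Return sup_x |F_n(x) - x| for n independent Uniform(0,1) draws."""
    xs = sorted(rng.random() for _ in range(n))
    # F_n is a step function, so the supremum is attained at a jump point:
    # just after the jump F_n equals (i + 1) / n, just before it equals i / n.
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


rng = random.Random(0)
c = 0.5  # hypothetical limit of n_r / n
for n in (100, 1_000, 10_000):
    n_r = int(c * n)
    d_main = sup_distance_uniform(n, rng)
    d_refresh = sup_distance_uniform(n_r, rng)
    print(f"n={n:>6}  sup-dist={d_main:.4f}   n_r={n_r:>6}  sup-dist={d_refresh:.4f}")
```

When both sample sizes grow proportionally, both distances shrink together, which is why stating the rate condition explicitly matters for the joint asymptotics.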
Circularity Check
No significant circularity; derivation is self-contained under new assumption
Full rationale
The paper introduces an alternative nonparametric identifying assumption distinct from prior attrition literature. This assumption directly motivates a closed-form estimator obtained by transforming the empirical CDF, with consistency and asymptotic normality proved thereafter. No step reduces the estimator to a fitted quantity defined by the assumption itself, no self-citation chain bears the central load, and the procedure requires no tuning parameters or optimization. The structure is internally consistent without the estimator being equivalent to its inputs by construction.
Axiom & Free-Parameter Ledger
Axioms (1)
- Domain assumption: an alternative nonparametric identifying assumption that restores full identification with refreshment samples while permitting nontrivial attrition
Reference graph
Works this paper leans on
- [1] Bhattacharya, D. (2008). Inference in panel data models under attrition caused by unobservables. Journal of Econometrics, 144(2):430-446
- [2]
- [3] Davidson, R. and MacKinnon, J. G. (2000). Improving the reliability of bootstrap tests. Technical report, Queen's Economics Department Working Paper
- [4] Deng, Y., Hillygus, D. S., Reiter, J. P., Si, Y., and Zheng, S. (2013). Handling attrition in longitudinal studies: The case for refreshment samples. Statistical Science, 28(2):238-256
- [5] d’Haultfoeuille, X. (2010). A new instrumental method for dealing with endogenous selection. Journal of Econometrics, 154(1):1-15
- [6] Franguridi, G., Hahn, J., Hoonhout, P., Kapteyn, A., and Ridder, G. (2024a). Raking for panels with nonignorable attrition and refreshment. Working paper
- [7] Franguridi, G., Hahn, J., and Ridder, G. (2024b). Robust estimation and inference for panels with nonignorable attrition and refreshment. Working paper
- [8] Giacomini, R., Politis, D. N., and White, H. (2013). A warp-speed method for conducting Monte Carlo experiments involving bootstrap estimators. Econometric Theory, 29(3):567-589
- [9] Hellerstein, J. K. and Imbens, G. W. (1999). Imposing moment restrictions from auxiliary data by weighting. Review of Economics and Statistics, 81(1):1-14
- [10] Hirano, K., Imbens, G. W., Ridder, G., and Rubin, D. B. (2001). Combining panel data sets with attrition and refreshment samples. Econometrica, 69(6):1645-1659
- [11] Hoonhout, P. and Ridder, G. (2019). Nonignorable attrition in multi-period panels with refreshment samples. Journal of Business & Economic Statistics, 37(3):377-390
- [12] Lang, S. (2012). Fundamentals of differential geometry, volume 191. Springer Science & Business Media
- [13] Nevo, A. (2003). Using weights to adjust for sample selection when auxiliary information is available. Journal of Business & Economic Statistics, 21(1):43-52
- [14] Newey, W. K. and McFadden, D. (1994). Large sample estimation and hypothesis testing. Handbook of Econometrics, 4:2111-2245
- [15] Sadinle, M. and Reiter, J. P. (2019). Sequentially additive nonignorable missing data modelling using auxiliary marginal information. Biometrika, 106(4):889-911
- [16] Si, Y., Reiter, J. P., and Hillygus, D. S. (2015). Semi-parametric selection models for potentially non-ignorable attrition in panel studies with refreshment samples. Political Analysis, 23(1):92-112
- [17] Tauchen, G. (1985). Diagnostic testing and evaluation of maximum likelihood models. Journal of Econometrics, 30(1-2):415-443
- [18] Taylor, L. K., Tong, X., and Maxwell, S. E. (2020). Evaluating supplemental samples in longitudinal research: Replacement and refreshment approaches. Multivariate Behavioral Research, 55(2):277-299
- [19] van der Vaart, A. and Wellner, J. (2023). Weak Convergence and Empirical Processes: With Applications to Statistics. Springer
- [20] Villani, C. et al. (2009). Optimal transport: old and new, volume 338. Springer
- [21] Watson, N. and Lynn, P. (2021). Refreshment sampling for longitudinal surveys. Advances in Longitudinal Survey Methodology, pages 1-25
- [22] White, H. (2000). A reality check for data snooping. Econometrica, 68(5):1097-1126
- [23] Young, W. (1917). On multiple integration by parts and the second theorem of the mean. Proceedings of the London Mathematical Society, 2(1):273-293