Flexible Imputation of Incomplete Network Data
Pith reviewed 2026-05-13 18:37 UTC · model grok-4.3
The pith
A nonparametric imputation combines covariate projection with local two-way fixed-effects regression to recover missing network links and deliver consistent GMM estimators.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By projecting sampled network observations onto covariates and then applying a local two-way fixed-effects regression, the method nonparametrically recovers the missing links, achieves entrywise convergence of the imputed matrix, and ensures consistency of GMM estimators constructed from the completed data without requiring parametric assumptions or low-rank restrictions.
What carries the argument
The imputation step that projects the observed network onto covariates and follows with a local two-way fixed-effects regression to recover unobserved entries while absorbing heterogeneity.
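The local two-way fixed-effects step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' estimator: the function name `local_twfe_impute`, the Gaussian kernel, and the use of raw Euclidean covariate distance as the notion of locality are all hypothetical, and the covariate-projection step is skipped. The sketch fits A[r, c] ≈ a_r + b_c by kernel-weighted least squares over observed entries and imputes the missing entry as a_i + b_j.

```python
import numpy as np

def local_twfe_impute(A, observed, Z, i, j, h):
    """Impute A[i, j] by a kernel-weighted two-way fixed-effects fit.

    Hypothetical simplification: locality is Euclidean distance between
    covariate rows of Z, and the kernel is Gaussian with bandwidth h.
    """
    n = A.shape[0]
    # Kernel weights: rows close to i and columns close to j count most.
    k_row = np.exp(-0.5 * (np.linalg.norm(Z - Z[i], axis=1) / h) ** 2)
    k_col = np.exp(-0.5 * (np.linalg.norm(Z - Z[j], axis=1) / h) ** 2)
    rows, cols = np.nonzero(observed)
    w = k_row[rows] * k_col[cols]

    # Design matrix for A[r, c] ~ a_r + b_c over the observed entries.
    m = rows.size
    X = np.zeros((m, 2 * n))
    X[np.arange(m), rows] = 1.0
    X[np.arange(m), n + cols] = 1.0

    # Weighted least squares via sqrt-weight scaling. lstsq returns the
    # minimum-norm solution, which resolves the (a + c, b - c) indeterminacy.
    sw = np.sqrt(w)
    theta, *_ = np.linalg.lstsq(X * sw[:, None], A[rows, cols] * sw, rcond=None)
    return theta[i] + theta[n + j]

# Toy check on an exactly additive matrix with one entry hidden.
rng = np.random.default_rng(0)
n = 30
Z = rng.normal(size=(n, 2))                 # illustrative covariates
a, b = rng.normal(size=n), rng.normal(size=n)
A_true = a[:, None] + b[None, :]
observed = rng.random((n, n)) < 0.7         # entrywise missingness (assumption)
observed[3, 7] = False
A_obs = np.where(observed, A_true, 0.0)
imputed = local_twfe_impute(A_obs, observed, Z, 3, 7, h=2.0)
print(imputed, A_true[3, 7])
```

The toy matrix is real-valued and globally additive, so the fit is exact there. With binary links and heterogeneity that is only locally additive, the bandwidth `h` governs the bias-variance trade-off that the paper's rate results are meant to control.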
If this is right
- The imputed matrix converges entrywise at a rate established by the paper.
- GMM estimators that use the imputed network remain consistent.
- The estimator in the linear-in-means peer-effects model attains the derived convergence rate.
- Simulations show accurate imputation and reliable performance in downstream analysis.
- The method is illustrated with an application to the microfinance network data of Banerjee et al. (2013).
Where Pith is reading between the lines
- If the same imputation logic extends to directed or weighted networks, it could broaden the set of empirical studies that can use sampled data without bias.
- The approach might be adapted to panel or dynamic network settings where missingness occurs over time.
- Combining the imputed networks with other semiparametric estimators could further relax assumptions in peer-effects research.
Load-bearing premise
The sampling mechanism and covariate structure must permit the projection-plus-local-fixed-effects combination to recover missing links without creating asymptotic bias.
What would settle it
Finding a data-generating process or simulation design where the imputed matrix produces GMM estimates that systematically differ from the estimates obtained with the true complete network would falsify the consistency result.
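A check of this kind could be set up along the following lines. This is a hedged sketch, not the paper's simulation design: the function names, the Erdős–Rényi network, the parameter values, and the links-dropped-at-random sampling scheme are all assumptions made for illustration. It estimates the peer-effect coefficient in a linear-in-means model by just-identified instrumental variables, once with the true network and once with a sampled one. An imputed network could be passed to `peer_effect_2sls` in the same way.

```python
import numpy as np

def row_normalize(A):
    """Divide each row by its degree; isolated nodes keep a zero row."""
    deg = A.sum(axis=1, keepdims=True)
    return np.divide(A, deg, out=np.zeros_like(A, dtype=float), where=deg > 0)

def simulate_outcomes(G, x, beta, gamma, delta, noise):
    """Solve y = beta*G y + gamma*x + delta*G x + noise for y."""
    n = G.shape[0]
    rhs = gamma * x + delta * (G @ x) + noise
    return np.linalg.solve(np.eye(n) - beta * G, rhs)

def peer_effect_2sls(G, y, x):
    """Just-identified IV estimate of beta, instrumenting G y with G^2 x."""
    ones = np.ones_like(x)
    X = np.column_stack([ones, G @ y, x, G @ x])        # regressors
    Z = np.column_stack([ones, G @ (G @ x), x, G @ x])  # instruments
    theta = np.linalg.solve(Z.T @ X, Z.T @ y)
    return theta[1]

rng = np.random.default_rng(1)
n, beta = 400, 0.4                          # illustrative values
upper = np.triu(rng.random((n, n)) < 0.03, k=1)
A_true = (upper | upper.T).astype(float)    # undirected random graph

# Hypothetical sampling scheme: each link is recorded with probability 0.5.
keep = np.triu(rng.random((n, n)) < 0.5, k=1)
A_sampled = A_true * (keep | keep.T)

x = rng.normal(size=n)
noise = rng.normal(scale=0.5, size=n)
y = simulate_outcomes(row_normalize(A_true), x, beta, 1.0, 0.5, noise)

print("true network:   ", peer_effect_2sls(row_normalize(A_true), y, x))
print("sampled network:", peer_effect_2sls(row_normalize(A_sampled), y, x))
```

Repeating the comparison over many draws, with an imputed network added as a third case, would show whether the imputed-network estimates track the true-network estimates or drift systematically away from them.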
Original abstract
Sampled network data are widely used in empirical research because collecting complete network information is costly. However, empirical analyses based on sampled networks may lead to biased estimators. We propose a nonparametric imputation method for sampled networks and show that empirical analyses based on imputed networks yield consistent estimates. Our approach imputes missing network links by combining a projection onto covariates with a local two-way fixed-effects regression. The method avoids parametric assumptions, does not rely on low-rank restrictions, and flexibly accommodates both observed covariates and unobserved heterogeneity. We establish entrywise convergence rates for the imputed matrix and prove the consistency of generalized method of moments (GMM) estimators based on imputed networks. We further derive the convergence rate of the corresponding estimator in the linear-in-means peer-effects model. Simulations show strong performance of our method both in terms of imputation accuracy and in downstream empirical analysis. We illustrate our method with an application to the microfinance network data of Banerjee et al. (2013).
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proposes a nonparametric imputation method for sampled/incomplete network data that combines projection onto observed covariates with a local two-way fixed-effects regression. It claims to establish entrywise convergence rates for the imputed adjacency matrix, prove consistency of GMM estimators that use the imputed networks, and derive the convergence rate for the linear-in-means peer-effects estimator. The approach avoids parametric assumptions and low-rank restrictions; performance is illustrated via simulations and an application to the Banerjee et al. (2013) microfinance network.
Significance. If the theoretical claims hold, the method supplies a flexible, assumption-light tool for correcting bias in empirical network analyses that rely on sampled data. This is relevant for peer-effects, diffusion, and other network models in economics where complete network observation is costly. The combination of covariate projection and local FE is a practical innovation, though its asymptotic properties require careful verification.
Major comments (2)
- [Theoretical results (consistency proofs)] The abstract asserts entrywise convergence rates for the imputed matrix and consistency of GMM estimators based on imputed networks, but entrywise rates alone do not automatically deliver the uniform control needed for network aggregates (sums over neighbors, quadratic forms in the adjacency matrix) that enter typical GMM moment conditions. Additional arguments establishing o_p(1) convergence of these aggregates under the local two-way FE imputation are required.
- [Assumptions and identification] The weakest assumption—that the sampling process and covariate structure permit recovery of missing links without asymptotic bias via the local two-way FE step—needs explicit conditions on bandwidth shrinkage, the correlation between missingness and unobserved heterogeneity outside covariate neighborhoods, and the locality of the FE regression. Without these, non-vanishing bias can remain in the imputed matrix and propagate into the GMM objective.
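The gap raised in the first major comment can be made concrete with a short worked bound. The notation here is an assumption introduced for illustration: $r_n$ stands for whatever entrywise rate the paper establishes, and $C$ bounds the summand $x_j$.

```latex
\[
\Bigl| \sum_{j=1}^{n} (\hat{A}_{ij} - A_{ij})\, x_j \Bigr|
\;\le\; C \sum_{j=1}^{n} \bigl| \hat{A}_{ij} - A_{ij} \bigr|
\;\le\; C\, n \max_{i,j} \bigl| \hat{A}_{ij} - A_{ij} \bigr|
\;=\; O_p(n\, r_n),
\qquad |x_j| \le C .
\]
```

Under this crude bound the unnormalized neighbor sum is $o_p(1)$ only if $r_n = o(1/n)$. With a $1/n$ normalization the bound improves to $O_p(r_n)$, but normalizing by degree in a sparse network does not give the same gain, which is why the referee asks for an argument that exploits cancellation or concentration rather than the entrywise rate alone.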
Minor comments (2)
- [Abstract] The abstract and introduction should state the precise convergence rates (e.g., the order in n and the number of observed links) rather than referring only to “entrywise convergence rates.”
- [Simulations] Simulation designs should report the exact missingness mechanism and the dimension of the covariate space to allow readers to assess how well they match the maintained assumptions.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive comments. We address the two major points below and will revise the manuscript to incorporate additional theoretical arguments and explicit assumptions as suggested.
- Referee: [Theoretical results (consistency proofs)] The abstract asserts entrywise convergence rates for the imputed matrix and consistency of GMM estimators based on imputed networks, but entrywise rates alone do not automatically deliver the uniform control needed for network aggregates (sums over neighbors, quadratic forms in the adjacency matrix) that enter typical GMM moment conditions. Additional arguments establishing o_p(1) convergence of these aggregates under the local two-way FE imputation are required.
  Authors: We agree that entrywise rates require supplementary arguments to control network aggregates in GMM moments. In the revised version we will add a dedicated lemma establishing o_p(1) convergence of neighbor sums and quadratic forms in the imputed adjacency matrix. The proof will combine the entrywise rate with network-specific concentration bounds that exploit the local two-way fixed-effects structure and the assumed sparsity of the network.
  Revision: yes
- Referee: [Assumptions and identification] The weakest assumption (that the sampling process and covariate structure permit recovery of missing links without asymptotic bias via the local two-way FE step) needs explicit conditions on bandwidth shrinkage, the correlation between missingness and unobserved heterogeneity outside covariate neighborhoods, and the locality of the FE regression. Without these, non-vanishing bias can remain in the imputed matrix and propagate into the GMM objective.
  Authors: We will strengthen the assumption section by adding explicit conditions: (i) bandwidth shrinkage rates that balance bias and variance in the local regression, (ii) conditional independence of missingness from unobserved heterogeneity given covariates within local neighborhoods, and (iii) a precise definition of locality for the fixed-effects step. With these additions, the bias of the imputed matrix vanishes asymptotically and therefore does not propagate into the GMM objective.
  Revision: yes
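The bias-variance balance invoked in condition (i) follows a familiar pattern, shown here with the textbook kernel-regression calculation. This is a generic sketch, not the paper's rate: it assumes a twice-differentiable target, a covariate of dimension $d$, and an effective sample size $n$, all of which may differ from the paper's setting.

```latex
\[
\text{bias} \asymp h^{2}, \qquad
\text{variance} \asymp \frac{1}{n h^{d}}, \qquad
\text{MSE}(h) \asymp h^{4} + \frac{1}{n h^{d}} .
\]
\[
\frac{\partial\, \text{MSE}}{\partial h} = 0
\;\Longrightarrow\;
h^{*} \asymp n^{-1/(d+4)}, \qquad
\text{MSE}(h^{*}) \asymp n^{-4/(d+4)} .
\]
```

A bandwidth shrinking more slowly than $h^{*}$ leaves the bias term dominant, which is the non-vanishing bias the referee warns about. A bandwidth shrinking faster inflates the variance of each imputed entry.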
Circularity Check
No circularity: convergence rates and GMM consistency derived from nonparametric assumptions
The paper's core claims rest on establishing entrywise convergence rates for the imputed matrix via a nonparametric combination of covariate projection and local two-way fixed-effects regression, followed by standard GMM consistency arguments under the stated sampling and covariate assumptions. No step reduces by construction to a fitted input renamed as prediction, a self-definitional equivalence, or a load-bearing self-citation chain; the derivations are self-contained asymptotic results that do not invoke prior author work to force uniqueness or smuggle ansatzes. External benchmarks (simulations and the Banerjee et al. application) are used only for illustration, not to close the theoretical loop.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Standard regularity conditions for nonparametric regression and entrywise matrix convergence rates hold.