pith. machine review for the scientific record. sign in

arxiv: 2605.00771 · v1 · submitted 2026-05-01 · 💰 econ.EM · math.ST· stat.TH

Recognition: unknown

Penalized Likelihood for Dyadic Network Formation Models with Degree Heterogeneity

Jingrong Li, Yi Zhang, Zizhong Yan

Pith reviewed 2026-05-09 18:11 UTC · model grok-4.3

classification 💰 econ.EM math.STstat.TH
keywords network formationpenalized likelihooddegree heterogeneityfixed effectsincidental parametersdyadic modelsreciprocitysparse networks
0
0 comments X

The pith

A penalized likelihood estimator ensures existence and corrects bias for network formation models with degree heterogeneity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a penalized likelihood method to estimate dyadic network formation models that incorporate degree heterogeneity via fixed effects. Standard maximum likelihood often fails to exist for agents with zero links, all possible links, or in sparse settings, forcing researchers to drop those agents and create selection bias. The penalty guarantees the estimator exists in finite samples, supplies bias corrections for coefficients and average partial effects, and supports asymptotic normality when fixed effects diverge at a logarithmic rate. This framework fits the sparse degree patterns common in large empirical networks and nests undirected and non-reciprocal directed models. An application to global trade data illustrates that the approach recovers parameters where trimming-based methods break down.

Core claim

The penalized likelihood estimator for the directed dyadic network model with reciprocity guarantees finite-sample existence of the estimator, yields bias corrections for the common parameters and partial effects, and establishes asymptotic normality without compactness restrictions on the fixed effects, permitting them to diverge at a logarithmic rate to fit degree sparsity in empirical networks.

What carries the argument

The penalized likelihood function for the directed network model with reciprocity, where the penalty term enforces existence and supplies the incidental-parameter bias corrections.

If this is right

  • Agents with extreme degrees can remain in the estimation sample, eliminating the selection bias that trimming induces.
  • Bias corrections improve finite-sample accuracy for both coefficients and average partial effects.
  • Asymptotics remain valid in networks where maximum degrees grow only logarithmically with network size.
  • The estimator applies directly to the standard undirected and non-reciprocal directed specifications as special cases.
  • Trade or social-network studies can include all observed agents rather than discarding isolates or universal linkers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Previous empirical network studies that trimmed samples or used uncorrected MLE may have produced systematically different conclusions once re-estimated with this method.
  • The same penalty construction could be tested in other high-dimensional fixed-effects settings such as panel data with many individual effects.
  • Direct comparison of penalized estimates against trimmed MLE on the same datasets where trimming is feasible would quantify the practical size of the selection bias avoided.
  • In very large networks the logarithmic divergence allowance suggests the method can scale without requiring bounded heterogeneity assumptions.

Load-bearing premise

The chosen penalty function and tuning procedure correct incidental-parameter bias without introducing offsetting new biases, and the dyadic model including reciprocity correctly describes the data-generating process.

What would settle it

Monte Carlo experiments with known true coefficients, known fixed effects that diverge logarithmically, and generated sparse networks show the bias-corrected penalized estimates deviating from truth by more than the claimed order.

read the original abstract

Estimating network formation models with degree heterogeneity raises two problems in empirical networks. First, agents that send no links, receive no links, or link to all remaining agents can make the fixed-effects MLE fail to exist. Trimming these agents changes the estimation sample and induces selection bias. Second, the incidental-parameter problem biases common parameters and average partial effects. We resolve both issues through a penalized likelihood approach. Our leading specification is a directed network model with reciprocity, nesting the standard undirected and non-reciprocal directed models. The penalty guarantees finite-sample existence and yields bias corrections for coefficients and partial effects. We establish asymptotic results without imposing compactness on the fixed-effects. Allowing the fixed effects to diverge at a logarithmic rate, our asymptotic framework accommodates the degree sparsity ubiquitous in large empirical networks. A global trade application demonstrates that our estimator avoids selection bias and recovers robust parameters where conventional methods fail.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes a penalized likelihood estimator for dyadic network formation models with degree heterogeneity. It addresses non-existence of the fixed-effects MLE for agents with zero or full degree and corrects incidental-parameter bias in common parameters and average partial effects. The leading specification is a directed network model with reciprocity that nests undirected and non-reciprocal directed models. The penalty ensures finite-sample existence of the estimator and delivers bias corrections. Asymptotic theory is developed without compactness on the fixed effects, allowing logarithmic divergence to accommodate degree sparsity in large networks. A global trade application shows the estimator avoids selection bias from trimming and recovers robust parameters.

Significance. If the derivations hold, the contribution is significant for empirical work on networks. It offers a practical way to estimate models with degree fixed effects without trimming observations, which induces selection bias, and supplies explicit bias corrections for coefficients and partial effects. The framework for diverging fixed effects at log rates directly targets the sparsity typical in trade, social, and other empirical networks. Nesting reciprocity adds modeling flexibility. The trade application provides concrete evidence of improved performance over conventional MLE. This approach aligns with and extends penalized methods used in panel and network settings, potentially becoming a standard tool.

major comments (3)
  1. [§3.2] §3.2, penalty definition: the explicit functional form of the penalty and the data-driven rule for selecting the tuning parameter are not stated with sufficient precision. Without these, it is impossible to verify that the penalty restores coercivity for existence while delivering the claimed higher-order bias corrections without introducing offsetting finite-sample bias, as noted in the weakest assumption.
  2. [§4.3] §4.3, Theorem 3 (asymptotics): the proof sketch for asymptotic normality under log-divergence of fixed effects omits the precise rate conditions linking the penalty tuning sequence to the sparsity level. The current argument does not clearly establish that the remainder terms vanish uniformly when degrees are sparse, which is load-bearing for the claim that the method accommodates ubiquitous empirical sparsity without compactness.
  3. [§5] §5, Monte Carlo design: no simulation evidence is reported on finite-sample existence rates, bias correction accuracy, or coverage of the corrected partial effects. Given that the central claims concern finite-sample existence and bias reduction, the absence of controlled experiments leaves the practical performance unverified.
minor comments (2)
  1. [§2] §2: the literature review should cite the most recent work on penalized likelihood in network models (e.g., extensions of Graham 2017 or Fernández-Val & Weidner 2016) to clarify the incremental contribution.
  2. Notation: the distinction between the reciprocity parameter and the degree fixed effects is occasionally ambiguous in the model statement; consistent use of subscripts would improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and indicate the changes we will make to the manuscript.

read point-by-point responses
  1. Referee: [§3.2] §3.2, penalty definition: the explicit functional form of the penalty and the data-driven rule for selecting the tuning parameter are not stated with sufficient precision. Without these, it is impossible to verify that the penalty restores coercivity for existence while delivering the claimed higher-order bias corrections without introducing offsetting finite-sample bias, as noted in the weakest assumption.

    Authors: We agree that greater precision is needed. In the revision we will state the exact functional form of the penalty (a smooth, strictly convex function of the fixed effects that diverges as any degree approaches 0 or the maximum possible value) and the explicit data-driven rule for the tuning parameter (a modified BIC that balances the penalized likelihood with a term penalizing the effective number of fixed effects). These additions will make it possible to verify coercivity and confirm that the higher-order bias corrections are not offset by finite-sample bias from the penalty itself. revision: yes

  2. Referee: [§4.3] §4.3, Theorem 3 (asymptotics): the proof sketch for asymptotic normality under log-divergence of fixed effects omits the precise rate conditions linking the penalty tuning sequence to the sparsity level. The current argument does not clearly establish that the remainder terms vanish uniformly when degrees are sparse, which is load-bearing for the claim that the method accommodates ubiquitous empirical sparsity without compactness.

    Authors: We accept the criticism that the rate conditions are insufficiently explicit. The revised proof will state the precise requirements: the tuning parameter must satisfy λ_n = o(1/log n) while the maximum degree is allowed to grow as O(log n). Under these conditions we will show that the penalty-induced remainder and the incidental-parameter correction terms are uniformly o_p(n^{-1/2}) even when degrees are sparse, thereby establishing asymptotic normality without compactness. revision: yes

  3. Referee: [§5] §5, Monte Carlo design: no simulation evidence is reported on finite-sample existence rates, bias correction accuracy, or coverage of the corrected partial effects. Given that the central claims concern finite-sample existence and bias reduction, the absence of controlled experiments leaves the practical performance unverified.

    Authors: The referee is correct that the current draft contains no Monte Carlo evidence. We will add a new simulation section that reports (i) the proportion of samples in which the penalized estimator exists for varying network sizes and sparsity levels, (ii) finite-sample bias and RMSE for the common parameters with and without the bias correction, and (iii) coverage rates of the corrected average partial effects. The design will include both dense and sparse regimes to directly test the practical performance claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes a penalized likelihood estimator for dyadic network formation models with degree heterogeneity. The penalty is introduced as an explicit new device to restore finite-sample existence of the MLE and to generate explicit bias corrections for coefficients and partial effects via higher-order expansions. Asymptotics are derived under logarithmic divergence of fixed effects without compactness assumptions, which follows standard incidental-parameters techniques and does not reduce to any fitted input or self-citation by construction. No load-bearing step equates a claimed result to its own inputs; the central claims remain independent of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Abstract provides insufficient detail to enumerate free parameters or invented entities; the approach relies on standard domain assumptions for dyadic network models.

free parameters (1)
  • penalty tuning parameter
    The strength of the penalty must be chosen or cross-validated; details absent from abstract.
axioms (1)
  • domain assumption Dyadic link formation follows a logistic or similar parametric model conditional on observed covariates and unobserved fixed effects.
    Standard modeling choice in the literature but required for the likelihood to be well-defined.

pith-pipeline@v0.9.0 · 5453 in / 1207 out tokens · 32066 ms · 2026-05-09T18:11:21.206281+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 7 canonical work pages · 2 internal anchors

  1. [1]

    The statistical mechanics of strategic interaction

    Blume, Lawrence E (1993), “The statistical mechanics of strategic interaction.”Games and Economic Behavior, 5 (3), 387–424

  2. [2]

    My friend far, far away: a random field approach to exponential random graph models

    Boucher, Vincent and Ismael Mourifie (2017), “My friend far, far away: a random field approach to exponential random graph models.”The Econometrics Journal, 20 (3), S14–S46

  3. [3]

    Fundamentals of statistical exponential families: with applications in statistical decision theory

    Brown, Lawrence D (1986), “Fundamentals of statistical exponential families: with applications in statistical decision theory.” InLecture Notes-Monograph Series, Hayward, California

  4. [4]

    Random graphs with a given degree sequence

    Chatterjee, Sourav, Persi Diaconis, and Allan Sly (2011), “Random graphs with a given degree sequence.” [] Chen, Mingli, Iván Fernández-Val, and Martin Weidner (2021), “Nonlinear factor models for network and panel data.”Journal of Econometrics, 220 (2), 296–324

  5. [5]

    Chung, Fan RK (1997),Spectral graph theory, volume

  6. [6]

    Market structure and multiple equilibria in airline markets

    Ciliberto, Federico and Elie Tamer (2009), “Market structure and multiple equilibria in airline markets.”Econo- metrica, 77 (6), 1791–1828

  7. [7]

    Identifying preferences in networks with bounded degree

    de Paula, Áureo, Seth Richards-Shubik, and Elie Tamer (2018), “Identifying preferences in networks with bounded degree.”Econometrica, 86 (1), 263–288

  8. [8]

    An empirical model of dyadic link formation in a network with unobserved hetero- geneity

    Dzemski, Andreas (2019), “An empirical model of dyadic link formation in a network with unobserved hetero- geneity.”Review of Economics and Statistics, 101 (5), 763–776. [3, 5, 6] 43 Fernández-Val, Iván and Martin Weidner (2016), “Individual and time effects in nonlinear panel models with large n, t.”Journal of Econometrics, 192 (1), 291–312. [4, 5, 13, 1...

  9. [9]

    Sparse network asymptotics for logistic regression under possible misspecification

    Graham, Bryan S (2024), “Sparse network asymptotics for logistic regression under possible misspecification.” Econometrica, 92 (6), 1837–1868. [14, 16] Helpman, Elhanan, Marc Melitz, and Yona Rubinstein (2008), “Estimating trade flows: Trading partners and trading volumes.”The Quarterly Journal of Economics, 123 (2), 441–487

  10. [10]

    Probability inequalities for sums of bounded random variables

    Hoeffding, Wassily (1963), “Probability inequalities for sums of bounded random variables.”Journal of the Amer- ican Statistical Association, 58 (301), 13–30

  11. [11]

    A pairwise strategic network formation model with group heterogeneity: With an appli- cation to international travel

    Hoshino, Tadao (2022), “A pairwise strategic network formation model with group heterogeneity: With an appli- cation to international travel.”Network Science, 10 (2), 170–189. [5, 7] Hughes, David W (2025), “Estimating nonlinear network data models with fixed effects.”arXiv preprint arXiv:2203.15603. [3, 4, 5, 16] Jochmans, Koen (2018), “Semiparametric an...

  12. [12]

    Fixed-effect regressions on network data

    Jochmans, Koen and Martin Weidner (2019), “Fixed-effect regressions on network data.”Econometrica, 87 (5), 1543–1560

  13. [13]

    Springer Science & Business Media

    Lehmann, Erich L and George Casella (2006),Theory of Point Estimation. Springer Science & Business Media

  14. [14]

    Debiased inference for dynamic nonlinear panels with multi- dimensional heterogeneities

    Leng, Xuan, Jiaming Mao, and Yutao Sun (2025), “Debiased inference for dynamic nonlinear panels with multi- dimensional heterogeneities.”arXiv e-prints, arXiv–2305

  15. [15]

    Inference in models of discrete choice with social interactions using network data

    Leung, Michael P (2019), “Inference in models of discrete choice with social interactions using network data.” arXiv preprint arXiv:1911.07106

  16. [16]

    Normal approximation in large network models

    Leung, Michael P and Hyungsik Roger Moon (2026), “Normal approximation in large network models.”arXiv preprint arXiv:1904.11060

  17. [17]

    Coherency and completeness of structural models containing a dummy endogenous variable

    Lewbel, Arthur (2007), “Coherency and completeness of structural models containing a dummy endogenous variable.”International Economic Review, 48 (4), 1379–1392

  18. [18]

    Bagging the Network

    Li, Ming, Zhentao Shi, and Yapeng Zheng (2025), “Bagging the network.”arXiv preprint arXiv:2410.23852

  19. [19]

    Determining the number of communities in degree- corrected stochastic block models

    Ma, Shujie, Liangjun Su, and Yichong Zhang (2021), “Determining the number of communities in degree- corrected stochastic block models.”Journal of Machine Learning Research, 22 (69), 1–63

  20. [20]

    A structural model of dense network formation

    Mele, Angelo (2017), “A structural model of dense network formation.”Econometrica, 85 (3), 825–850. [2, 5, 7, 8] Mele, Angelo (2020), “Does school desegregation promote diverse interactions? an equilibrium model of segre- gation within schools.”American Economic Journal: Economic Policy, 12 (2), 228–257

  21. [21]

    Approximate variational estimation for a model of network formation

    44 Mele, Angelo and Lingjiong Zhu (2023), “Approximate variational estimation for a model of network formation.” Review of Economics and Statistics, 105 (1), 113–124

  22. [22]

    Strategic network formation with many agents

    Menzel, Konrad (2026), “Strategic network formation with many agents.”Journal of Econometrics, 253, 106174

  23. [23]

    Structural estimation of pairwise stable networks with nonnegative externality

    Miyauchi, Yuhei (2016), “Structural estimation of pairwise stable networks with nonnegative externality.”Journal of Econometrics, 195 (2), 224–235

  24. [24]

    Potential games

    Monderer, Dov and Lloyd S Shapley (1996), “Potential games.”Games and Economic Behavior, 14 (1), 124–143

  25. [25]

    Mcmc for doubly-intractable distributions

    Murray, Iain, Zoubin Ghahramani, and David MacKay (2012), “Mcmc for doubly-intractable distributions.”arXiv preprint arXiv:1206.6848

  26. [26]

    Large sample estimation and hypothesis testing

    Newey, Whitney K and Daniel McFadden (1994), “Large sample estimation and hypothesis testing.”Handbook of Econometrics, 4, 2111–2245

  27. [27]

    Consistent estimates based on partially consistent observations

    Neyman, Jerzy and Elizabeth L Scott (1948), “Consistent estimates based on partially consistent observations.” Econometrica, 1–32

  28. [28]

    Two-step estimation of a strategic network formation model with clustering

    Ridder, Geert and Shuyang Sheng (2025), “Two-step estimation of a strategic network formation model with clustering.”arXiv preprint arXiv:2001.03838

  29. [29]

    Maximum lilkelihood estimation in the β-model

    Rinaldo, Alessandro, Sonja Petrovi ´c, and Stephen E Fienberg (2013), “Maximum lilkelihood estimation in the β-model.”The Annals of Statistics, 41 (3), 1085–1110

  30. [30]

    A structural econometric analysis of network formation games through subnetworks

    Sheng, Shuyang (2020), “A structural econometric analysis of network formation games through subnetworks.” Econometrica, 88 (5), 1829–1858

  31. [31]

    Robust Priors in Nonlinear Panel Models with Individual and Time Effects

    Wainwright, Martin J and Michael I Jordan (2008), “Graphical models, exponential families, and variational inference.”Foundations and Trends® in Machine Learning, 1 (1–2), 1–305. [] Yan, Ting, Binyan Jiang, Stephen E Fienberg, and Chenlei Leng (2019), “Statistical inference in a directed network model with covariates.”Journal of the American Statistical A...