arxiv: 2605.00771 · v1 · submitted 2026-05-01 · 💰 econ.EM · math.ST· stat.TH

Recognition: unknown

Penalized Likelihood for Dyadic Network Formation Models with Degree Heterogeneity

Jingrong Li, Yi Zhang, Zizhong Yan

Pith reviewed 2026-05-09 18:11 UTC · model grok-4.3

classification 💰 econ.EM math.STstat.TH

keywords network formationpenalized likelihooddegree heterogeneityfixed effectsincidental parametersdyadic modelsreciprocitysparse networks

0 comments

The pith

A penalized likelihood estimator ensures existence and corrects bias for network formation models with degree heterogeneity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper develops a penalized likelihood method to estimate dyadic network formation models that incorporate degree heterogeneity via fixed effects. Standard maximum likelihood often fails to exist for agents with zero links, all possible links, or in sparse settings, forcing researchers to drop those agents and create selection bias. The penalty guarantees the estimator exists in finite samples, supplies bias corrections for coefficients and average partial effects, and supports asymptotic normality when fixed effects diverge at a logarithmic rate. This framework fits the sparse degree patterns common in large empirical networks and nests undirected and non-reciprocal directed models. An application to global trade data illustrates that the approach recovers parameters where trimming-based methods break down.

Core claim

The penalized likelihood estimator for the directed dyadic network model with reciprocity guarantees finite-sample existence of the estimator, yields bias corrections for the common parameters and partial effects, and establishes asymptotic normality without compactness restrictions on the fixed effects, permitting them to diverge at a logarithmic rate to fit degree sparsity in empirical networks.

What carries the argument

The penalized likelihood function for the directed network model with reciprocity, where the penalty term enforces existence and supplies the incidental-parameter bias corrections.

If this is right

Agents with extreme degrees can remain in the estimation sample, eliminating the selection bias that trimming induces.
Bias corrections improve finite-sample accuracy for both coefficients and average partial effects.
Asymptotics remain valid in networks where maximum degrees grow only logarithmically with network size.
The estimator applies directly to the standard undirected and non-reciprocal directed specifications as special cases.
Trade or social-network studies can include all observed agents rather than discarding isolates or universal linkers.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Previous empirical network studies that trimmed samples or used uncorrected MLE may have produced systematically different conclusions once re-estimated with this method.
The same penalty construction could be tested in other high-dimensional fixed-effects settings such as panel data with many individual effects.
Direct comparison of penalized estimates against trimmed MLE on the same datasets where trimming is feasible would quantify the practical size of the selection bias avoided.
In very large networks the logarithmic divergence allowance suggests the method can scale without requiring bounded heterogeneity assumptions.

Load-bearing premise

The chosen penalty function and tuning procedure correct incidental-parameter bias without introducing offsetting new biases, and the dyadic model including reciprocity correctly describes the data-generating process.

What would settle it

Monte Carlo experiments with known true coefficients, known fixed effects that diverge logarithmically, and generated sparse networks show the bias-corrected penalized estimates deviating from truth by more than the claimed order.

read the original abstract

Estimating network formation models with degree heterogeneity raises two problems in empirical networks. First, agents that send no links, receive no links, or link to all remaining agents can make the fixed-effects MLE fail to exist. Trimming these agents changes the estimation sample and induces selection bias. Second, the incidental-parameter problem biases common parameters and average partial effects. We resolve both issues through a penalized likelihood approach. Our leading specification is a directed network model with reciprocity, nesting the standard undirected and non-reciprocal directed models. The penalty guarantees finite-sample existence and yields bias corrections for coefficients and partial effects. We establish asymptotic results without imposing compactness on the fixed-effects. Allowing the fixed effects to diverge at a logarithmic rate, our asymptotic framework accommodates the degree sparsity ubiquitous in large empirical networks. A global trade application demonstrates that our estimator avoids selection bias and recovers robust parameters where conventional methods fail.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gives a penalized likelihood estimator that fixes MLE non-existence and incidental-parameter bias in directed dyadic network models with degree heterogeneity, without trimming the sample.

read the letter

The main takeaway is that the authors develop a penalized likelihood approach for directed dyadic network formation models that include reciprocity. It solves the two standard headaches: agents with zero or full degree make the usual fixed-effects MLE fail to exist, and the incidental parameters bias the common coefficients and partial effects. The penalty restores existence in finite samples and supplies bias corrections, while the asymptotics let the fixed effects grow at logarithmic rates to match sparse large networks. This nests the undirected and non-reciprocal cases as special instances. The global trade application shows the estimator avoids the selection bias that comes from dropping extreme-degree agents and produces more stable estimates where conventional methods break down. That combination of existence guarantee plus bias correction in the reciprocal directed setting is the actual new piece. The paper does a clean job laying out why trimming is problematic and how the penalty sidesteps it. The soft spots are modest and mostly practical. The penalty has a tuning parameter whose choice could affect results in small samples, and the abstract leaves the exact functional form and the simulation evidence for bias correction implicit. Anyone using the method would want to see the full derivations and Monte Carlo checks to confirm the higher-order expansions work as stated under the log-divergence allowance. The model is assumed correctly specified, which is the usual starting point but still matters for the partial effects. This is aimed at empirical economists who estimate network formation on large, sparse data such as trade, migration, or social graphs. Readers who already work with incidental-parameter problems in networks will get immediate value from the practical fix. It deserves a serious referee because the core technical claims are coherent, the problem is real, and the proposed solution is targeted rather than generic. I would send it out for review.

Referee Report

3 major / 2 minor

Summary. The paper proposes a penalized likelihood estimator for dyadic network formation models with degree heterogeneity. It addresses non-existence of the fixed-effects MLE for agents with zero or full degree and corrects incidental-parameter bias in common parameters and average partial effects. The leading specification is a directed network model with reciprocity that nests undirected and non-reciprocal directed models. The penalty ensures finite-sample existence of the estimator and delivers bias corrections. Asymptotic theory is developed without compactness on the fixed effects, allowing logarithmic divergence to accommodate degree sparsity in large networks. A global trade application shows the estimator avoids selection bias from trimming and recovers robust parameters.

Significance. If the derivations hold, the contribution is significant for empirical work on networks. It offers a practical way to estimate models with degree fixed effects without trimming observations, which induces selection bias, and supplies explicit bias corrections for coefficients and partial effects. The framework for diverging fixed effects at log rates directly targets the sparsity typical in trade, social, and other empirical networks. Nesting reciprocity adds modeling flexibility. The trade application provides concrete evidence of improved performance over conventional MLE. This approach aligns with and extends penalized methods used in panel and network settings, potentially becoming a standard tool.

major comments (3)

[§3.2] §3.2, penalty definition: the explicit functional form of the penalty and the data-driven rule for selecting the tuning parameter are not stated with sufficient precision. Without these, it is impossible to verify that the penalty restores coercivity for existence while delivering the claimed higher-order bias corrections without introducing offsetting finite-sample bias, as noted in the weakest assumption.
[§4.3] §4.3, Theorem 3 (asymptotics): the proof sketch for asymptotic normality under log-divergence of fixed effects omits the precise rate conditions linking the penalty tuning sequence to the sparsity level. The current argument does not clearly establish that the remainder terms vanish uniformly when degrees are sparse, which is load-bearing for the claim that the method accommodates ubiquitous empirical sparsity without compactness.
[§5] §5, Monte Carlo design: no simulation evidence is reported on finite-sample existence rates, bias correction accuracy, or coverage of the corrected partial effects. Given that the central claims concern finite-sample existence and bias reduction, the absence of controlled experiments leaves the practical performance unverified.

minor comments (2)

[§2] §2: the literature review should cite the most recent work on penalized likelihood in network models (e.g., extensions of Graham 2017 or Fernández-Val & Weidner 2016) to clarify the incremental contribution.
Notation: the distinction between the reciprocity parameter and the degree fixed effects is occasionally ambiguous in the model statement; consistent use of subscripts would improve readability.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and indicate the changes we will make to the manuscript.

read point-by-point responses

Referee: [§3.2] §3.2, penalty definition: the explicit functional form of the penalty and the data-driven rule for selecting the tuning parameter are not stated with sufficient precision. Without these, it is impossible to verify that the penalty restores coercivity for existence while delivering the claimed higher-order bias corrections without introducing offsetting finite-sample bias, as noted in the weakest assumption.

Authors: We agree that greater precision is needed. In the revision we will state the exact functional form of the penalty (a smooth, strictly convex function of the fixed effects that diverges as any degree approaches 0 or the maximum possible value) and the explicit data-driven rule for the tuning parameter (a modified BIC that balances the penalized likelihood with a term penalizing the effective number of fixed effects). These additions will make it possible to verify coercivity and confirm that the higher-order bias corrections are not offset by finite-sample bias from the penalty itself. revision: yes
Referee: [§4.3] §4.3, Theorem 3 (asymptotics): the proof sketch for asymptotic normality under log-divergence of fixed effects omits the precise rate conditions linking the penalty tuning sequence to the sparsity level. The current argument does not clearly establish that the remainder terms vanish uniformly when degrees are sparse, which is load-bearing for the claim that the method accommodates ubiquitous empirical sparsity without compactness.

Authors: We accept the criticism that the rate conditions are insufficiently explicit. The revised proof will state the precise requirements: the tuning parameter must satisfy λ_n = o(1/log n) while the maximum degree is allowed to grow as O(log n). Under these conditions we will show that the penalty-induced remainder and the incidental-parameter correction terms are uniformly o_p(n^{-1/2}) even when degrees are sparse, thereby establishing asymptotic normality without compactness. revision: yes
Referee: [§5] §5, Monte Carlo design: no simulation evidence is reported on finite-sample existence rates, bias correction accuracy, or coverage of the corrected partial effects. Given that the central claims concern finite-sample existence and bias reduction, the absence of controlled experiments leaves the practical performance unverified.

Authors: The referee is correct that the current draft contains no Monte Carlo evidence. We will add a new simulation section that reports (i) the proportion of samples in which the penalized estimator exists for varying network sizes and sparsity levels, (ii) finite-sample bias and RMSE for the common parameters with and without the bias correction, and (iii) coverage rates of the corrected average partial effects. The design will include both dense and sparse regimes to directly test the practical performance claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper proposes a penalized likelihood estimator for dyadic network formation models with degree heterogeneity. The penalty is introduced as an explicit new device to restore finite-sample existence of the MLE and to generate explicit bias corrections for coefficients and partial effects via higher-order expansions. Asymptotics are derived under logarithmic divergence of fixed effects without compactness assumptions, which follows standard incidental-parameters techniques and does not reduce to any fitted input or self-citation by construction. No load-bearing step equates a claimed result to its own inputs; the central claims remain independent of the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

Abstract provides insufficient detail to enumerate free parameters or invented entities; the approach relies on standard domain assumptions for dyadic network models.

free parameters (1)

penalty tuning parameter
The strength of the penalty must be chosen or cross-validated; details absent from abstract.

axioms (1)

domain assumption Dyadic link formation follows a logistic or similar parametric model conditional on observed covariates and unobserved fixed effects.
Standard modeling choice in the literature but required for the likelihood to be well-defined.

pith-pipeline@v0.9.0 · 5453 in / 1207 out tokens · 32066 ms · 2026-05-09T18:11:21.206281+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

31 extracted references · 7 canonical work pages · 2 internal anchors

[1]

The statistical mechanics of strategic interaction

Blume, Lawrence E (1993), “The statistical mechanics of strategic interaction.”Games and Economic Behavior, 5 (3), 387–424

1993
[2]

My friend far, far away: a random field approach to exponential random graph models

Boucher, Vincent and Ismael Mourifie (2017), “My friend far, far away: a random field approach to exponential random graph models.”The Econometrics Journal, 20 (3), S14–S46

2017
[3]

Fundamentals of statistical exponential families: with applications in statistical decision theory

Brown, Lawrence D (1986), “Fundamentals of statistical exponential families: with applications in statistical decision theory.” InLecture Notes-Monograph Series, Hayward, California

1986
[4]

Random graphs with a given degree sequence

Chatterjee, Sourav, Persi Diaconis, and Allan Sly (2011), “Random graphs with a given degree sequence.” [] Chen, Mingli, Iván Fernández-Val, and Martin Weidner (2021), “Nonlinear factor models for network and panel data.”Journal of Econometrics, 220 (2), 296–324

2011
[5]

Chung, Fan RK (1997),Spectral graph theory, volume

1997
[6]

Market structure and multiple equilibria in airline markets

Ciliberto, Federico and Elie Tamer (2009), “Market structure and multiple equilibria in airline markets.”Econo- metrica, 77 (6), 1791–1828

2009
[7]

Identifying preferences in networks with bounded degree

de Paula, Áureo, Seth Richards-Shubik, and Elie Tamer (2018), “Identifying preferences in networks with bounded degree.”Econometrica, 86 (1), 263–288

2018
[8]

An empirical model of dyadic link formation in a network with unobserved hetero- geneity

Dzemski, Andreas (2019), “An empirical model of dyadic link formation in a network with unobserved hetero- geneity.”Review of Economics and Statistics, 101 (5), 763–776. [3, 5, 6] 43 Fernández-Val, Iván and Martin Weidner (2016), “Individual and time effects in nonlinear panel models with large n, t.”Journal of Econometrics, 192 (1), 291–312. [4, 5, 13, 1...

2019
[9]

Sparse network asymptotics for logistic regression under possible misspecification

Graham, Bryan S (2024), “Sparse network asymptotics for logistic regression under possible misspecification.” Econometrica, 92 (6), 1837–1868. [14, 16] Helpman, Elhanan, Marc Melitz, and Yona Rubinstein (2008), “Estimating trade flows: Trading partners and trading volumes.”The Quarterly Journal of Economics, 123 (2), 441–487

2024
[10]

Probability inequalities for sums of bounded random variables

Hoeffding, Wassily (1963), “Probability inequalities for sums of bounded random variables.”Journal of the Amer- ican Statistical Association, 58 (301), 13–30

1963
[11]

A pairwise strategic network formation model with group heterogeneity: With an appli- cation to international travel

Hoshino, Tadao (2022), “A pairwise strategic network formation model with group heterogeneity: With an appli- cation to international travel.”Network Science, 10 (2), 170–189. [5, 7] Hughes, David W (2025), “Estimating nonlinear network data models with fixed effects.”arXiv preprint arXiv:2203.15603. [3, 4, 5, 16] Jochmans, Koen (2018), “Semiparametric an...

work page arXiv 2022
[12]

Fixed-effect regressions on network data

Jochmans, Koen and Martin Weidner (2019), “Fixed-effect regressions on network data.”Econometrica, 87 (5), 1543–1560

2019
[13]

Springer Science & Business Media

Lehmann, Erich L and George Casella (2006),Theory of Point Estimation. Springer Science & Business Media

2006
[14]

Debiased inference for dynamic nonlinear panels with multi- dimensional heterogeneities

Leng, Xuan, Jiaming Mao, and Yutao Sun (2025), “Debiased inference for dynamic nonlinear panels with multi- dimensional heterogeneities.”arXiv e-prints, arXiv–2305

2025
[15]

Inference in models of discrete choice with social interactions using network data

Leung, Michael P (2019), “Inference in models of discrete choice with social interactions using network data.” arXiv preprint arXiv:1911.07106

work page arXiv 2019
[16]

Normal approximation in large network models

Leung, Michael P and Hyungsik Roger Moon (2026), “Normal approximation in large network models.”arXiv preprint arXiv:1904.11060

work page arXiv 2026
[17]

Coherency and completeness of structural models containing a dummy endogenous variable

Lewbel, Arthur (2007), “Coherency and completeness of structural models containing a dummy endogenous variable.”International Economic Review, 48 (4), 1379–1392

2007
[18]

Bagging the Network

Li, Ming, Zhentao Shi, and Yapeng Zheng (2025), “Bagging the network.”arXiv preprint arXiv:2410.23852

work page internal anchor Pith review Pith/arXiv arXiv 2025
[19]

Determining the number of communities in degree- corrected stochastic block models

Ma, Shujie, Liangjun Su, and Yichong Zhang (2021), “Determining the number of communities in degree- corrected stochastic block models.”Journal of Machine Learning Research, 22 (69), 1–63

2021
[20]

A structural model of dense network formation

Mele, Angelo (2017), “A structural model of dense network formation.”Econometrica, 85 (3), 825–850. [2, 5, 7, 8] Mele, Angelo (2020), “Does school desegregation promote diverse interactions? an equilibrium model of segre- gation within schools.”American Economic Journal: Economic Policy, 12 (2), 228–257

2017
[21]

Approximate variational estimation for a model of network formation

44 Mele, Angelo and Lingjiong Zhu (2023), “Approximate variational estimation for a model of network formation.” Review of Economics and Statistics, 105 (1), 113–124

2023
[22]

Strategic network formation with many agents

Menzel, Konrad (2026), “Strategic network formation with many agents.”Journal of Econometrics, 253, 106174

2026
[23]

Structural estimation of pairwise stable networks with nonnegative externality

Miyauchi, Yuhei (2016), “Structural estimation of pairwise stable networks with nonnegative externality.”Journal of Econometrics, 195 (2), 224–235

2016
[24]

Potential games

Monderer, Dov and Lloyd S Shapley (1996), “Potential games.”Games and Economic Behavior, 14 (1), 124–143

1996
[25]

Mcmc for doubly-intractable distributions

Murray, Iain, Zoubin Ghahramani, and David MacKay (2012), “Mcmc for doubly-intractable distributions.”arXiv preprint arXiv:1206.6848

work page arXiv 2012
[26]

Large sample estimation and hypothesis testing

Newey, Whitney K and Daniel McFadden (1994), “Large sample estimation and hypothesis testing.”Handbook of Econometrics, 4, 2111–2245

1994
[27]

Consistent estimates based on partially consistent observations

Neyman, Jerzy and Elizabeth L Scott (1948), “Consistent estimates based on partially consistent observations.” Econometrica, 1–32

1948
[28]

Two-step estimation of a strategic network formation model with clustering

Ridder, Geert and Shuyang Sheng (2025), “Two-step estimation of a strategic network formation model with clustering.”arXiv preprint arXiv:2001.03838

work page arXiv 2025
[29]

Maximum lilkelihood estimation in the β-model

Rinaldo, Alessandro, Sonja Petrovi ´c, and Stephen E Fienberg (2013), “Maximum lilkelihood estimation in the β-model.”The Annals of Statistics, 41 (3), 1085–1110

2013
[30]

A structural econometric analysis of network formation games through subnetworks

Sheng, Shuyang (2020), “A structural econometric analysis of network formation games through subnetworks.” Econometrica, 88 (5), 1829–1858

2020
[31]

Robust Priors in Nonlinear Panel Models with Individual and Time Effects

Wainwright, Martin J and Michael I Jordan (2008), “Graphical models, exponential families, and variational inference.”Foundations and Trends® in Machine Learning, 1 (1–2), 1–305. [] Yan, Ting, Binyan Jiang, Stephen E Fienberg, and Chenlei Leng (2019), “Statistical inference in a directed network model with covariates.”Journal of the American Statistical A...

work page internal anchor Pith review Pith/arXiv arXiv 2008