pith. machine review for the scientific record. sign in

arxiv: 2604.22791 · v2 · submitted 2026-04-13 · 📊 stat.CO · cs.SI· stat.OT

Recognition: unknown

R Package iglm: Regression under Interference in Connected Populations

Authors on Pith no claims yet

Pith reviewed 2026-05-10 14:58 UTC · model grok-4.3

classification 📊 stat.CO cs.SIstat.OT
keywords interferencespillover effectsregressionR packagepseudo-likelihoodconnected populationsconvex optimizationnetwork data
0
0 comments X

The pith

The iglm R package enables regression analysis of spillover effects and interference in connected populations using scalable convex optimization with theoretical guarantees.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents the iglm R package as a tool for regression when units influence each other through connections such as social networks. This setup lets users examine how predictors relate to outcomes while accounting for spillover from connected units rather than treating observations as independent. The package solves a pseudo-likelihood based convex program with Minorization-Maximization and Quasi-Newton algorithms, which scales from small to large datasets. It supplies provable statistical guarantees and supports custom model terms so users can tailor the interference structure. Demonstrations use hate speech data from the X platform and communication records among students.

Core claim

iglm implements a regression framework under interference by optimizing a pseudo-likelihood objective via convex programming, delivering both computational scalability for large connected populations and provable theoretical guarantees for inference on spillover and related effects.

What carries the argument

The pseudo-likelihood convex optimization program, which approximates the full likelihood to enable efficient fitting of interference-adjusted regression models while preserving statistical validity.

If this is right

  • Researchers gain a practical tool to quantify spillover from connected units in network data without assuming independence.
  • The same regression framework applies to both small and large populations because the optimization scales through standard convex solvers.
  • Users can extend the model by defining custom terms that capture domain-specific interference patterns.
  • Applications become feasible in areas such as social media analysis and educational networks where interference is common.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The package could serve as a base for extensions that handle time-varying connections or dynamic interference.
  • Integration with existing network visualization tools in R might allow direct inspection of how estimated spillovers align with observed connections.
  • The theoretical guarantees might support sensitivity checks that vary the strength of assumed interference to test robustness of conclusions.

Load-bearing premise

The pseudo-likelihood convex optimization program yields valid inference and the claimed theoretical guarantees under realistic interference structures in connected populations.

What would settle it

Run iglm and a full-likelihood benchmark on the same small network dataset with known interference parameters; mismatch in recovered coefficients or sub-nominal coverage of confidence intervals would indicate the approximation fails.

Figures

Figures reproduced from arXiv: 2604.22791 by Cornelius Fritz, Michael Schweinberger.

Figure 1
Figure 1. Figure 1: Hate speech on X: observed in- and outdegrees of U.S. state legislators in the [PITH_FULL_IMAGE:figures/full_fig_p007_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Hate speech on X: interactions among U.S. state legislators represented by lines [PITH_FULL_IMAGE:figures/full_fig_p008_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Solid arrows indicate the workflow of R package iglm, while dashed arrows indicate dependencies. For example, simulating data from the model helps quantify the uncertainty about estimators and assess the estimated model, while the interpretation and assessment of the model may suggest updating the specification. In addition, one may wish to simulate from a specified model, in which case estimation and unce… view at source ↗
Figure 4
Figure 4. Figure 4: A graphical representation of class iglm, which encompasses objects iglm.data, control, results, and sampler. Additional options supplied by these objects are described in the help function of R package iglm: see ?iglm. Left-hand side of formula: iglm.data The class iglm.data is described in Section 2. We focus here on the specification of the type of predictors Xi and outcomes Yi , which determines the re… view at source ↗
Figure 5
Figure 5. Figure 5: A generic MM algorithm maximizing an objective function [PITH_FULL_IMAGE:figures/full_fig_p015_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Hate speech on X: trace plots of estimates, including estimates of degree weights [PITH_FULL_IMAGE:figures/full_fig_p016_6.png] view at source ↗
Figure 7
Figure 7. Figure 7: Hate speech on X: assessing the model in terms of the numbers of edgewise shared [PITH_FULL_IMAGE:figures/full_fig_p019_7.png] view at source ↗
Figure 8
Figure 8. Figure 8: Hate speech on X: model assessment based on predictions. Plots ( [PITH_FULL_IMAGE:figures/full_fig_p024_8.png] view at source ↗
Figure 9
Figure 9. Figure 9: Copenhagen network study: The colors and sizes of the circles represent the gender [PITH_FULL_IMAGE:figures/full_fig_p029_9.png] view at source ↗
Figure 10
Figure 10. Figure 10: Copenhagen network study: model assessment in terms of the distribution of [PITH_FULL_IMAGE:figures/full_fig_p031_10.png] view at source ↗
read the original abstract

We introduce R package iglm, which implements a comprehensive framework for studying relationships among predictors and outcomes under interference. The implemented regression framework facilitates the study of spillover and other phenomena in connected populations and has important advantages over existing packages, among them scalability and provable theoretical guarantees. On the computational side, the regression framework relies on scalable methods that can be applied to small and large data sets, by solving a convex optimization program based on pseudo-likelihoods using Minorization-Maximization and Quasi-Newton algorithms. On the statistical side, the regression framework comes with provable theoretical guarantees. To increase the versatility of iglm, users can add custom-built model terms. We showcase iglm using two data sets, including hate speech on the social media platform X and communications among students.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces the R package iglm implementing a regression framework for studying relationships among predictors and outcomes under interference in connected populations. The approach relies on scalable convex optimization of pseudo-likelihoods via Minorization-Maximization and Quasi-Newton algorithms, claims provable theoretical guarantees, supports user-defined custom model terms, and is demonstrated on two datasets (hate speech on platform X and student communications).

Significance. If the claimed theoretical guarantees and valid inference under realistic interference structures hold, the package offers a useful, scalable addition to the R ecosystem for analyzing spillover and interference phenomena in network data, with computational advantages over existing tools and extensibility via custom terms.

minor comments (3)
  1. Abstract: the phrase 'provable theoretical guarantees' is stated without naming the key result (e.g., consistency or asymptotic normality of the estimator); a one-sentence summary of the main theorem would improve clarity for readers.
  2. Section describing custom model terms: the text explains that users can add custom-built terms but provides no concrete code snippet or vignette reference showing the required interface; an explicit example would aid reproducibility and adoption.
  3. Application sections (hate speech and student data): the reported results would benefit from explicit tabulation of the estimated interference parameters and their standard errors to allow direct comparison with non-interference baselines.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary and recommendation of minor revision. We are encouraged by the recognition of iglm's scalability via convex pseudo-likelihood optimization, extensibility through custom terms, and potential utility for spillover analysis in network data.

Circularity Check

0 steps flagged

No significant circularity in derivation chain

full rationale

The paper presents a regression framework for interference in connected populations implemented via the iglm R package. It relies on standard convex pseudo-likelihood optimization solved by Minorization-Maximization and Quasi-Newton methods, with claimed theoretical guarantees. No load-bearing step reduces by construction to a fitted parameter renamed as a prediction, nor does any uniqueness theorem or ansatz trace exclusively to self-citation chains within the provided text. The approach uses conventional statistical machinery without self-definitional loops or smuggling of assumptions via prior author work that would force the central results. The framework is self-contained against external benchmarks for the stated scope.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, ad-hoc axioms, or invented entities; relies on standard convex optimization and pseudo-likelihood concepts from prior literature.

axioms (1)
  • standard math Standard results from convex optimization and pseudo-likelihood theory apply to the interference setting.
    Invoked implicitly to support scalability and guarantees.

pith-pipeline@v0.9.0 · 5424 in / 1049 out tokens · 63199 ms · 2026-05-10T14:58:14.206806+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

31 extracted references · 9 canonical work pages

  1. [1]

    Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory

    Brown L (1986). Fundamentals of Statistical Exponential Families: With Applications in Statistical Decision Theory. Institute of Mathematical Statistics, Hayworth, CA, USA

  2. [2]

    Bergm : B ayesian Exponential Random Graphs in R

    Caimo A, Friel N (2014). Bergm : B ayesian Exponential Random Graphs in R . Journal of Statistical Software, 61, 1--25

  3. [3]

    R6: Encapsulated Classes with Reference Semantics

    Chang W (2025). R6: Encapsulated Classes with Reference Semantics. doi:10.32614/CRAN.package.R6. R package version 2.6.1, ://CRAN.R-project.org/package=R6

  4. [4]

    igraph : Network Analysis and Visualization in R

    Csárdi G, Nepusz T, Traag V, Horvát S, Zanini F, Noom D, Müller K (2025). igraph : Network Analysis and Visualization in R . doi:10.5281/zenodo.7682609. R package version 2.1.4

  5. [5]

    Seamless R and C++ Integration with Rcpp

    Eddelbuettel D (2013). Seamless R and C++ Integration with Rcpp . Springer, New York. doi:10.1007/978-1-4614-6868-4. ISBN 978-1-4614-6867-7

  6. [6]

    RcppArmadillo: Accelerating R with high-performance C++ linear algebra

    Eddelbuettel D, Sanderson C (2014). RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Computational Statistics and Data Analysis, 71, 1054--1063. doi:10.1016/j.csda.2013.02.005

  7. [7]

    ernm : Exponential-Family Random Network Models

    Fellows I, Clark D (2025). ernm : Exponential-Family Random Network Models . R package version 1.0.4, ://CRAN.R-project.org/package=ernm

  8. [8]

    Exponential-family Random Network Models

    Fellows I, Handcock MS (2012). Exponential-family Random Network Models. Technical report, Department of Statistics, University of California, Los Angeles. ArXiv:1208.0121

  9. [9]

    Testing and modeling dependencies between a network and nodal attributes

    Fosdick BK, Hoff PD (2015). Testing and modeling dependencies between a network and nodal attributes. Journal of the American Statistical Association, 110, 1047--1056

  10. [10]

    iglm: Regression under Interference in Connected Populations

    Fritz C, Schweinberger M (2025). iglm: Regression under Interference in Connected Populations. R package version 1.2.4, ://CRAN.R-project.org/package=iglm

  11. [11]

    A regression framework for studying relationships among attributes under network interference

    Fritz C, Schweinberger M, Bhadra S, Hunter DR (2026). A regression framework for studying relationships among attributes under network interference. Journal of the American Statistical Association. To appear

  12. [12]

    amen: Additive and Multiplicative Effects Models for Networks and Relational Data

    Hoff P, Fosdick B, Volfovsky A (2024). amen: Additive and Multiplicative Effects Models for Networks and Relational Data. doi:10.32614/CRAN.package.amen. R package version 1.4.5, ://CRAN.R-project.org/package=amen

  13. [13]

    Additive and multiplicative effects network models

    Hoff PD (2021). Additive and multiplicative effects network models. Statistical Science, 36, 34--50

  14. [14]

    Inference in curved exponential family models for networks

    Hunter DR, Handcock MS (2006). Inference in curved exponential family models for networks. Journal of Computational and Graphical Statistics, 15, 565--583

  15. [15]

    Attention to the COVID -19 pandemic on Twitter : Partisan differences among U

    Kim T, Nakka N, Gopal I, Desmarais BA, Mancinelli A, Harden JJ, Ko H, Boehmke FJ (2022). Attention to the COVID -19 pandemic on Twitter : Partisan differences among U . S . state legislators . Legislative Studies Quarterly, 47, 1023--1041. doi:10.1111/lsq.12367

  16. [16]

    Bayesian analysis of social influence

    Koskinen J, Daraganova G (2022). Bayesian analysis of social influence. Journal of the Royal Statistical Society Series A: Statistics in Society, 185, 1855--1881

  17. [17]

    ergm 4: New Features for Analyzing Exponential-Family Random Graph Models

    Krivitsky PN, Hunter DR, Morris M, Klumb C (2023). ergm 4: New Features for Analyzing Exponential-Family Random Graph Models. Journal of Statistical Software, 105, 1--44. doi:10.18637/jss.v105.i06

  18. [18]

    vcd: Visualizing Categorical Data

    Meyer D, Zeileis A, Hornik K, Friendly M (2024). vcd: Visualizing Categorical Data. doi:10.32614/CRAN.package.vcd. R package version 1.4-13

  19. [19]

    Co-evolution of social networks and continuous actor attributes

    Niezink NMD, Snijders TAB (2017). Co-evolution of social networks and continuous actor attributes. The Annals of Applied Statistics, 11, 1948--1973

  20. [20]

    Iterative solution of nonlinear equations in several variables

    Ortega JM, Rheinboldt WC (2000). Iterative solution of nonlinear equations in several variables. Society for Industrial and Applied Mathematics, Philadelphia, PA

  21. [21]

    CODA: Convergence Diagnosis and Output Analysis for MCMC

    Plummer M, Best N, Cowles K, Vines K (2006). CODA: Convergence Diagnosis and Output Analysis for MCMC. R News, 6, 7--11

  22. [22]

    R: A Language and Environment for Statistical Computing

    R Core Team (2026). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ://www.R-project.org/

  23. [23]

    On the geometry of discrete exponential families with application to exponential random graph models

    Rinaldo A, Fienberg SE, Zhou Y (2009). On the geometry of discrete exponential families with application to exponential random graph models. Electronic Journal of Statistics, 3, 446--484

  24. [24]

    pROC: an open-source package for R and S+ to analyze and compare ROC curves

    Robin X, Turck N, Hainard A, Tiberti N, Lisacek F, Sanchez JC, Müller M (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics, 12, 77

  25. [25]

    Interaction Data from the Copenhagen Networks Study

    Sapiezynski P, Stopczynski A, Lassen DD, Lehmann S (2019). Interaction Data from the Copenhagen Networks Study . Scientific Data, 6, 315

  26. [26]

    Siena - Simulation Investigation for Empirical Network Analysis

    Snijders TAB, Ripley RM, Boitmanis K, Steglich C, Niezink NMD, Amati V, Schoenenberger F (2025). Siena - Simulation Investigation for Empirical Network Analysis. University of Groningen, Groningen, The Netherlands. R package version 1.5.0, ://www.stats.ox.ac.uk/ snijders/siena/

  27. [27]

    Modeling the co-evolution of networks and behavior

    Snijders TAB, Steglich CEG, Schweinberger M (2007). Modeling the co-evolution of networks and behavior. In K van Montfort, H Oud, A Satorra (eds.), Longitudinal models in the behavioral and related sciences, pp. 41--71. Lawrence Erlbaum

  28. [28]

    PNet: program for the simulation and estimation of exponential random graph models

    Wang P, Robins G, Pattison P, Koskinen J (2009). PNet: program for the simulation and estimation of exponential random graph models. Melbourne School of Psychological Sciences, The University of Melbourne

  29. [29]

    Understanding networks with exponential-family random network models

    Wang Z, Fellows IE, Handcock MS (2024). Understanding networks with exponential-family random network models. Social Networks, 78, 81--91

  30. [30]

    , " * write output.state after.block = add.period write newline

    ENTRY address archive author booktitle chapter collaboration doi edition editor eid eprint howpublished institution isbn issn journal key month note number numpages organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION ...

  31. [31]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in bbl.in capitalize " " * FUNCT...