Estimating Treatment and Spillover Effects with the Ego-Cluster Experimental Design
Pith reviewed 2026-05-09 19:23 UTC · model grok-4.3
The pith
The ego-cluster experimental design partitions networks into focal clusters to estimate both global treatment effects and spillover effects without bias from interference.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Under the ego-cluster design, the network is partitioned into clusters, each consisting of an ego and its alters, and treatment is randomized at the cluster level. Model-based estimators of the global treatment effect and the spillover effect are consistent and asymptotically normal, with asymptotic variances governed by the ego-cluster structure. An ego-clustering algorithm then selects egos and assigns alters sequentially to minimize those variances.
What carries the argument
Ego-cluster randomization, which partitions the network into focal units (egos) plus their immediate neighbors (alters) and performs treatment assignment at the cluster level, thereby separating direct effects from spillover effects.
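As a rough illustration of the mechanics described here, the following Python sketch builds ego-clusters from a given list of egos and then randomizes at the cluster level. It makes several assumptions that are not taken from the paper: the function names `build_ego_clusters` and `randomize_clusters` are hypothetical, egos are supplied by the caller rather than chosen by the paper's algorithm, each alter joins the first ego that claims it, every member of a cluster receives the same arm (one possible reading of "randomization at the cluster level"), and units adjacent to no selected ego are simply left out.

```python
import random


def build_ego_clusters(adjacency, egos):
    """Map each ego to the neighbors it claims as alters.

    Assumption: a unit joins the first ego (in list order) that claims it,
    and egos are never used as alters of another ego.
    """
    claimed = set(egos)
    clusters = {}
    for ego in egos:
        alters = [v for v in sorted(adjacency[ego]) if v not in claimed]
        claimed.update(alters)
        clusters[ego] = alters
    return clusters


def randomize_clusters(clusters, p=0.5, seed=0):
    """Draw one Bernoulli(p) arm per cluster and copy it to every member."""
    rng = random.Random(seed)
    treatment = {}
    for ego, alters in clusters.items():
        arm = 1 if rng.random() < p else 0
        treatment[ego] = arm
        for v in alters:
            treatment[v] = arm
    return treatment


# Toy undirected network stored as a dict of neighbor sets.
adjacency = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
clusters = build_ego_clusters(adjacency, egos=[0, 3])
# clusters == {0: [1, 2], 3: [4]}
treatment = randomize_clusters(clusters, seed=7)
```

Because unit 2 is adjacent to both egos but is claimed by ego 0, the order in which egos are processed changes the partition, which is why the selection and assignment order matters for the resulting variances.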
If this is right
- The estimators are consistent and asymptotically normal under the stated model-based framework.
- Asymptotic variances are explicitly determined by the ego-cluster structure, enabling optimization through the proposed clustering algorithm.
- The design produces more accurate inference for both global treatment and spillover effects than existing network experimental designs.
- Simulation studies and empirical applications confirm efficiency gains over alternatives.
Where Pith is reading between the lines
- The same clustering logic could be adapted to networks observed at multiple time points by updating clusters dynamically.
- Extensions to heterogeneous treatment effects or multi-level treatments would require only modest changes to the variance-minimization step.
- Field experiments on online social platforms could directly compare the ego-cluster design against complete randomization to measure realized precision gains.
- If the interference model is misspecified, the reported asymptotic variances may understate true uncertainty, suggesting a diagnostic based on comparing design-based and model-based variance estimates.
Load-bearing premise
The network interference follows a model-based structure that lets ego-cluster partitioning separate global treatment effects from spillover effects, with the clustering algorithm correctly minimizing the resulting asymptotic variances.
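To make the premise concrete, one common way to write such a structure is a linear model in a unit's own treatment and the treated fraction of its neighbors. This is an illustrative assumption, not necessarily the paper's specification; the symbols $\alpha$, $\tau$, $\gamma$, the adjacency entries $A_{ij}$, and the degree $d_i \ge 1$ are introduced here only for the sketch.

```latex
Y_i = \alpha + \tau Z_i + \gamma \, \frac{1}{d_i} \sum_{j} A_{ij} Z_j + \varepsilon_i,
\qquad
\mathrm{GTE}
  = \mathbb{E}\left[ Y_i \mid Z = \mathbf{1} \right]
  - \mathbb{E}\left[ Y_i \mid Z = \mathbf{0} \right]
  = \tau + \gamma .
```

Under this form, $\gamma$ plays the role of the spillover effect and $\tau + \gamma$ the global treatment effect, since treating everyone sets both $Z_i$ and the neighbor fraction to one. If outcomes also depended on treatments two or more hops away, this form would be misspecified.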
What would settle it
Run the proposed ego-clustering algorithm on a network with known interference structure and simulated outcomes. The claim would be undermined if the estimators failed to achieve consistency or exhibited larger finite-sample variance than standard cluster randomization.
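A check of this kind could be sketched as a small Monte Carlo study. The Python example below is a simplified stand-in, not the paper's procedure: it assumes a ring network split into three-unit clusters (an ego with its two neighbors), a linear outcome model with own-treatment coefficient `tau` and neighbor-exposure coefficient `gamma`, independent standard normal noise, and plain least squares as the estimator. All function names and parameter values are hypothetical.

```python
import random


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(rows, y):
    """Least-squares coefficients from the normal equations X'X b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate_once(n_clusters, tau, gamma, rng):
    """One experiment on a ring; cluster c holds units 3c, 3c+1, 3c+2."""
    n = 3 * n_clusters
    arm = [rng.randint(0, 1) for _ in range(n_clusters)]  # one draw per cluster
    z = [arm[i // 3] for i in range(n)]
    rows, y = [], []
    for i in range(n):
        # Exposure: treated fraction of the two ring neighbors.
        exposure = (z[(i - 1) % n] + z[(i + 1) % n]) / 2
        rows.append([1.0, z[i], exposure])
        y.append(1.0 + tau * z[i] + gamma * exposure + rng.gauss(0, 1))
    _, tau_hat, gamma_hat = ols(rows, y)
    # In this assumed model the global effect is tau + gamma.
    return tau_hat + gamma_hat, gamma_hat


def monte_carlo(reps=50, n_clusters=200, tau=2.0, gamma=1.0, seed=1):
    """Average the global-effect and spillover estimates over replications."""
    rng = random.Random(seed)
    draws = [simulate_once(n_clusters, tau, gamma, rng) for _ in range(reps)]
    mean_gte = sum(d[0] for d in draws) / reps
    mean_spill = sum(d[1] for d in draws) / reps
    return mean_gte, mean_spill
```

Calling `monte_carlo()` and comparing the averages with `tau + gamma` and `gamma` gives a basic bias check. Repeating the run with a different partition of the ring, and comparing the spread of the estimates across replications, would address the variance half of the test.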
Original abstract
Network interference occurs when a unit's outcome depends not only on its own treatment but also on the treatments received by connected units in the network. Experimental designs and analysis methods that ignore such interference can yield biased estimators of causal effects. In this paper, we develop a new experimental design for the estimation and inference of global treatment effect and spillover effect under a model-based framework and ego-cluster randomization. Under this design, the network is partitioned into a collection of ego-clusters, each consisting of a focal unit (the ego) and its network neighbors (the alters), with randomization conducted at the cluster level. We propose model-based estimators for the global treatment effect and spillover effect and establish their consistency and asymptotic normality, with asymptotic variances determined by the ego-cluster structure. Building on these theoretical results, we introduce an ego-clustering algorithm that sequentially selects egos and assigns alters to minimize asymptotic variances. Simulation studies and two empirical applications demonstrate that the proposed procedure yields accurate inference and efficiency improvements over existing network experimental designs.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces a novel ego-cluster experimental design to estimate global treatment effects and spillover effects in the presence of network interference. The approach partitions the network into clusters each consisting of an ego and its neighboring alters, performs randomization at the cluster level, and develops model-based estimators whose consistency and asymptotic normality are established, with asymptotic variances explicitly determined by the ego-cluster structure. An algorithm is proposed for sequentially selecting egos and assigning alters to minimize these asymptotic variances. The theoretical results are complemented by simulation studies and two empirical applications that demonstrate accurate inference and efficiency gains compared to existing designs.
Significance. This manuscript makes a meaningful contribution to the literature on causal inference under network interference by providing a design that balances theoretical guarantees with practical implementation via the variance-minimizing clustering algorithm. The model-based framework allows for clean separation of effects and derivation of asymptotic properties, which is a strength when the assumptions hold. The inclusion of reproducible simulation studies and real-data applications enhances the paper's impact. If the central claims are verified, it could influence how experiments are designed in social networks and other interconnected systems.
major comments (2)
- [§3 (Theoretical Results)] The consistency and asymptotic normality of the estimators are derived assuming a fixed ego-cluster structure; however, because the clustering algorithm selects clusters based on the observed network to minimize variance, it is important to confirm that these asymptotic properties continue to hold when the partition is data-dependent. This could affect the validity of the inference procedures.
- [§4 (Clustering Algorithm)] The sequential selection procedure is presented as minimizing the asymptotic variances, but without a proof of optimality or bounds on the approximation error relative to the global minimum, the claimed efficiency improvements may not be fully realized in all networks.
minor comments (3)
- [Abstract] Consider specifying the types of networks or contexts in the two empirical applications to provide immediate context for the results.
- [Simulation studies] Include details on the parameter values used in the data-generating process and the number of Monte Carlo replications to allow for better reproducibility.
- [Notation and setup] Ensure that the definitions of the global treatment effect and spillover effect are clearly distinguished from standard average treatment effects early in the paper.
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments. We address each major comment below, indicating planned revisions where appropriate.
- Referee: [§3 (Theoretical Results)] The consistency and asymptotic normality of the estimators are derived assuming a fixed ego-cluster structure; however, because the clustering algorithm selects clusters based on the observed network to minimize variance, it is important to confirm that these asymptotic properties continue to hold when the partition is data-dependent. This could affect the validity of the inference procedures.
  Authors: We appreciate this observation. In the experimental setting, the network is observed prior to randomization and is regarded as fixed. The ego-clustering algorithm uses this fixed network to produce a deterministic partition, after which randomization occurs at the cluster level. The consistency and asymptotic normality results are derived conditional on the ego-cluster structure. We will add a clarifying statement in Section 3 to make this conditioning explicit and confirm that the asymptotic properties and inference procedures remain valid under the data-dependent but pre-randomization clustering.
  Revision: yes
- Referee: [§4 (Clustering Algorithm)] The sequential selection procedure is presented as minimizing the asymptotic variances, but without a proof of optimality or bounds on the approximation error relative to the global minimum, the claimed efficiency improvements may not be fully realized in all networks.
  Authors: The algorithm is a greedy sequential procedure that iteratively selects egos and assigns alters to reduce the asymptotic variances in a computationally tractable way. We do not claim global optimality, as identifying the exact variance-minimizing partition is a combinatorial problem that is intractable for large networks. We will revise Section 4 to describe the procedure more precisely as a practical heuristic and will expand the discussion to note the absence of approximation bounds while emphasizing that the simulation studies demonstrate consistent efficiency gains relative to existing designs.
  Revision: partial
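The shape of such a greedy sequential heuristic can be sketched in Python. This is not the paper's algorithm: the real objective is the asymptotic variance, whose form is not reproduced on this page, so the sketch takes a pluggable `score` function and defaults to a placeholder (the number of still-unassigned neighbors). The function name `greedy_ego_clustering` and the tie-breaking rule are likewise assumptions.

```python
def greedy_ego_clustering(adjacency, score=None):
    """Greedily pick egos and attach their unassigned neighbors as alters.

    `score(ego, alters)` is a stand-in for the variance-based criterion;
    larger is treated as better. Ties go to the smallest unit label.
    """
    if score is None:
        # Placeholder objective: prefer egos with many available alters.
        score = lambda ego, alters: len(alters)

    unassigned = set(adjacency)
    clusters = {}
    while unassigned:
        best_ego, best_alters, best_score = None, [], None
        for v in sorted(unassigned):
            alters = sorted(u for u in adjacency[v] if u in unassigned)
            s = score(v, alters)
            if best_score is None or s > best_score:
                best_ego, best_alters, best_score = v, alters, s
        clusters[best_ego] = best_alters
        unassigned.discard(best_ego)
        unassigned.difference_update(best_alters)
    return clusters


# Toy network: a hub (unit 0) attached to a short path 3-4-5.
adjacency = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0, 4}, 4: {3, 5}, 5: {4}}
clusters = greedy_ego_clustering(adjacency)
# clusters == {0: [1, 2, 3], 4: [5]}
```

Each step commits to the locally best ego and never revisits the choice, which keeps the cost low but is exactly why no guarantee relative to the global minimum follows without further argument.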
Circularity Check
No significant circularity identified
The paper's derivation chain starts from the ego-cluster randomization design and a model-based interference structure, then derives model-based estimators whose consistency and asymptotic normality (with variances explicitly determined by cluster structure) follow from standard M-estimation or similar arguments under the stated assumptions. The ego-clustering algorithm is then defined to minimize those derived asymptotic variances. No equation reduces by construction to a fitted input, no prediction is statistically forced from a subset of the target data, and no load-bearing uniqueness theorem or ansatz is imported via self-citation. The central claims remain independent of the quantities they estimate.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Network interference admits a model-based representation allowing consistent separation of global treatment effects from spillovers through ego-cluster randomization.