Simpson's paradox explains the ubiquity of nonlinear, threshold, and complex contagions
Pith reviewed 2026-05-09 18:18 UTC · model grok-4.3
The pith
Simpson's paradox makes linear or sublinear contagions appear nonlinear and threshold-like when data from heterogeneous groups is averaged.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Global threshold dynamics and superlinear complex contagions arise even in populations where agents are distributed across social groups described solely by linear or even sublinear contagions. This effect is a manifestation of Simpson's paradox: incidence data from heterogeneous groups looks superlinear once averaged, because the groups represented at high incidence are disproportionately those with stronger local transmission.
What carries the argument
Simpson's paradox in contagion incidence, where the set of groups contributing data shifts toward higher-transmission groups as incidence rises, producing an apparent nonlinear aggregate response.
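As a hedged illustration of this mechanism (a toy construction assumed here, not taken from the paper), suppose each group g follows a linear SIS rule with transmission rate β_g and recovery rate 1, and is observed at its endemic steady state. The symbols ρ_g (group prevalence), β_g, and λ_g (infection rate per susceptible) are introduced only for this sketch.

```latex
\begin{align}
  \dot{\rho}_g &= \beta_g \rho_g (1 - \rho_g) - \rho_g
    && \text{(linear rule: } \lambda_g = \beta_g \rho_g \text{ per susceptible)} \\
  \rho_g^{*} &= 1 - \frac{1}{\beta_g}
    && \text{(endemic steady state, } \beta_g > 1\text{)} \\
  \lambda_g^{*} &= \beta_g \rho_g^{*} = \frac{\rho_g^{*}}{1 - \rho_g^{*}}
    && \text{(steady-state points pooled across groups)}
\end{align}
```

Within any single group the rate is exactly linear in ρ_g, yet the pooled steady-state points trace the convex curve ρ/(1 − ρ), because high-prevalence observations come only from high-β groups.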
If this is right
- Observed threshold and complex contagion effects do not necessarily require individual-level nonlinearity or social reinforcement.
- Model selection for contagion data risks misattributing heterogeneity-driven artifacts to behavioral mechanisms if group structure is ignored.
- Empirical studies should stratify populations by group to distinguish true complex contagions from Simpson's contagions.
Where Pith is reading between the lines
- Interventions aimed at reducing group-level differences in transmission might flatten apparent nonlinearity more effectively than targeting assumed reinforcement effects.
- The same sampling bias could generate illusory nonlinearity in other heterogeneous systems, such as adoption curves or opinion dynamics.
- Collecting group-stratified longitudinal data would allow direct tests for Simpson's contagions in real populations.
Load-bearing premise
The bias in which groups are observed at high incidence is the dominant driver of the apparent nonlinearity, without other unmodeled correlations or temporal effects changing the aggregate curve.
What would settle it
Disaggregate incidence data by individual social groups, measure the transmission function within each group separately, and test whether all groups remain linear or sublinear while the pooled data shows superlinear growth.
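The test described above can be sketched in code. This is a minimal illustration under stated assumptions, not the paper's procedure: the helper names (`loglog_slope`, `stratified_check`), the tolerance `tol`, and the synthetic data (exactly linear groups with hypothetical rates, each observed in an exposure window that rises with its rate) are all invented for this example. It fits a power law, rate ∝ exposure^α, within each group and in the pooled data, then compares the exponents.

```python
import math
from collections import defaultdict


def loglog_slope(points):
    """Least-squares slope of log(rate) against log(exposure)."""
    xs = [math.log(x) for x, _ in points]
    ys = [math.log(y) for _, y in points]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx


def stratified_check(records, tol=0.05):
    """records: iterable of (group, exposure, rate) with positive values.

    Returns the per-group exponents, the pooled exponent, and a flag that is
    True when every group looks linear or sublinear (exponent <= 1 + tol)
    while the pooled fit is superlinear (exponent > 1 + tol).
    """
    by_group = defaultdict(list)
    pooled = []
    for group, exposure, rate in records:
        by_group[group].append((exposure, rate))
        pooled.append((exposure, rate))
    group_exponents = {g: loglog_slope(pts) for g, pts in by_group.items()}
    pooled_exponent = loglog_slope(pooled)
    flag = (all(a <= 1 + tol for a in group_exponents.values())
            and pooled_exponent > 1 + tol)
    return group_exponents, pooled_exponent, flag


# Synthetic data (hypothetical): each group is exactly linear,
# rate = beta * exposure, but groups with larger beta are observed
# at higher exposures.
records = []
for name, beta, lo in [("A", 1.0, 0.05), ("B", 2.0, 0.25), ("C", 4.0, 0.55)]:
    for k in range(5):
        exposure = lo + 0.05 * k
        records.append((name, exposure, beta * exposure))

group_exponents, pooled_exponent, flag = stratified_check(records)
print(group_exponents)   # each exponent is close to 1
print(pooled_exponent)   # exceeds 1 for this synthetic data
print(flag)
```

A log-log slope is only one possible summary of curvature; with real data, noise and the limited exposure range of each group would call for uncertainty estimates on every exponent before drawing conclusions.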
Original abstract
Complex contagions describe systems where the probability or rate of contagious transmission is a nonlinear function of the exposure to contagious agents. These models were first studied theoretically but have since been used to capture effects such as nonconformism, social reinforcement or peer pressure in empirical data. However, recent studies have shown that local correlations (e.g., group structure or temporal burstiness) and heterogeneity (e.g., diversity of parameters or covariates) can give the illusion of nonlinear effects even when the dynamics is actually linear. We briefly review these studies to inform a new model and explanation for these effective models of complex contagions. We find global threshold dynamics and superlinear complex contagions even in populations where agents are distributed across social groups described solely by linear or even sublinear contagions. This effect can be understood as a manifestation of Simpson's paradox. Incidence data from heterogeneous groups can look superlinear once averaged over all groups, since the sampling of groups represented at high incidence is biased towards those with stronger local transmission. We then define what we call a Simpson's contagion: a contagion process that looks superlinear when observed over an entire population, but is mechanistically linear or even sublinear in all of its subgroups. By exploring these Simpson's contagions over mathematical case studies, our work contributes to the growing body of literature on the ubiquity of threshold and complex contagions as effective models, and our results stress the pitfall of model selection that ignores correlations and heterogeneity in populations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that Simpson's paradox arising from heterogeneous group transmission rates and biased sampling of stronger-transmitting groups at high incidence can produce apparent threshold, superlinear, and complex contagion dynamics at the aggregate level, even when every subgroup follows strictly linear or sublinear local rules. The authors review prior work on heterogeneity-induced illusions, define 'Simpson's contagions,' and illustrate the mechanism through mathematical case studies of populations partitioned into groups with differing transmission parameters.
Significance. If the mechanism is shown to be prevalent, the result would be significant for contagion modeling: it supplies a parameter-free, heterogeneity-driven explanation for why aggregate data frequently favor nonlinear effective models without requiring nonlinear local rules. The case studies provide clear, reproducible illustrations of how averaging over heterogeneous linear processes yields nonlinear population curves, extending the literature on effective complex contagions and cautioning against direct inference of local nonlinearity from global incidence data.
major comments (2)
- [Abstract] The central claim that Simpson's paradox 'explains the ubiquity' of nonlinear/threshold/complex contagions is not supported by the mathematical case studies. These demonstrate that the illusion is possible under heterogeneous linear rules, but the manuscript provides no empirical comparison, no quantification of its contribution relative to temporal burstiness or genuine nonlinearities, and no test showing that this bias dominates in real populations (as flagged in the weakest-assumption note).
- [Abstract and mathematical case studies] Definition of a Simpson's contagion: biased sampling toward stronger-transmission groups at high incidence is described only qualitatively. The manuscript does not derive or state the general conditions on the distribution of group rates and incidence sampling probabilities that guarantee superlinear or threshold-like aggregates rather than mild concavity or convexity.
minor comments (2)
- [Introduction/review section] The review of prior studies on local correlations and heterogeneity is useful but would benefit from explicit citations to the specific equations or figures in those works that demonstrate similar illusions.
- [Mathematical case studies] Notation for incidence curves and group-level rates is introduced without a consolidated table or equation list, making it harder to compare the linear subgroup rules to the resulting aggregate curves across case studies.
Simulated Author's Rebuttal
We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating where revisions have been made to the manuscript. Our responses focus on clarifying the scope of our theoretical results while strengthening the presentation where possible.
Point-by-point responses
- Referee: [Abstract] The central claim that Simpson's paradox 'explains the ubiquity' of nonlinear/threshold/complex contagions is not supported by the mathematical case studies. These demonstrate that the illusion is possible under heterogeneous linear rules, but the manuscript provides no empirical comparison, no quantification of its contribution relative to temporal burstiness or genuine nonlinearities, and no test showing that this bias dominates in real populations (as flagged in the weakest-assumption note).
  Authors: We agree that the manuscript is theoretical in nature and uses mathematical case studies to establish that Simpson's paradox can produce apparent nonlinear, threshold, and complex contagion patterns from strictly linear or sublinear subgroup dynamics. The paper does not include new empirical analyses or direct quantification of the mechanism's prevalence relative to alternatives such as temporal burstiness. However, the contribution lies in identifying a general, parameter-free mechanism driven by common population heterogeneity that can generate these patterns without invoking local nonlinearity. In the revised manuscript, we have softened the abstract language from 'explains the ubiquity' to 'provides a potential explanation for the observed ubiquity' and added a discussion paragraph comparing the mechanism to other heterogeneity-induced effects, with references to related literature. We note that empirical validation would require additional data and is left for future work.
  Revision: partial
- Referee: [Abstract and mathematical case studies] Definition of a Simpson's contagion: biased sampling toward stronger-transmission groups at high incidence is described only qualitatively. The manuscript does not derive or state the general conditions on the distribution of group rates and incidence sampling probabilities that guarantee superlinear or threshold-like aggregates rather than mild concavity or convexity.
  Authors: We thank the referee for this suggestion. The original manuscript relied on specific case studies to illustrate the effect. In the revised version, we have added a new subsection that derives the general conditions. Assuming groups ordered by increasing transmission rates β_i and an incidence-dependent sampling probability p_i(I) that is nondecreasing in β_i (stronger groups overrepresented at higher incidence), we state sufficient conditions on the variance of the β distribution and the form of p_i(I) under which the effective aggregate transmission function is strictly increasing in incidence I. This yields superlinear or threshold-like behavior in the population-level incidence curve, with explicit inequalities that distinguish it from cases producing only mild concavity or convexity.
  Revision: yes
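One way to read the condition described in the response above, sketched here under assumptions rather than taken from the revised manuscript: let the sampling weights p_i(I) sum to one at every incidence I, and define an effective rate β_eff(I) and an aggregate rate λ(I) (both names are introduced only for this sketch).

```latex
\begin{align}
  \beta_{\mathrm{eff}}(I) &= \sum_i p_i(I)\,\beta_i,
    \qquad \sum_i p_i(I) = 1, \\
  \lambda(I) &= \beta_{\mathrm{eff}}(I)\, I, \\
  \frac{d}{dI}\!\left[\frac{\lambda(I)}{I}\right]
    &= \sum_i \beta_i\, \frac{dp_i}{dI}
     = \sum_i \bigl(\beta_i - \beta_{\mathrm{eff}}(I)\bigr)\, \frac{dp_i}{dI}.
\end{align}
```

The last equality uses the fact that the derivatives of the weights sum to zero. The aggregate rate per unit incidence grows with I whenever weight shifts toward groups with above-average β_i as I increases, which is the superlinear signature. If all β_i are equal, the derivative vanishes and the aggregate stays linear.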
Circularity Check
No significant circularity; the derivation is a self-contained mathematical demonstration.
full rationale
The paper constructs its central result through explicit mathematical case studies of heterogeneous groups with linear or sublinear local contagion rules. Global superlinearity emerges directly from the averaging properties and biased sampling of high-incidence groups (stronger-transmission groups dominate the aggregate at high incidence). This is a first-principles consequence of heterogeneity and does not involve fitting parameters to the target aggregate curve, self-referential definitions, or load-bearing self-citations. The derivation chain remains independent of the claimed explanation for real-world data.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: Social groups differ in their baseline transmission rates
- domain assumption: Incidence data is observed at the population level without group labels
invented entities (1)
- Simpson's contagion: no independent evidence