arxiv: 2605.00791 · v1 · submitted 2026-05-01 · ⚛️ physics.soc-ph

Simpson's paradox explains the ubiquity of nonlinear, threshold, and complex contagions

Laurent Hébert-Dufresne, Antoine Allard, Jean-Gabriel Young, William H. W. Thompson, Guillaume St-Onge

Pith reviewed 2026-05-09 18:18 UTC · model grok-4.3

keywords Simpson's paradox · complex contagions · threshold dynamics · heterogeneity · social contagion · nonlinear dynamics · group structure

The pith

Simpson's paradox makes linear or sublinear contagions appear nonlinear and threshold-like when data from heterogeneous groups is averaged.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that global threshold dynamics and superlinear complex contagions emerge in population-level data even when every social subgroup follows only linear or sublinear transmission rules. This occurs because groups with stronger local transmission are overrepresented among the observations made at high incidence, which biases the aggregate curve upward. This matters because many reported instances of social reinforcement, peer pressure, or nonconformism could be statistical artifacts of ignored heterogeneity rather than genuine behavioral mechanisms. The authors define a Simpson's contagion as any process that is mechanistically linear or sublinear inside every group yet appears superlinear in the full population.

Core claim

Global threshold dynamics and superlinear complex contagions arise even in populations where agents are distributed across social groups described solely by linear or even sublinear contagions; this effect is a manifestation of Simpson's paradox because incidence data from heterogeneous groups looks superlinear once averaged, since the sampling of groups represented at high incidence is biased towards those with stronger local transmission.

What carries the argument

Simpson's paradox in contagion incidence, where the set of groups contributing data shifts toward higher-transmission groups as incidence rises, producing an apparent nonlinear aggregate response.
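The sampling bias described above can be illustrated with a small exact calculation. The sketch below is not the paper's model: it assumes two hypothetical group types with local rates `lams`, a fixed group size `n`, and an illustrative rule `prevalence(lam) = lam / (1 + lam)` for how likely each member of a group is to be infected. Every group follows a strictly linear rule (rate `lam * i`), yet the pooled rate per infected member grows with `i` because groups seen at high `i` are mostly the strong ones.

```python
from math import comb

def pooled_rates(lams, weights, n, prevalence):
    """Pooled infection rate at each infected count i = 1..n-1.

    Inside a group with parameter lam the rate is exactly linear: lam * i.
    The pooled rate averages lam over the groups that are actually
    observed with i infected members.
    """
    rates = []
    for i in range(1, n):
        # Share of each group type among groups observed with i infected
        # (binomial likelihood times the type's population weight).
        post = [
            w * comb(n, i) * prevalence(lam) ** i * (1 - prevalence(lam)) ** (n - i)
            for lam, w in zip(lams, weights)
        ]
        total = sum(post)
        mean_lam = sum(lam * p for lam, p in zip(lams, post)) / total
        rates.append(mean_lam * i)
    return rates

# Hypothetical setup (assumption): two equally common group types, size 10.
lams = [0.5, 2.0]
weights = [0.5, 0.5]
n = 10
rates = pooled_rates(lams, weights, n, lambda lam: lam / (1 + lam))

# Rate per infected member: constant within any single group,
# but increasing with i once groups are pooled.
per_infected = [r / i for i, r in enumerate(rates, start=1)]
for i, (r, p) in enumerate(zip(rates, per_infected), start=1):
    print(f"i={i}: pooled rate={r:.3f}, rate per infected={p:.3f}")
```

In this toy setup the pooled rate per infected member climbs from close to the weak groups' value at low `i` to close to the strong groups' value at high `i`, which is the threshold-like shape the paper attributes to Simpson's paradox.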

If this is right

Observed threshold and complex contagion effects do not necessarily require individual-level nonlinearity or social reinforcement.
Model selection for contagion data risks misattributing heterogeneity-driven artifacts to behavioral mechanisms if group structure is ignored.
Empirical studies should stratify populations by group to distinguish true complex contagions from Simpson's contagions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Interventions aimed at reducing group-level differences in transmission might flatten apparent nonlinearity more effectively than targeting assumed reinforcement effects.
The same sampling bias could generate illusory nonlinearity in other heterogeneous systems, such as adoption curves or opinion dynamics.
Collecting group-stratified longitudinal data would allow direct tests for Simpson's contagions in real populations.

Load-bearing premise

The bias in which groups are observed at high incidence is the dominant driver of the apparent nonlinearity, without other unmodeled correlations or temporal effects changing the aggregate curve.

What would settle it

Disaggregate incidence data by individual social groups, measure the transmission function within each group separately, and test whether all groups remain linear or sublinear while the pooled data shows superlinear growth.
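A minimal sketch of such a stratified test follows. The data are synthetic and invented purely for illustration (two hypothetical groups with rates `0.5 * i` and `2.0 * i`, observed over different ranges of `i`); the helper `loglog_slope` is likewise an assumption, not a procedure from the paper. A log-log slope of 1 indicates a linear rule, and a slope above 1 indicates apparent superlinearity.

```python
from math import log

def loglog_slope(points):
    """Least-squares slope of log(rate) against log(i)."""
    xs = [log(i) for i, _ in points]
    ys = [log(rate) for _, rate in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Synthetic observations (assumption): each tuple is (infected count, rate).
# The weak group is only seen at low incidence, the strong group at high
# incidence, mimicking the sampling bias described in the text.
weak = [(i, 0.5 * i) for i in range(1, 6)]
strong = [(i, 2.0 * i) for i in range(4, 11)]

slope_weak = loglog_slope(weak)
slope_strong = loglog_slope(strong)
slope_pooled = loglog_slope(weak + strong)

print(f"within weak group:   {slope_weak:.3f}")
print(f"within strong group: {slope_strong:.3f}")
print(f"pooled:              {slope_pooled:.3f}")
```

Each stratum recovers a slope of 1, while the pooled fit returns a slope well above 1. With real data the within-group fits would carry noise, so a comparison of this kind would need confidence intervals rather than point estimates.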

Figures

Figures reproduced from arXiv: 2605.00791 by Antoine Allard, Guillaume St-Onge, Jean-Gabriel Young, Laurent Hébert-Dufresne, William H. W. Thompson.

**Figure 4.** Simpson's contagion from correlated λ and ν. We model correlations between λ and ν with a normal copula, assuming a uniform marginal density over [0, 10] for λ and over [0, 1] for ν. The parameter ρ then controls correlations between the two parameters, which can be positive (ρ > 0), null (ρ = 0), or negative (ρ < 0). We show numerical estimates of the effective kernel variation δβ̃(n,i) (left panel) and …
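The sampling scheme in the caption can be sketched as follows. This is a generic normal (Gaussian) copula construction, not the authors' code: the function name `sample_lambda_nu`, the sample size, and the seed are assumptions, and only the marginals (uniform on [0, 10] for λ, uniform on [0, 1] for ν) and the role of ρ come from the caption.

```python
import random
from math import sqrt
from statistics import NormalDist

def sample_lambda_nu(rho, size, seed=0):
    """Draw (lambda, nu) pairs coupled through a normal copula.

    rho is the correlation of the underlying bivariate normal;
    the marginals are uniform on [0, 10] and [0, 1].
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    pairs = []
    for _ in range(size):
        # Correlated standard normals.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # The normal CDF maps each coordinate to a uniform on [0, 1].
        u1, u2 = std_normal.cdf(z1), std_normal.cdf(z2)
        pairs.append((10.0 * u1, u2))
    return pairs

def pearson(pairs):
    """Sample Pearson correlation of the two coordinates."""
    xs = [x for x, _ in pairs]
    ys = [y for _, y in pairs]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

positive = sample_lambda_nu(0.8, 5000)
null = sample_lambda_nu(0.0, 5000)
negative = sample_lambda_nu(-0.8, 5000)
for name, pairs in [("rho=0.8", positive), ("rho=0", null), ("rho=-0.8", negative)]:
    print(f"{name}: sample correlation = {pearson(pairs):.2f}")
```

The copula leaves both marginals uniform while ρ sets the sign and strength of the dependence between λ and ν, which matches the three regimes named in the caption.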

Original abstract

Complex contagions describe systems where the probability or rate of contagious transmission is a nonlinear function of the exposure to contagious agents. These models were first studied theoretically but have since been used to capture effects such as nonconformism, social reinforcement or peer pressure in empirical data. However, recent studies have shown that local correlations (e.g., group structure or temporal burstiness) and heterogeneity (e.g., diversity of parameters or covariates) can give the illusion of nonlinear effects even when the dynamics is actually linear. We briefly review these studies to inform a new model and explanation for these effective models of complex contagions. We find global threshold dynamics and superlinear complex contagions even in populations where agents are distributed across social groups described solely by linear or even sublinear contagions. This effect can be understood as a manifestation of Simpson's paradox. Incidence data from heterogeneous groups can look superlinear once averaged over all groups, since the sampling of groups represented at high incidence is biased towards those with stronger local transmission. We then define what we call a Simpson's contagion: a contagion process that looks superlinear when observed over an entire population, but is mechanistically linear or even sublinear in all of its subgroups. By exploring these Simpson's contagions over mathematical case studies, our work contributes to the growing body of literature on the ubiquity of threshold and complex contagions as effective models, and our results stress the pitfall of model selection that ignores correlations and heterogeneity in populations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that Simpson's paradox arising from heterogeneous group transmission rates and biased sampling of stronger-transmitting groups at high incidence can produce apparent threshold, superlinear, and complex contagion dynamics at the aggregate level, even when every subgroup follows strictly linear or sublinear local rules. The authors review prior work on heterogeneity-induced illusions, define 'Simpson's contagions,' and illustrate the mechanism through mathematical case studies of populations partitioned into groups with differing transmission parameters.

Significance. If the mechanism is shown to be prevalent, the result would be significant for contagion modeling: it supplies a parameter-free, heterogeneity-driven explanation for why aggregate data frequently favor nonlinear effective models without requiring nonlinear local rules. The case studies provide clear, reproducible illustrations of how averaging over heterogeneous linear processes yields nonlinear population curves, extending the literature on effective complex contagions and cautioning against direct inference of local nonlinearity from global incidence data.

major comments (2)

[Abstract] Abstract: the central claim that Simpson's paradox 'explains the ubiquity' of nonlinear/threshold/complex contagions is not supported by the mathematical case studies. These demonstrate that the illusion is possible under heterogeneous linear rules, but the manuscript provides no empirical comparison, quantification of relative contribution versus temporal burstiness or genuine nonlinearities, or test showing this bias dominates in real populations (as flagged in the weakest-assumption note).
[Abstract and mathematical case studies] The definition of a Simpson's contagion (abstract and case-study sections): while biased sampling toward stronger-transmission groups at high incidence is described qualitatively, the manuscript does not derive or state the general conditions on the distribution of group rates and incidence sampling probabilities that guarantee superlinear or threshold-like aggregates rather than mild concavity or convexity.

minor comments (2)

[Introduction/review section] The review of prior studies on local correlations and heterogeneity is useful but would benefit from explicit citations to the specific equations or figures in those works that demonstrate similar illusions.
[Mathematical case studies] Notation for incidence curves and group-level rates is introduced without a consolidated table or equation list, making it harder to compare the linear subgroup rules to the resulting aggregate curves across case studies.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments on our manuscript. We address each major comment point by point below, indicating where revisions have been made to the manuscript. Our responses focus on clarifying the scope of our theoretical results while strengthening the presentation where possible.

Referee: [Abstract] Abstract: the central claim that Simpson's paradox 'explains the ubiquity' of nonlinear/threshold/complex contagions is not supported by the mathematical case studies. These demonstrate that the illusion is possible under heterogeneous linear rules, but the manuscript provides no empirical comparison, quantification of relative contribution versus temporal burstiness or genuine nonlinearities, or test showing this bias dominates in real populations (as flagged in the weakest-assumption note).

Authors: We agree that the manuscript is theoretical in nature and uses mathematical case studies to establish that Simpson's paradox can produce apparent nonlinear, threshold, and complex contagion patterns from strictly linear or sublinear subgroup dynamics. The paper does not include new empirical analyses or direct quantification of the mechanism's prevalence relative to alternatives such as temporal burstiness. However, the contribution lies in identifying a general, parameter-free mechanism driven by common population heterogeneity that can generate these patterns without invoking local nonlinearity. In the revised manuscript, we have softened the abstract language from 'explains the ubiquity' to 'provides a potential explanation for the observed ubiquity' and added a discussion paragraph comparing the mechanism to other heterogeneity-induced effects, with references to related literature. We note that empirical validation would require additional data and is left for future work.

revision: partial

Referee: [Abstract and mathematical case studies] The definition of a Simpson's contagion (abstract and case-study sections): while biased sampling toward stronger-transmission groups at high incidence is described qualitatively, the manuscript does not derive or state the general conditions on the distribution of group rates and incidence sampling probabilities that guarantee superlinear or threshold-like aggregates rather than mild concavity or convexity.

Authors: We thank the referee for this suggestion. The original manuscript relied on specific case studies to illustrate the effect. In the revised version, we have added a new subsection that derives the general conditions. Assuming groups ordered by increasing transmission rates β_i and an incidence-dependent sampling probability p_i(I) that is nondecreasing in β_i (stronger groups overrepresented at higher incidence), we state the sufficient conditions on the variance of the β distribution and the form of p_i(I) under which the effective aggregate transmission function is strictly increasing in incidence I. This yields superlinear or threshold-like behavior in the population-level incidence curve, with explicit inequalities provided to distinguish from cases producing only mild concavity or convexity.

revision: yes

Circularity Check

0 steps flagged

No significant circularity; the derivation is a self-contained mathematical demonstration

The paper constructs its central result through explicit mathematical case studies of heterogeneous groups with linear or sublinear local contagion rules. Global superlinearity emerges directly from the averaging properties and biased sampling of high-incidence groups (stronger-transmission groups dominate the aggregate at high incidence). This is a first-principles consequence of heterogeneity and does not involve fitting parameters to the target aggregate curve, self-referential definitions, or load-bearing self-citations. The derivation chain remains independent of the claimed explanation for real-world data.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on the mathematical property that heterogeneous linear transmission rates plus incidence-based sampling produce nonlinear aggregates; this is shown via case studies rather than new axioms.

axioms (2)

domain assumption: Social groups differ in their baseline transmission rates. Required to generate the sampling bias at high incidence; stated in the model description.
domain assumption: Incidence data is observed at the population level without group labels. Necessary for the aggregation step that creates the paradox.

invented entities (1)

Simpson's contagion (no independent evidence)
purpose: Label for a process that is linear or sublinear within subgroups but appears superlinear at population level.
New term coined to describe the effective dynamics; no independent evidence provided beyond the mathematical examples.
