arxiv: 1906.11709 · v1 · pith:AI5XG666new · submitted 2019-06-27 · 🧮 math.PR · q-bio.PE

The minimal observable clade size of exchangeable coalescents

Pith reviewed 2026-05-25 14:41 UTC · model grok-4.3

classification 🧮 math.PR q-bio.PE
keywords Lambda-coalescents · clade size · mutations · asymptotics · moment recursion · genealogical models · exchangeable coalescents

The pith

In Lambda-n-coalescents with mutation the first shared-mutation block size O_n admits asymptotics and a moment recursion.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper studies the random variable O_n, the size of the partition block containing a fixed sample member at the time the first mutation shared with at least one other sampled individual appears on its ancestral line in a Lambda-n-coalescent. It provides asymptotics for O_n as the sample size n tends to infinity and a recursion that yields every moment of O_n for any finite n. Because O_n can be read off from shared mutations and upper-bounds the minimal clade size, which is not observable in real data, the results give a computable proxy for that quantity in genealogical model selection.

Core claim

For Lambda-n-coalescents with mutation, the paper gives asymptotics as n → ∞ and a recursion for all moments at finite n for O_n, the size of the block containing a given individual at the time the first mutation shared with another individual appears on its lineage. O_n, the minimal observable clade size, is an upper bound on the minimal clade size, which is not observable in real data.

What carries the argument

The random variable O_n, the size of the partition block containing a fixed individual at the first shared-mutation time on its lineage.
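
The definition above can be explored numerically. The following is a minimal Python sketch, not the paper's method: it simulates the jump chain of a Lambda-n-coalescent and returns, for one tracked individual, the minimal clade size M_n (block size right after the first merger involving that individual) and the block size O_n at the first shared mutation. The function names, the per-lineage mutation rate theta/2, the Beta(2 - alpha, alpha) example, and the convention O_n = n when no shared mutation occurs below the root are assumptions made for illustration.

```python
import math
import random


def kingman_rate(b, k):
    """Rate of a given k-merger among b blocks for Lambda = delta_0 (Kingman)."""
    return 1.0 if k == 2 else 0.0


def beta_rate(alpha):
    """Rate of a given k-merger among b blocks for Lambda = Beta(2 - alpha, alpha)."""
    def log_beta(x, y):
        return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)

    def rate(b, k):
        return math.exp(log_beta(k - alpha, b - k + alpha) - log_beta(2 - alpha, alpha))

    return rate


def simulate_clade_sizes(n, theta, rate, rng):
    """Simulate one tree and return (M_n, O_n) for the tracked individual.

    sizes[0] is always the size of the block containing the tracked individual.
    Only the order of events matters here, so the jump chain is simulated
    instead of the event times.
    """
    sizes = [1] * n
    minimal = None  # M_n, set at the first merger involving the tracked block
    while len(sizes) > 1:
        b = len(sizes)
        merger_rates = [math.comb(b, k) * rate(b, k) for k in range(2, b + 1)]
        total_merge = sum(merger_rates)
        # A mutation on the tracked lineage is shared only once its block has >= 2 members.
        mut = theta / 2 if sizes[0] > 1 else 0.0
        if rng.random() < mut / (mut + total_merge):
            return minimal, sizes[0]  # first shared mutation: this is O_n
        k = rng.choices(range(2, b + 1), weights=merger_rates)[0]
        merged = set(rng.sample(range(b), k))
        new_block = sum(sizes[j] for j in merged)
        rest = [sizes[j] for j in range(b) if j not in merged]
        if 0 in merged:
            sizes = [new_block] + rest
            if minimal is None:
                minimal = new_block
        else:
            sizes = rest + [new_block]  # rest[0] is still the tracked block
    return minimal, n  # assumed convention: no shared mutation below the root


if __name__ == "__main__":
    rng = random.Random(0)
    draws = [simulate_clade_sizes(20, 1.0, beta_rate(1.5), rng) for _ in range(2000)]
    print("mean M_n:", sum(m for m, _ in draws) / len(draws))
    print("mean O_n:", sum(o for _, o in draws) / len(draws))
```

In this sketch O_n can never fall below M_n, because the tracked block only grows between its first merger and the first shared mutation; this is the upper-bound relation described above.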

Load-bearing premise

The underlying process must be exactly a standard Lambda-n-coalescent with mutations arriving at constant rate, and the first shared mutation time must be almost surely finite and positive.

What would settle it

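Collect large genetic samples, compute the empirical distribution of minimal observable clade sizes, and check whether it is consistent with the predicted asymptotics for some plausible Lambda; systematic deviation for every candidate Lambda would indicate that the Lambda-n-coalescent model with constant-rate mutation does not describe the data. The upper bound itself cannot be tested this way, because the minimal clade size is not observable.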

Figures

Figures reproduced from arXiv: 1906.11709 by Arno Siri-Jégousse, Fabian Freund.

Figure 1: Genealogical tree and its minimal observable clade sizes $O_6(i)$ and minimal clade sizes $M_6(i)$ for $i \in [6]$. x denotes a mutation.

… moments (4) $E(S^k) = 1 - \sum_{r=2}^{k+1} a_{k+1,r} \frac{\theta/2}{\lambda_r + \theta/2}$, where $\lambda_r$ is the total rate of the $\Lambda$-coalescent in a state with $r$ blocks and $a_{k+1,r}$ is a rational function of $\lambda_2, \ldots, \lambda_{k+1}$, defined as in [18, Prop. 29]. In particular, $E(S) = \frac{\Lambda([0,1])}{\Lambda([0,1]) + \theta/2}$, $E(S^2) = 1 - 3$ …
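
The total rates λ_r in the excerpt above can be computed explicitly for concrete choices of Λ. The sketch below is an illustration under stated assumptions, not code from the paper: it takes Λ = Beta(2 - α, α), for which the rate of a given k-merger among r blocks is B(k - α, r - k + α) / B(2 - α, α), sums these rates into λ_r, and evaluates the first-moment expression E(S) = Λ([0,1]) / (Λ([0,1]) + θ/2). The function names and the choice of the Beta family are hypothetical; the coefficients a_{k+1,r} needed for higher moments are defined in [18, Prop. 29] and are not reproduced here.

```python
import math


def log_beta(x, y):
    """Logarithm of the Beta function B(x, y) for x, y > 0."""
    return math.lgamma(x) + math.lgamma(y) - math.lgamma(x + y)


def beta_total_rate(r, alpha):
    """Total merger rate lambda_r with r blocks for Lambda = Beta(2 - alpha, alpha).

    Sums C(r, k) * lambda_{r,k} over merger sizes k = 2, ..., r, where
    lambda_{r,k} = B(k - alpha, r - k + alpha) / B(2 - alpha, alpha).
    Assumes 0 < alpha < 2.
    """
    norm = log_beta(2 - alpha, alpha)
    return sum(
        math.comb(r, k) * math.exp(log_beta(k - alpha, r - k + alpha) - norm)
        for k in range(2, r + 1)
    )


def first_moment(theta, total_mass=1.0):
    """E(S) = Lambda([0,1]) / (Lambda([0,1]) + theta / 2), as quoted in the excerpt."""
    return total_mass / (total_mass + theta / 2)


if __name__ == "__main__":
    for r in range(2, 7):
        print(r, beta_total_rate(r, 1.5))
    print("E(S) for theta = 2:", first_moment(2.0))
```

Two sanity checks follow from the definitions: λ_2 equals the total mass Λ([0,1]), which is 1 for a probability measure, and for α = 1 (the Bolthausen-Sznitman coalescent) the sum reduces to λ_r = r - 1.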
Original abstract

For $\Lambda$-$n$-coalescents with mutation, we analyse the size $O_n$ of the partition block of $i\in\{1,\ldots,n\}$ at the time where the first mutation appears on the tree that affects $i$ and is shared with any other $j\in\{1,\ldots,n\}$. We provide asymptotics of $O_n$ for $n\to\infty$ and a recursion for all moments of $O_n$ for finite $n$. This variable gives an upper bound for the minimal clade size [2], which is not observable in real data. In applications to genetics, it has been shown to be useful to lower classification errors in genealogical model selection [10].

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The manuscript analyzes the random variable O_n, the size of the partition block containing a fixed lineage i at the first time a mutation affects i and is shared with another lineage, in the setting of Lambda-n-coalescents with mutation. It derives the asymptotic behavior of O_n as n tends to infinity and supplies a recursion allowing computation of all moments of O_n for any finite n. O_n is presented as an upper bound on the minimal clade size, with cited utility for reducing classification errors in genealogical model selection.

Significance. If the derivations hold under appropriate conditions on Lambda, the combination of limiting results and exact moment recursions supplies both theoretical insight and a practical computational device for observable quantities in exchangeable coalescents. This directly supports applications in population genetics where only certain mutation events are detectable.

major comments (1)
  1. [Abstract and model setup] The definition of O_n presupposes that the first shared mutation epoch exists and is finite almost surely, yet no regularity conditions on the finite measure Lambda (such as Lambda({0})<1 or positive total mass ensuring the rate function psi(x) is strictly positive) are stated. This assumption is load-bearing for both the asymptotic statements and the moment recursion, since its failure would render O_n undefined for general Lambda.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading and for identifying this gap in the model assumptions. We agree that explicit regularity conditions on Λ are required to ensure O_n is well-defined a.s. and will revise the manuscript accordingly.

Point-by-point responses
  1. Referee: Abstract and model setup: the definition of O_n presupposes that the first shared mutation epoch exists and is finite almost surely, yet no regularity conditions on the finite measure Lambda (such as Lambda({0})<1 or positive total mass ensuring the rate function psi(x) is strictly positive) are stated. This assumption is load-bearing for both the asymptotic statements and the moment recursion, since its failure would render O_n undefined for general Lambda.

    Authors: We agree with the referee that the current manuscript does not explicitly state the necessary conditions on Λ. In the revised version we will add the standing assumption that Λ is a finite measure on [0,1] satisfying Λ({0}) < 1 (or, more generally, that the rate function ψ satisfies ψ(1) > 0) together with a positive mutation rate θ > 0. These conditions guarantee that the first shared mutation epoch is finite a.s. and render both the asymptotic results and the moment recursions rigorous for the class of measures under consideration.

    Revision: yes

Circularity Check

0 steps flagged

Direct stochastic analysis of defined process O_n with no circular reduction

Full rationale

The paper defines O_n explicitly from the Lambda-n-coalescent with mutation and derives its asymptotics (n to infinity) and moment recursion for finite n via standard analysis of the underlying exchangeable partition process. No quoted step equates a claimed prediction or result to a fitted input, self-citation, or ansatz by construction; the recursion and limits follow from the process rates and mutation mechanism without reducing to the target quantities themselves. Citations [2] and [10] supply context for the upper-bound interpretation and applications but are not invoked to justify the derivations of the asymptotics or recursion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated in the provided text.

pith-pipeline@v0.9.0 · 5645 in / 1114 out tokens · 17290 ms · 2026-05-25T14:41:53.874171+00:00 · methodology


Reference graph

Works this paper leans on

25 extracted references · 25 canonical work pages

  1. [1] Richard Arratia, Andrew D. Barbour, and Simon Tavaré. Logarithmic combinatorial structures: A probabilistic approach. European Mathematical Society (EMS), Zürich, 2003.

  2. [2] Michael G.B. Blum and Olivier François. Minimal clade size and external branch length under the neutral coalescent. Adv. in Appl. Probab., 37(3):647–662, 2005.

  3. [3] Amke Caliebe, Ralph Neininger, Michael Krawczak, and Uwe Rösler. On the length distribution of external branches in coalescence trees: Genetic diversity within species. Theor. Pop. Biol., 72(2):245–252, 2007.

  4. [4] Michael M. Desai, Aleksandra M. Walczak, and Daniel S. Fisher. Genetic diversity and the structure of genealogies in rapidly adapting populations. Genetics, 193(2):565–585, 2013.

  5. [5] Jean-Stéphane Dhersin, Fabian Freund, Arno Siri-Jégousse, and Linglong Yuan. On the length of an external branch in the Beta-coalescent. Stochastic Process. Appl., 123(5):1691–1715, 2013.

  6. [6] Bjarki Eldon and John Wakeley. Coalescent processes when the distribution of offspring number among individuals is highly skewed. Genetics, 172(4):2621–2633, 2006.

  7. [7] Fabian Freund. Cannings models, population size changes and multiple-merger coalescents. Preprint on arXiv, 2019.

  8. [8] Fabian Freund and Martin Möhle. On the size of the block of 1 for Ξ-coalescents with dust. Modern Stoch. Theory Appl., 4(4):407–425, 2017.

  9. [9] Fabian Freund and Arno Siri-Jégousse. Minimal clade size in the Bolthausen-Sznitman coalescent. J. Appl. Probab., 51(3):657–668, 2014.

  10. [10] Fabian Freund and Arno Siri-Jégousse. Distinguishing coalescent models - which statistics matter most? Preprint on bioRxiv, 2019.

  11. [11] Alexander Gnedin, Alexander Iksanov, and Alexander Marynych. Λ-coalescents: a survey. J. Appl. Probab., 51A(Celebrating 50 Years of The Applied Probability Trust):23–40, 2014.

  12. [12] Robert C. Griffiths and Simon Tavaré. Sampling theory for neutral alleles in a varying environment. Philos. Trans. R. Soc. Lond. B Biol. Sci., 344(1310):403–410, 1994.

  13. [13] John F.C. Kingman. The coalescent. Stochastic Process. Appl., 13(3):235–248, 1982.

  14. [14] John F.C. Kingman. Random discrete distributions. J. Royal Stat. Soc. B, 37(1):1–15, 1975.

  15. [15] John F.C. Kingman. Poisson processes. Wiley Online Library, 1993.

  16. [16] Sebastian Matuszewski, Marcel E. Hildebrandt, Guillaume Achaz, and Jeffrey D. Jensen. Coalescent processes with skewed offspring distributions and non-equilibrium demography. Genetics, 208(1):323–338, 2018.

  17. [17] Richard A. Neher and Oskar Hallatschek. Genealogies of rapidly adapting populations. Proc. Natl. Acad. Sci. USA, 110(2):437–442, 2013.

  18. [18] Jim Pitman. Coalescents with multiple collisions. Ann. Probab., 27(4):1870–1902, 1999.

  19. [19] Erik Rauch and Yaneer Bar-Yam. Theory predicts the uneven distribution of genetic diversity within species. Nature, 431:449–452, 2004.

  20. [20] Serik Sagitov. The general coalescent with asynchronous mergers of ancestral lines. J. Appl. Probab., 36(4):1116–1125, 1999.

  21. [21] Jason Schweinsberg. Coalescent processes obtained from supercritical Galton-Watson processes. Stochastic Process. Appl., 106(1):107–139, 2003.

  22. [22] Jason Schweinsberg. Rigorous results for a population model with selection II: genealogy of the population. Electron. J. Probab., 22, 2017.

  23. [23] Arno Siri-Jégousse and Linglong Yuan. Asymptotics of the minimal clade size and related functionals of certain beta-coalescents. Acta Appl. Math., 142:127–148, 2016.

  24. [24] Arno Siri-Jégousse and Linglong Yuan. A note on the small-time behaviour of the largest block size of beta n-coalescents. In XII Symposium of Probability and Stochastic Processes, volume 73 of Progr. Probab., pages 219–234. Birkhäuser/Springer, Cham, 2018.

  25. [25] Jeffrey P. Spence, John A. Kamm, and Yun S. Song. The site frequency spectrum for general coalescents. Genetics, 202(4):1549–1561, 2016.

Crop Plant Biodiversity and Breeding Informatics Group (350b), Institute of Plant Breeding, Seed Science and Population Genetics, University of Hohenheim, Fruwirthstrasse 21, 70599 Stuttgart, Germany. E-mail address: fab...