pith. machine review for the scientific record.

arxiv: 2604.09660 · v1 · submitted 2026-03-30 · 📊 stat.AP

Overdispersed and Markovian Children

Nils Lid Hjort

Pith reviewed 2026-05-14 01:12 UTC · model grok-4.3

classification 📊 stat.AP
keywords birth gender ratios · overdispersion · Markov dependence · sibling sequences · binomial model · statistical power · sex ratio variation

The pith

Birth gender sequences show small but detectable deviations from independent coin tosses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that while births are close to 50-50, careful analysis of large datasets reveals modest imbalances in gender probabilities, variation from one family to the next, slight dependence between consecutive births, and more all-boy or all-girl families than a pure binomial model predicts. A sympathetic reader would care because these findings point to subtle biological or social influences shaping family composition that the independent-toss model misses. The author also uses the example to show how sample size affects the ability to detect small effects through p-values and statistical power. The overall message is that the simple random model is a good first approximation but not the full story.

Core claim

The simple binomial model of independent 50-50 births is not entirely correct: the coins of fate are slightly imbalanced, they vary from family to family, there is slight dependence in the gender sequence of a family's children, and there are slightly more only-girls and only-boys families than binomial conditions predict.
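
The "slight dependence" part of the claim can be made concrete with a two-state Markov chain. The sketch below is a minimal illustration and not the paper's own model: the parametrisation (a stationary girl probability `p0` and a lag-one correlation `rho`) and all numerical values are assumptions chosen for illustration.

```python
def transition_probs(p0: float, rho: float) -> tuple[float, float]:
    """Return (P(girl | previous girl), P(girl | previous boy)).

    Hypothetical parametrisation: p0 is the stationary probability of a girl
    and rho is the lag-one correlation between consecutive births.
    rho = 0 gives independent births.
    """
    p_gg = p0 + rho * (1.0 - p0)
    p_bg = p0 * (1.0 - rho)
    return p_gg, p_bg


def prob_all_same(m: int, p0: float, rho: float) -> tuple[float, float]:
    """Return (P(all m children are girls), P(all m children are boys))."""
    p_gg, p_bg = transition_probs(p0, rho)
    p_bb = 1.0 - p_bg  # P(boy | previous boy)
    all_girls = p0 * p_gg ** (m - 1)
    all_boys = (1.0 - p0) * p_bb ** (m - 1)
    return all_girls, all_boys


if __name__ == "__main__":
    m, p0 = 8, 0.485  # illustrative values, not taken from the paper
    for rho in (0.0, 0.01, 0.03):
        girls, boys = prob_all_same(m, p0, rho)
        print(f"rho={rho:.2f}  P(all girls)={girls:.5f}  P(all boys)={boys:.5f}")
```

With `rho = 0` the chain reduces to independent tosses. Any positive `rho` raises the probability of both single-gender sequences, which is the direction of the effect the core claim describes.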

What carries the argument

Extensions of the binomial model that add overdispersion across families and Markovian dependence within sibling sequences to capture the observed patterns.
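
One common way to add overdispersion across families is the beta-binomial model, in which each family draws its own girl probability from a Beta distribution. The sketch below, under that assumption, compares beta-binomial and binomial probabilities for all-girl and all-boy families. The value `sigma = 0.0538` is the point estimate quoted in the Figure 5 caption; the mean `p = 0.4844` and the helper names are illustrative assumptions, not taken from the paper.

```python
import math


def beta_params(p: float, sigma: float) -> tuple[float, float]:
    """Beta(a, b) parameters with mean p and standard deviation sigma."""
    total = p * (1.0 - p) / sigma**2 - 1.0  # a + b
    if total <= 0:
        raise ValueError("sigma is too large for a Beta distribution with this mean")
    return p * total, (1.0 - p) * total


def log_beta(a: float, b: float) -> float:
    """Logarithm of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def betabinom_pmf(y: int, m: int, a: float, b: float) -> float:
    """P(Y = y) when Y | p ~ Binomial(m, p) and p ~ Beta(a, b)."""
    return math.comb(m, y) * math.exp(log_beta(a + y, b + m - y) - log_beta(a, b))


def overrepresentation(m: int, p: float, sigma: float) -> tuple[float, float]:
    """Beta-binomial over binomial probability for all-girl and all-boy families."""
    a, b = beta_params(p, sigma)
    all_girls = betabinom_pmf(m, m, a, b) / p**m
    all_boys = betabinom_pmf(0, m, a, b) / (1.0 - p) ** m
    return all_girls, all_boys


if __name__ == "__main__":
    # p is an assumed illustrative mean; sigma is the estimate quoted for Figure 5.
    girls_ratio, boys_ratio = overrepresentation(m=8, p=0.4844, sigma=0.0538)
    print(f"all-girls ratio: {girls_ratio:.3f}, all-boys ratio: {boys_ratio:.3f}")
```

Both ratios exceed one for any positive `sigma`, so family-to-family variation alone is enough to produce an excess of single-gender families relative to the binomial baseline.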

If this is right

  • Large datasets are required to detect these small effects reliably.
  • For any fixed nonzero deviation, however tiny, p-values tend to shrink as the sample size grows.
  • Excess only-boy and only-girl families arise directly from the overdispersed and dependent structure.
  • Statistical power grows with sample size, allowing detection of modest biological signals.
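
The link between sample size, p-values and power in the list above can be sketched with a normal-approximation test of a single proportion. This is a generic textbook calculation, not the paper's own test statistic, and the true girl probability of 0.49 is an assumed value for illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def power_two_sided(p_true: float, n: int, p_null: float = 0.5, alpha: float = 0.05) -> float:
    """Approximate power of the two-sided z-test of H0: p = p_null, based on n births."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    se_null = (p_null * (1.0 - p_null) / n) ** 0.5
    se_true = (p_true * (1.0 - p_true) / n) ** 0.5
    # Reject when the observed proportion falls outside p_null +/- z_crit * se_null.
    upper = (p_null + z_crit * se_null - p_true) / se_true
    lower = (p_null - z_crit * se_null - p_true) / se_true
    return STD_NORMAL.cdf(lower) + 1.0 - STD_NORMAL.cdf(upper)


def p_value_at_expectation(p_true: float, n: int, p_null: float = 0.5) -> float:
    """Two-sided p-value if the observed proportion equals p_true exactly."""
    se_null = (p_null * (1.0 - p_null) / n) ** 0.5
    z = abs(p_true - p_null) / se_null
    return 2.0 * (1.0 - STD_NORMAL.cdf(z))


if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):
        print(n, round(power_two_sided(0.49, n), 4), p_value_at_expectation(0.49, n))
```

For a deviation of one percentage point, the power is barely above the 5% level at a hundred births, roughly one half at ten thousand, and essentially one at a million, while the typical p-value falls accordingly.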

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • These patterns could connect to evolutionary questions about sex-ratio adjustment in populations.
  • Similar modeling approaches might apply to other sequential traits such as birth spacing or health outcomes.
  • Long-term population registers could test whether the biases change across generations or cultures.
  • Forecasting tools for family gender composition could incorporate the adjusted probabilities.

Load-bearing premise

The observed deviations reflect genuine biological or social processes rather than artifacts from data collection, sampling bias, or model misspecification in the analysis.

What would settle it

Reanalysis of the same family datasets with different modeling choices or data subsets that eliminates the detected overdispersion and sequence dependence would falsify the central claim.

Figures

Figures reproduced from arXiv: 2604.09660 by Nils Lid Hjort.

Figure 1
y = 5 girls, m − y = 0 boys.
Figure 2
The power function, the probability of claiming that the true …
Figure 3
One of the tables in Geißler (1889), concerning families with eight children.
Figure 4
The 0.95 and 0.99 quantiles (black and red) for the null distribution of …
Figure 5
The confidence curve cc(σ0) for the overdispersion parameter σ0; the point estimate is 0.0538 and the 95% interval is [0.0490, 0.0583]. Pure binomial, the same p, means σ0 = 0.
Figure 6
Simulated log-likelihood ℓ*(k), with values obtained at a grid of k values, using 10^5 simulations for each, followed by a 4th order polynomial approximation to compute the maximiser.
Figure 7
The probability P(girl), black line, with 95% confidence intervals, appears to go a bit down, as a function of family size, but the null hypothesis of P(girl) staying the same is not significantly rejected.
Figure 8
Estimated overdispersion standard deviation …
Figure 9
Power of the Wn test, growing with the number n of families. I needed about 4000 volunteer families of size 8 in order to be satisfied with my Wn testing power.
Figure 10
Beta densities for p for three families: for the overall population (black, in the middle); for Kristin Lavransdatter (red, left); for a family with five sons (green, right).
Figure 11
y = 0 girls, m − y = 5 boys.
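
The overdispersion estimate quoted for Figure 5 can be roughly cross-checked with a method-of-moments calculation on the nine counts for eight-child families that the paper's introduction quotes. This is a crude check under a beta-binomial assumption, not one of the paper's own estimation methods, and the reading of the counts as "families with y girls, y = 0, ..., 8" is an assumption.

```python
# Families with m = 8 children, as quoted in the paper's introduction.
# Assumption: entry y is the number of families with y girls, y = 0, ..., 8.
COUNTS = [264, 1655, 4948, 8498, 10263, 7603, 3951, 1152, 161]
M = 8


def moment_estimates(counts: list[int], m: int) -> tuple[float, float, float]:
    """Return (p_hat, dispersion_ratio, sigma_hat) by the method of moments."""
    n = sum(counts)
    mean = sum(y * c for y, c in enumerate(counts)) / n
    var = sum((y - mean) ** 2 * c for y, c in enumerate(counts)) / n
    p_hat = mean / m
    binom_var = m * p_hat * (1.0 - p_hat)
    ratio = var / binom_var  # equals 1 under a pure binomial model
    # Beta-binomial: Var(Y) = m p (1 - p) [1 + (m - 1) sigma^2 / (p (1 - p))].
    sigma_sq = max(ratio - 1.0, 0.0) * p_hat * (1.0 - p_hat) / (m - 1)
    return p_hat, ratio, sigma_sq ** 0.5


if __name__ == "__main__":
    p_hat, ratio, sigma_hat = moment_estimates(COUNTS, M)
    print(f"p_hat={p_hat:.4f}  variance ratio={ratio:.3f}  sigma_hat={sigma_hat:.4f}")
```

The empirical variance exceeds the binomial variance by several percent, and the implied standard deviation of the family-level probability lands near the point estimate of 0.0538 given in the Figure 5 caption.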
Original abstract

Take a look around you -- in your family, your school or workplace, in the streets, and you see boys & girls in about equal proportion, and without any easily visible gender patterns in case of siblings. So, to the famous first order of statistical approximation, we're all the results of hierarchical cascades of independent coin tosses through history, with each little fate determined by a 0.50-0.50 coin. This is not entirely correct, as one discovers with careful analysis and enough data: the coins of fate are (a little) imbalanced; they vary (a little) from family to family; there is a (slight) dependence in your children's gender sequence; and there are (slightly) more only-girls and only-boys families than predicted from binomial conditions. In this article I use the opportunity to talk also about how sample sizes influence p-values and statistical detection power.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that sibling gender sequences deviate mildly from independent fair-coin (binomial) tosses: the per-child probability is slightly imbalanced, varies across families (overdispersion), exhibits slight Markov dependence, and produces excess all-boy and all-girl families relative to the binomial null. It also discusses how sample size affects p-values and the power to detect these effects.

Significance. If substantiated with transparent data, models, and robustness checks against selection artifacts, the work would provide a concrete empirical illustration of overdispersion and weak dependence in a large-scale demographic process, useful for teaching statistical power, model misspecification, and the limits of the binomial approximation in applied statistics.

major comments (2)
  1. [Abstract] The central claims (imbalance, family-to-family variation, Markov dependence, excess single-gender families) are stated without any data source, sample size, likelihood function, or numerical results, so the evidence supporting them cannot be evaluated from the provided text.
  2. [Discussion] The discussion of sample-size effects on p-values does not examine whether the chosen moment conditions or likelihood remain valid once plausible selection mechanisms (truncation at small family sizes, differential reporting of mixed vs. single-gender sibships, or gender-preference stopping rules) are introduced; such mechanisms can generate the reported patterns under pure independence.
minor comments (1)
  1. [Abstract] Notation for the overdispersion and transition parameters should be defined explicitly before any numerical claims are made.
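
The second major comment rests on the fact that a stopping rule can distort the gender composition of families of a given final size even when births are independent with a common probability. The sketch below enumerates a deliberately simple, hypothetical rule to show the mechanism; it says nothing about which rules, if any, operated in the paper's data.

```python
from collections import defaultdict
from itertools import product

P_GIRL = 0.5  # independent, identical births: the pure coin-toss model


def realised_family(potential: tuple[str, ...]) -> tuple[str, ...]:
    """Hypothetical rule: stop at two children unless both have the same
    gender, in which case have exactly one more."""
    if potential[0] != potential[1]:
        return potential[:2]
    return potential[:3]


def same_gender_share(size: int) -> float:
    """Share of all-girl or all-boy families among realised families of one size."""
    prob = defaultdict(float)
    for potential in product("GB", repeat=3):
        weight = 1.0
        for child in potential:
            weight *= P_GIRL if child == "G" else 1.0 - P_GIRL
        prob[realised_family(potential)] += weight
    total = sum(w for fam, w in prob.items() if len(fam) == size)
    same = sum(w for fam, w in prob.items() if len(fam) == size and len(set(fam)) == 1)
    return same / total


if __name__ == "__main__":
    print("size 2:", same_gender_share(2), "binomial:", 0.5)
    print("size 3:", same_gender_share(3), "binomial:", 0.25)
```

Under this toy rule, half of the three-child families are single-gender, against a binomial value of one quarter, and no two-child family is. The overall sex ratio is unchanged; only the composition conditional on final family size is distorted, which is why the comment asks for an explicit robustness check.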

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive and detailed comments, which have helped us identify areas where the manuscript can be strengthened. We address each major comment point by point below.

  1. Referee: [Abstract] The central claims (imbalance, family-to-family variation, Markov dependence, excess single-gender families) are stated without any data source, sample size, likelihood function, or numerical results, so the evidence supporting them cannot be evaluated from the provided text.

    Authors: We agree that the abstract would benefit from greater specificity to allow readers to assess the empirical basis of the claims. In the revised manuscript we will expand the abstract to reference the underlying demographic dataset, the overall sample size of families analyzed, the likelihood framework (including the overdispersed binomial component and first-order Markov transition model), and the principal quantitative results that support the reported mild imbalance, family-level heterogeneity, sequential dependence, and excess single-gender families. revision: yes

  2. Referee: [Discussion] The discussion of sample-size effects on p-values does not examine whether the chosen moment conditions or likelihood remain valid once plausible selection mechanisms (truncation at small family sizes, differential reporting of mixed vs. single-gender sibships, or gender-preference stopping rules) are introduced; such mechanisms can generate the reported patterns under pure independence.

    Authors: This is a substantive concern. Our current discussion of sample-size effects on detection power assumes the data-generating process follows the modeled overdispersion and Markov structure without additional selection. We will revise the discussion to incorporate a dedicated robustness subsection that analytically and via simulation evaluates the impact of truncation at small family sizes, differential reporting of sibship compositions, and gender-preference stopping rules on the moment conditions and likelihood. The revision will clarify under which conditions these mechanisms can produce the observed patterns under pure independence and will show that the primary conclusions remain supported after accounting for plausible levels of such selection in the data source used. revision: yes

Circularity Check

0 steps flagged

No significant circularity; analysis rests on external data

The paper is an applied statistical analysis of family gender sequence data, claiming mild overdispersion, Markov dependence, and excess single-gender families relative to binomial baselines. No derivation chain, equations, or model-fitting steps are exhibited that reduce any 'prediction' to a fitted parameter by construction. Claims are presented as empirical findings from data rather than self-definitional or self-citation-dependent results. The discussion of sample-size effects on p-values is methodological commentary and does not create circularity in the core statistical conclusions. This is the expected honest outcome for a data-driven applied paper without internal tautological reductions.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are mentioned in the abstract.

pith-pipeline@v0.9.0 · 5440 in / 923 out tokens · 55047 ms · 2026-05-14T01:12:11.989432+00:00

Reference graph

Works this paper leans on

14 extracted references · 14 canonical work pages

  1. Claeskens, G. and Hjort, N.L. (2008). Model Selection and Model Averaging. Cambridge University Press.
  2. Edwards, A.W.F. (1958). An analysis of Geissler's data on the human sex ratio. Annals of Human Genetics, vol. 23, 6–15.
  3. Edwards, A.W.F. (2005). Sexes and statistics. Significance, vol. 2, issue 4, 185–186.
  4. Edwards, A.W.F. (2025). The Latin Square: Essays in defence of R.A. Fisher. Cam Rivers.
  5. Fisher, R.A. (1930). The Genetical Theory of Natural Selection. Clarendon Press, London.
  6. Hjort, N.L. (2016). Recruitment Dynamics and Stock Variability: The Johan Hjort Symposium, some personal reflections. FocuStat Blog Post.
  7. Hjort, N.L. (2019). Your Mother is Alive with Probability One Half. FocuStat Blog Post.
  8. Hjort, N.L. and Jullum, M. (2018). Categorical model selection. Manuscript.
  9. Hjort, N.L. and Stoltenberg, E.Aa. (2026). Statistical Inference: 600 Exercises and 100 Stories. Cambridge University Press.
  10. Klotz, J. (1972). Markov chain clustering of births by sex. Proceedings of the Sixth Berkeley Symposium on Mathematical Statistics, vol. 4, 173–185.
  11. Klotz, J. (1973). Statistical inference in Bernoulli trials with dependence. Annals of Statistics, vol. 1, 373–379.
  12. Lindsey, J.K. and Altham, P.M.E. (1998). Analysis of the human sex ratio by using overdispersion models. Applied Statistics, vol. 47, 149–157.
  13. Nichols, J.B. (1905). The sex-composition of human families. American Anthropologist (New Series), vol. 7, 24–36.
  14. Schweder, T. and Hjort, N.L. (2016). Confidence, Likelihood, Probability: Statistical Inference With Confidence Distributions. Cambridge University Press, Cambridge.