
arxiv: 2606.04551 · v1 · pith:ZRYZCNHSnew · submitted 2026-06-03 · 🧬 q-bio.PE · math.PR

Quasi-birth-and-death processes evolving within trees: Applications to comparative phylogenetics

Pith reviewed 2026-06-28 03:08 UTC · model grok-4.3

classification 🧬 q-bio.PE math.PR
keywords quasi-birth-and-death processes · phylogenetic trees · trait evolution · likelihood computation · recursive algorithm · comparative phylogenetics · discretized traits · speciation model

The pith

An efficient recursive algorithm computes the likelihood of an observed phylogenetic tree under a quasi-birth-and-death model of discretized trait evolution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper extends prior work on quasi-birth-and-death processes to trees by allowing the process to duplicate at speciation times. A continuous trait is discretized to serve as the level variable while the phase tracks environmental dynamics. The central contribution is a recursive algorithm that calculates the likelihood of the full tree together with any observed tip states. The method is illustrated on synthetic examples that exhibit different evolutionary patterns and then applied to two mammal phylogenies to examine range area and body size evolution under varying parameter settings.

Core claim

We develop an efficient recursive algorithm for computing the likelihood of an observed tree under this model where a QBD duplicates itself at fixed times within the tree, with the level obtained by discretizing a continuous trait and the phase modeling underlying environmental dynamics.

What carries the argument

Recursive algorithm that propagates likelihoods backward through the tree while accounting for QBD duplication at speciation nodes and transitions in levels and phases.
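
As a rough illustration of what such a backward recursion looks like, the sketch below applies the standard pruning recursion of Felsenstein [8] to a small finite chain whose states are (level, phase) pairs. It is not the paper's Algorithm 2: the truncation to a finite number of levels, the helper names (`mat_exp`, `toy_generator`, `tree_likelihood`), the rates, the branch lengths, and the uniform root distribution are all assumptions made for this example. The tree shape follows the five-node example of Figure 2, with the level observed at each tip and the phase unobserved.

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[float]]


def mat_mul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def mat_exp(q: Matrix, t: float, terms: int = 20, squarings: int = 10) -> Matrix:
    """exp(q * t) by scaling and squaring with a truncated Taylor series."""
    n = len(q)
    scaled = [[x * t / 2 ** squarings for x in row] for row in q]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def toy_generator(n_levels: int, n_phases: int, up: List[float],
                  down: List[float], switch: float) -> Matrix:
    """Hypothetical generator: the level moves by +/-1 at phase-dependent rates,
    and the phase cycles to the next phase at rate `switch`."""
    n = n_levels * n_phases
    q = [[0.0] * n for _ in range(n)]
    for lev in range(n_levels):
        for ph in range(n_phases):
            i = lev * n_phases + ph
            if lev + 1 < n_levels:
                q[i][i + n_phases] = up[ph]
            if lev > 0:
                q[i][i - n_phases] = down[ph]
            if n_phases > 1:
                q[i][lev * n_phases + (ph + 1) % n_phases] += switch
            q[i][i] = -sum(q[i])  # diagonal makes each row sum to zero
    return q


def tree_likelihood(q: Matrix, n_phases: int, children: Dict[int, Tuple[int, int]],
                    branch: Dict[int, float], tip_level: Dict[int, Optional[int]],
                    root: int, root_dist: List[float]) -> float:
    """Likelihood of the tip levels, computed from the tips back to the root."""
    n = len(q)

    def partial(node: int) -> List[float]:
        if node in tip_level:
            lev = tip_level[node]
            # Phase is unobserved; a tip with level None is fully unobserved.
            return [1.0 if lev is None or s // n_phases == lev else 0.0
                    for s in range(n)]
        out = [1.0] * n
        for child in children[node]:
            # Both copies start from the parent's state at the duplication time.
            p = mat_exp(q, branch[child])
            below = partial(child)
            for s in range(n):
                out[s] *= sum(p[s][r] * below[r] for r in range(n))
        return out

    return sum(w * x for w, x in zip(root_dist, partial(root)))


q = toy_generator(3, 2, up=[1.0, 0.2], down=[0.3, 0.8], switch=0.5)
children = {5: (4, 3), 4: (1, 2)}           # node 5 is the root
branch = {4: 0.6, 3: 0.7, 1: 0.4, 2: 0.4}   # length of the edge above each node
uniform = [1.0 / len(q)] * len(q)
print(tree_likelihood(q, 2, children, branch, {1: 0, 2: 1, 3: 2}, 5, uniform))
```

A useful sanity check on any recursion of this kind is that marking every tip as unobserved should give a likelihood of 1, because each row of a transition matrix sums to 1.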

If this is right

  • The algorithm yields the likelihood of any observed tree and its tip levels under the duplicating QBD model.
  • Different choices of level discretization, phase rates, and duplication parameters produce a range of possible trait-evolution behaviors on the same tree.
  • The framework supports analysis of partially observed states at the tips of empirical phylogenies.
  • Application to the mammal data shows how likelihood changes with parameter values for range area and body size.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same recursive structure could be used to compute likelihoods on non-phylogenetic trees provided duplication times are known.
  • Sensitivity of the likelihood surface to the number of discretization levels could be checked by repeating the mammal analyses at finer and coarser partitions of the trait range.
  • The phase variable offers a route to incorporate hidden environmental covariates without increasing the state space of the observed trait.

Load-bearing premise

Discretizing a continuous trait to obtain the QBD level variable preserves the essential dynamics of trait evolution.
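
To make the premise concrete, here is a minimal sketch of one way a continuous trait could be mapped to integer levels. The equal-width binning, the optional log transform, and the function name `discretize` are assumptions chosen for illustration; the summary above does not say how the paper partitions the trait range. The log option is included only because the body mass and home range histograms (Figures 6 and 8) are described as positively skewed.

```python
import math
from typing import List


def discretize(values: List[float], n_levels: int, log_scale: bool = True) -> List[int]:
    """Map each trait value to a level index in 0..n_levels-1 using equal-width bins."""
    xs = [math.log(v) for v in values] if log_scale else list(values)
    lo, hi = min(xs), max(xs)
    width = (hi - lo) / n_levels
    if width == 0:
        return [0] * len(xs)
    # The maximum falls exactly on the upper edge, so clamp it into the top bin.
    return [min(int((x - lo) / width), n_levels - 1) for x in xs]


masses = [1.0, 2.0, 50.0, 1000.0]  # made-up values, not the mammal data
print(discretize(masses, 3))                   # bins on the log scale
print(discretize(masses, 3, log_scale=False))  # bins on the raw scale
```

On skewed data the raw-scale bins put most values in the lowest level, which is one reason the choice of partition can matter for the resulting likelihood.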

What would settle it

Running the recursive likelihood algorithm on the same mammal phylogeny once with the discretized levels and once with a fully continuous-state model, then checking whether the two produce materially different maximum-likelihood parameter estimates or tree likelihood values.

Figures

Figures reproduced from arXiv: 2606.04551 by Barbara R. Holland, Habtu Kiros Nigus, Malgorzata M. O'Reilly.

Figure 1. Species tree T ∗ (figure adapted from Soewongsono et al. [25] to the model considered here). Branching event at time t1 results in two subtrees, T ∗(0,1)(t1, t) and T ∗(1,1)(t1, t). Next, another branching event occurs at time t2, which results in two subtrees of T ∗(0,1)(t1, t), being T ∗(0,2)(t2, t) and T ∗(1,2)(t1, t), respectively. The corresponding part of the T ∗(1,1)(t1, t) that starts at time t2 is…

Figure 2. Species tree T ∗ (notation in Algorithm 2): The tree starts at time t(5) with parent node 5 who has (left and right) children nodes 4 = cL(5) and 3 = cR(5). Nodes 1, 2, and 3 are the tips of the tree, with 1 = cL(4) and 2 = cR(4) being the (left and right) children nodes of their parent node 4. Trait levels m(1), m(2) and m(3) are observed at tips 1, 2, and 3, respectively. Tip 3 is some fossil record. The…

Figure 3. Phylogenetic tree with nodes 1, . . . , 7, including tips 1, . . . , 4, internal nodes 5, 6 and parent node 7. Branch lengths are indicated along each edge. Observed trait values are indicated to the right of each tip.

Figure 4. Phylogenetic tree with nodes 1, . . . , 7, including tips 1, . . . , 4, internal nodes 5, 6 and parent node 7. Branch lengths are indicated along each edge. Observed trait values are indicated to the right of each tip.

Figure 5. A phylogenetic tree of 49 mammal species, with branch lengths shown on the edges in red, …

Figure 6. Histogram of body mass (kg) showing positive skew with most values concentrated at the lower ranges.

Figure 7. Phylogenetic tree of 49 mammal species showing home range area values (km …

Figure 8. Histogram of home range area (km²) showing positive skew with most values concentrated at the lower ranges.

Figure 9. From top left to the bottom right: Stationary distribution of the levels in the QBD3 model in …

Figure 10. From top left to the bottom right: Stationary distribution of the phases …

Figure 11. From top left to the bottom right: Stationary distribution of the states in the QBD3 model …

Figure 12. From top left to the bottom right: Stationary distribution of the traits in the QBD3 model …

Figure 13. From top left to the bottom right: The likelihood of observing tip …

Figure 14. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 15. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 16. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 17. From top left to the bottom right: The likelihood of observing tip …

Figure 18. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 19. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 20. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 21. From top left to the bottom right: The likelihood of observing tip …

Figure 22. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 23. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 24. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 25. From top left to the bottom right: The likelihood of observing tip …

Figure 26. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 27. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Figure 28. From top left to the bottom right: The likelihood of observing the phylogenetic tree that …

Original abstract

We consider a quasi-birth-and-process (QBD) that duplicates itself at some fixed times within a tree that contains information about duplication times and potentially partially observed states. We analyse a continuous trait by discretising it to obtain the QBD level variable. Then, the phase variable is used to model the dynamics of the underlying environment. Here, we extend the framework of Soewongsono et al. to enable a more general analysis. We develop an efficient recursive algorithm for computing the likelihood of an observed tree under this model and construct several numerical examples to illustrate its application potential. Through our synthetic data examples, we show a range of potential behaviours that could be modelled with this approach. Further, we apply the framework to two empirical examples from comparative phylogenetics (the evolution of range area and body size traits across a phylogeny of 49 mammals) to gain different insights into the evolution of these continuous traits. In this setting duplication of the QBD represents speciation and continuous trait evolution is modelled in a discretised state space. In our empirical examples, we explore the impact of different parameter choices on the corresponding likelihood of observing a given phylogenetic tree and the observed levels at its tips.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript extends quasi-birth-and-death (QBD) processes to evolve on phylogenetic trees, with speciation corresponding to duplication events. A continuous trait is discretized to define the QBD level variable while the phase variable captures environmental dynamics; an efficient recursive algorithm is developed to compute the likelihood of an observed tree (with possible partial observations at tips), and the framework is illustrated on synthetic data plus two empirical mammalian phylogenies (range area and body size).

Significance. If the discretization step is shown to be a controlled approximation, the recursive likelihood algorithm would supply a new, computationally tractable way to embed state-dependent environmental effects into comparative phylogenetic models. The synthetic examples demonstrate a range of qualitative behaviors, and the empirical applications explore sensitivity to parameter choices, but the absence of any convergence analysis or error bounds for the discretization prevents assessment of whether the reported likelihoods and inferences remain reliable as approximations to continuous-trait dynamics.

major comments (2)
  1. [Abstract] Abstract (model-construction paragraph): the central modeling step obtains the QBD level by discretizing a continuous trait, yet no convergence argument, truncation-error bound, or preservation of key moments (e.g., expected change or variance) under the chosen rate matrices is supplied. Without such justification the recursive likelihood cannot be guaranteed to approximate the intended continuous-trait process on the tree.
  2. [Abstract] Abstract and empirical-examples paragraph: the two mammalian applications report likelihood sensitivity to parameter choices within the discretized model, but supply no comparison against standard continuous-trait models (Brownian motion, OU) or any diagnostic that the discretization preserves the qualitative features of the original trait data.
minor comments (1)
  1. [Abstract] Abstract: the phrase "quasi-birth-and-process (QBD)" is missing the word "death".

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments on the discretization justification and empirical comparisons. We respond to each major comment below, indicating planned revisions where appropriate.

  1. Referee: [Abstract] Abstract (model-construction paragraph): the central modeling step obtains the QBD level by discretizing a continuous trait, yet no convergence argument, truncation-error bound, or preservation of key moments (e.g., expected change or variance) under the chosen rate matrices is supplied. Without such justification the recursive likelihood cannot be guaranteed to approximate the intended continuous-trait process on the tree.

    Authors: We acknowledge that the manuscript lacks a formal convergence analysis or explicit error bounds. In revision we will add a dedicated subsection describing how the rate matrices are constructed to match the infinitesimal mean and variance of the underlying continuous process, together with numerical convergence checks (likelihood stabilization as the number of levels grows). A complete theoretical proof of convergence to the continuous limit lies beyond the present scope and will be noted as future work with appropriate references. revision: partial

  2. Referee: [Abstract] Abstract and empirical-examples paragraph: the two mammalian applications report likelihood sensitivity to parameter choices within the discretized model, but supply no comparison against standard continuous-trait models (Brownian motion, OU) or any diagnostic that the discretization preserves the qualitative features of the original trait data.

    Authors: We agree that benchmarking would strengthen the empirical section. The revised manuscript will include likelihood comparisons under Brownian motion and Ornstein-Uhlenbeck models fitted to the same mammalian phylogenies and traits, plus moment-matching diagnostics (mean, variance, and autocorrelation) between discretized QBD simulations and the original continuous trait data. revision: yes
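
The moment-matching construction mentioned in the first response can be sketched as follows. This is the textbook finite-difference choice of birth and death rates on a grid of spacing `h`, offered as an assumption about what such a subsection might contain; it is not taken from the paper, and the names `matched_rates` and `infinitesimal_moments` as well as the symbols `mu` (drift) and `sigma2` (variance rate) are hypothetical.

```python
from typing import Tuple


def matched_rates(mu: float, sigma2: float, h: float) -> Tuple[float, float]:
    """Up and down rates for steps of size h that reproduce drift mu and variance rate sigma2."""
    up = sigma2 / (2 * h * h) + mu / (2 * h)
    down = sigma2 / (2 * h * h) - mu / (2 * h)
    if min(up, down) < 0:
        raise ValueError("h is too coarse: need h <= sigma2 / |mu| for non-negative rates")
    return up, down


def infinitesimal_moments(up: float, down: float, h: float) -> Tuple[float, float]:
    """Mean and variance of the level change per unit time, in trait units."""
    return h * (up - down), h * h * (up + down)


up, down = matched_rates(mu=0.5, sigma2=2.0, h=0.1)
print(infinitesimal_moments(up, down, 0.1))
```

The constraint on `h` shows why a numerical convergence check is informative: a grid that is too coarse relative to the drift cannot match both moments with valid rates.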

Circularity Check

0 steps flagged

No significant circularity detected; derivation is self-contained algorithmic extension

The paper's central contribution is an efficient recursive algorithm for the likelihood of an observed tree under a QBD process on a phylogeny, obtained by extending the framework of Soewongsono et al. The discretization of a continuous trait into QBD levels is presented explicitly as a modeling choice in the abstract and model construction, not as a derived result that loops back to its own inputs. No equations, fitted parameters, or self-citation chains are shown that would make any prediction or uniqueness claim equivalent to the inputs by construction. The cited framework is prior work with overlapping authorship (Holland and O'Reilly are co-authors of Soewongsono et al. [25]), and it serves as a starting point rather than a load-bearing justification that forces the new algorithm. The numerical examples and empirical applications explore behavior within the chosen model without reducing the likelihood computation to tautology. This is the common case of an independent algorithmic development on top of an existing modeling framework.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific free parameters or axioms; the discretization step and phase-variable dynamics are treated as modeling choices whose justification is not supplied.

Reference graph

Works this paper leans on

25 extracted references

  [1] A. Aksamit, M. M. O’Reilly, and Z. Palmowski. Sensitivities of some performance measures of quasi-birth-and-death processes. Stochastic Models, 2024.

  [2] A. Aksamit, M. M. O’Reilly, and Z. Palmowski. Random walk on a quadrant: mapping to a one-dimensional level-dependent quasi-birth-and-death process. Stochastic Models, 2025.

  [3] N. G. Bean, P. K. Pollett, and P. G. Taylor. Quasistationary distributions for level-dependent quasi-birth-and-death processes. Communications in Statistics. Part C: Stochastic Models, 16(5):511–541, 2000.

  [4] J. M. Beaulieu, D.-C. Jhwueng, C. Boettiger, and B. C. O’Meara. Modeling stabilizing selection: expanding the Ornstein–Uhlenbeck model of adaptive evolution. Evolution, 66(8):2369–2383, 2012.

  [5] L. Bright and P. G. Taylor. Calculating the equilibrium distribution in level dependent quasi-birth-and-death processes. Communications in Statistics. Stochastic Models, 11(3):497–525, 1995.

  [6] M. A. Butler and A. A. King. Phylogenetic comparative analysis: a modeling approach for adaptive evolution. The American Naturalist, 164(6):683–695, 2004.

  [7] P. Den Iseger. Numerical transform inversion using Gaussian quadrature. Probability in the Engineering and Informational Sciences, 20(1):1–44, 2006.

  [8] J. Felsenstein. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution, 17:368–376, 1981.

  [9] J. Felsenstein. Phylogenies and the comparative method. The American Naturalist, 125(1):1–15, 1985.

  [10] R. Fletcher. Practical methods of optimization. John Wiley & Sons, 2000.

  [11] T. Garland Jr., P. H. Harvey, and A. R. Ives. Procedures for the Analysis of Comparative Data Using Phylogenetically Independent Contrasts. Systematic Biology, 41(1):18–32, 1992.

  [12] T. F. Hansen. Stabilizing selection and the comparative analysis of adaptation. Evolution, 51(5):1341–1351, 1997.

  [13] Q.-M. He. Fundamentals of matrix-analytic methods, volume 365. Springer, 2014.

  [14] L. S. T. Ho and C. Ané. Intrinsic inference difficulties for trait evolution with Ornstein-Uhlenbeck models. Methods in Ecology and Evolution, 5(11):1133–1146, 2014.

  [15] G. Horváth, I. Horváth, S. A.-D. Almousa, and M. Telek. Numerical inverse Laplace transformation using concentrated matrix exponential distributions. Performance Evaluation, 137:102067, 2020.

  [16] T. Ingram and D. L. Mahler. SURFACE: detecting convergent evolution from comparative data by fitting Ornstein-Uhlenbeck models with stepwise AIC. Methods in Ecology and Evolution, 4(5):416–425, 2013.

  [17] J. Joyner and B. Fralix. A new look at Markov processes of G/M/1-type. Stochastic Models, 32(2):253–274, 2016.

  [18] G. Latouche and V. Ramaswami. Introduction to matrix analytic methods in stochastic modeling. SIAM, 1999.

  [19] J. A. Nelder and R. Mead. A simplex method for function minimization. The Computer Journal, 7(4):308–313, 1965.

  [20] M. F. Neuts. Matrix-geometric solutions in stochastic models: an algorithmic approach, volume 2 of Johns Hopkins Series in the Mathematical Sciences. Johns Hopkins University Press, 1981.

  [21] H. K. Nigus. Stochastic Models for the Conservation of Endangered Species. PhD thesis, The University of Tasmania, Under preparation.

  [22] T. Phung-Duc, H. Masuyama, S. Kasahara, and Y. Takahashi. A simple algorithm for the rate matrices of level-dependent QBD processes. In 5th International Conference on Queueing Theory and Network Applications, QTNA 2010 - Proceedings, pages 46–52, 2010.

  [23] V. Ramaswami. Matrix Analytic Methods: A Tutorial Overview with Some Extensions and New Results. In Matrix-Analytic Methods in Stochastic Models (Flint, MI), volume 183 of Lecture Notes in Pure and Appl. Math., pages 261–296. Dekker, New York, 1997.

  [24] L. J. Revell. phytools: an R package for phylogenetic comparative biology (and other things). Methods in Ecology and Evolution, (2):217–223, 2012.

  [25] A. C. Soewongsono, J. Diao, T. Stark, A. E. Wilson, D. A. Liberles, B. R. Holland, and M. M. O’Reilly. Matrix-analytic methods for the evolution of species trees, gene trees, and their reconciliation. Methodology and Computing in Applied Probability, 27(1):1–47, 2025.