Bayesian Credible Sets for Phylogenetic Tree Topologies with Applications to Coverage Analysis and Cross-Model Comparison
Pith reviewed 2026-05-22 14:04 UTC · model grok-4.3
The pith
New methods using Conditional Clade Distributions compute credible levels for any phylogenetic tree topology.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Credible levels of individual tree topologies and subtrees can be estimated directly from a Conditional Clade Distribution without relying on raw sample frequencies, and an α-credible CCD is a CCD restricted to the highest-probability trees that together make up α of the posterior probability mass.
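To make the notion of a credible level concrete, the following minimal sketch enumerates a tiny, hypothetical distribution over four topologies. It assumes the usual highest-probability convention: the credible level of a topology is the total mass of all topologies at least as probable as it. The function names and the probabilities are illustrative and not taken from the paper, whose algorithms presumably work on the CCD without enumerating trees.

```python
def credible_levels(probs):
    """Credible level of each topology: total mass of all topologies
    that are at least as probable (ties are included)."""
    return {t: sum(q for q in probs.values() if q >= p) for t, p in probs.items()}


def credible_set(probs, alpha):
    """Add topologies in decreasing probability until the mass reaches alpha."""
    chosen, mass = [], 0.0
    for tree, p in sorted(probs.items(), key=lambda kv: -kv[1]):
        if mass >= alpha:
            break
        chosen.append(tree)
        mass += p
    return chosen, mass


# Hypothetical topology probabilities; the labels stand in for full trees.
probs = {"T1": 0.5, "T2": 0.25, "T3": 0.125, "T4": 0.125}
levels = credible_levels(probs)           # T1: 0.5, T2: 0.75, T3 and T4: 1.0
trees, mass = credible_set(probs, 0.6)    # (["T1", "T2"], 0.75)
```

Because the space of topologies is discrete, the set built for α = 0.6 carries 0.75 of the mass here: the accumulated probability generally overshoots the nominal level rather than hitting it exactly.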
What carries the argument
Conditional Clade Distribution (CCD), a factorized probability model over tree topologies built from conditional clade probabilities that remains computationally tractable.
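The factorization can be sketched as follows, assuming the simplest parameterization in which each clade has a probability distribution over the ways it splits into two child clades, estimated from sample counts. Trees are represented as nested tuples of taxon names; `fit_ccd` and `ccd_prob` are hypothetical helper names, and the four sampled trees are made up for illustration.

```python
from collections import Counter
from math import prod


def leaves(t):
    """Set of taxa below a node; a leaf is a plain string."""
    return frozenset([t]) if isinstance(t, str) else leaves(t[0]) | leaves(t[1])


def splits(t):
    """Yield (clade, split) for every internal node of a binary tree."""
    if isinstance(t, str):
        return
    left, right = leaves(t[0]), leaves(t[1])
    yield left | right, frozenset([left, right])
    yield from splits(t[0])
    yield from splits(t[1])


def fit_ccd(trees):
    """Estimate P(split | clade) as count(split) / count(clade)."""
    clade_counts, split_counts = Counter(), Counter()
    for t in trees:
        for clade, split in splits(t):
            clade_counts[clade] += 1
            split_counts[(clade, split)] += 1
    return {key: n / clade_counts[key[0]] for key, n in split_counts.items()}


def ccd_prob(ccd, t):
    """Tree probability as the product of its conditional split probabilities."""
    return prod(ccd.get(key, 0.0) for key in splits(t))


# Hypothetical posterior sample over four taxa.
t1 = (("A", "B"), ("C", "D"))
t2 = ((("A", "B"), "C"), "D")
t3 = ((("A", "C"), "B"), "D")
ccd = fit_ccd([t1, t1, t2, t3])
p1, p2 = ccd_prob(ccd, t1), ccd_prob(ccd, t2)   # 0.5 and 0.25
```

Each tree costs one lookup per internal node, which is what keeps the model tractable. A tree containing a split never observed in the sample gets probability zero under this sketch.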
If this is right
- Any sampled tree topology can be assigned a numeric credible level in linear time relative to the number of clades.
- Credible sets can be constructed for subtrees, allowing focused uncertainty statements on particular clades.
- Rank-uniformity checks become possible by plotting the empirical distribution of credible levels against the uniform distribution.
- Different CCD parameterizations can be ranked by how well their credible sets achieve nominal coverage in repeated simulations.
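The rank-uniformity check in the third bullet can be sketched as below. The assumption is that each simulation replicate yields the credible level of the true tree, and that a calibrated model makes these levels roughly uniform on [0, 1] (only roughly, since topology space is discrete). The function names and input values are hypothetical.

```python
def ecdf(levels, grid):
    """Fraction of credible levels that are <= each grid value."""
    n = len(levels)
    return [sum(level <= a for level in levels) / n for a in grid]


def max_deviation(levels, grid):
    """Largest gap between the ECDF and the uniform CDF on the grid."""
    return max(abs(f - a) for f, a in zip(ecdf(levels, grid), grid))


# Hypothetical credible levels of the true tree over ten replicates.
evenly_spread = [i / 10 for i in range(1, 11)]
grid = [i / 10 for i in range(1, 11)]
gap_calibrated = max_deviation(evenly_spread, grid)       # 0.0

# Levels piled up at 0.5 deviate strongly from the diagonal.
gap_clumped = max_deviation([0.5] * 4, [0.25, 0.5, 0.75])  # 0.5
```

Plotting `ecdf(levels, grid)` against `grid` gives the ECDF plot; a curve hugging the diagonal is what a well-calibrated CCD variant would be expected to produce.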
Where Pith is reading between the lines
- The same machinery could supply topology-aware diagnostics when comparing non-nested phylogenetic models.
- Reporting a credible CCD alongside the maximum a posteriori tree would give readers a direct sense of remaining topological uncertainty.
- Extensions to time-calibrated trees might allow credible sets on divergence times conditional on topology.
Load-bearing premise
The Conditional Clade Distribution must approximate the true posterior over tree topologies closely enough that its probability assignments remain meaningful for credible-set construction.
What would settle it
On simulated data where the true posterior over topologies is known exactly, the estimated credible levels for topologies would fail to match their nominal coverage if the approximation is too coarse.
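A minimal sketch of such a test, assuming a toy setting in which both the exact posterior `p` and an approximation `q` are available as dictionaries from topology to probability. All names and numbers are hypothetical. The idea is to build the nominal α set from `q` and measure how much true mass under `p` it actually captures.

```python
def nominal_set(q, alpha):
    """Highest-probability set under q whose q-mass reaches alpha."""
    chosen, mass = set(), 0.0
    for tree, prob in sorted(q.items(), key=lambda kv: -kv[1]):
        if mass >= alpha:
            break
        chosen.add(tree)
        mass += prob
    return chosen


def actual_coverage(p, q, alpha):
    """True posterior mass (under p) of the alpha set built from q."""
    return sum(p.get(tree, 0.0) for tree in nominal_set(q, alpha))


# Hypothetical exact posterior and a deliberately poor approximation.
p = {"T1": 0.5, "T2": 0.25, "T3": 0.25}
q_bad = {"T1": 0.125, "T2": 0.125, "T3": 0.75}

exact = actual_coverage(p, p, 0.5)       # 0.5: nominal level is met
coarse = actual_coverage(p, q_bad, 0.5)  # 0.25: the set {T3} under-covers
```

When the approximation ranks topologies wrongly, the nominal 50% set holds only 25% of the true mass in this toy case, which is the kind of mismatch the test is meant to expose.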
Original abstract
Credible intervals and credible sets, such as highest posterior density (HPD) intervals, form an integral statistical tool in Bayesian phylogenetics, both for phylogenetic analyses and for development. Readily available for continuous parameters such as base frequencies and clock rates, the vast and complex space of tree topologies poses significant challenges for defining analogous credible sets. Traditional frequency-based approaches are inadequate for diffuse posteriors where sampled trees are often unique. To address this, we introduce novel and efficient methods for estimating the credible level of individual tree topologies using tractable tree distributions, specifically Conditional Clade Distribution (CCD). Furthermore, we propose a new concept called $\alpha$ credible CCD, which encapsulates a CCD whose trees collectively make up $\alpha$ probability. We present algorithms to compute these credible CCDs efficiently and to determine credible levels of tree topologies as well as of subtrees. We evaluate the accuracy of these credible set methods leveraging simulated and real datasets. Furthermore, to demonstrate the utility of our methods, we use well-calibrated simulation studies to evaluate the performance of different CCD models. In particular, we show how the credible set methods can be used to conduct rank-uniformity validation and produce Empirical Cumulative Distribution Function (ECDF) plots, supplementing standard coverage analyses for continuous parameters.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript addresses challenges in defining Bayesian credible sets for phylogenetic tree topologies, where traditional frequency-based methods fail in diffuse posteriors with many unique sampled trees. It introduces efficient methods to estimate credible levels of individual topologies via tractable Conditional Clade Distributions (CCD), proposes the new concept of an α-credible CCD (a CCD whose trees sum to α probability mass), develops algorithms for computing these sets and per-topology/subtree credible levels, evaluates accuracy on simulated and real datasets, and demonstrates utility for rank-uniformity validation and ECDF plots to supplement coverage analyses and enable cross-model comparison of CCD variants.
Significance. If the CCD approximation holds, the work offers a practical advance for quantifying uncertainty over tree topologies in Bayesian phylogenetics, particularly for large or diffuse posteriors. The algorithmic focus on efficient computation and the application to model validation (via rank-uniformity) are strengths; the latter provides a falsifiable check on CCD assumptions that could improve downstream phylogenetic analyses. Reproducible simulation studies and ECDF diagnostics add value for the field.
major comments (2)
- [Simulation studies and results sections] The accuracy evaluation and rank-uniformity validation (described in the simulation studies and results sections) are performed entirely within the CCD model family, using CCD-derived probabilities both to generate data and to compute credible levels/ECDFs. This does not constitute an external check against an independent ground-truth posterior (e.g., exhaustive enumeration for ≤12 taxa or converged MCMC with ESS >10^5 unique topologies). If unmodeled higher-order clade dependencies cause CCD to mis-rank or mis-assign mass to topologies, the reported credible levels and coverage diagnostics will be mis-calibrated even if ECDFs appear uniform internally.
- [Methods (definition of α-credible CCD and credible level estimation)] The central construction of α-credible CCD sets and per-topology credible levels (introduced after the abstract and formalized in the methods) treats CCD marginal probabilities as stand-ins for true posterior probabilities. The manuscript reports good performance on simulated/real data but provides no direct calibration test in diffuse regimes where the CCD factorization assumption is most likely to break; this is load-bearing for the claim that the methods support meaningful credible sets and cross-model comparison.
minor comments (2)
- [Abstract and introduction] Clarify in the abstract and introduction whether the CCD models used for credible-set estimation are the same as those being validated in the cross-model comparison, to avoid any appearance of circularity in the experimental design.
- [Algorithms section] Add a brief discussion of computational complexity or runtime scaling for the proposed algorithms on larger taxon sets, as this is relevant for the cs.DS audience.
Simulated Author's Rebuttal
We thank the referee for their constructive comments, which help clarify the scope and limitations of our proposed methods for computing credible sets and levels using Conditional Clade Distributions. We address each major comment below and indicate where revisions will be made to improve clarity and acknowledge assumptions.
- Referee: [Simulation studies and results sections] The accuracy evaluation and rank-uniformity validation (described in the simulation studies and results sections) are performed entirely within the CCD model family, using CCD-derived probabilities both to generate data and to compute credible levels/ECDFs. This does not constitute an external check against an independent ground-truth posterior (e.g., exhaustive enumeration for ≤12 taxa or converged MCMC with ESS >10^5 unique topologies). If unmodeled higher-order clade dependencies cause CCD to mis-rank or mis-assign mass to topologies, the reported credible levels and coverage diagnostics will be mis-calibrated even if ECDFs appear uniform internally.
  Authors: We appreciate this point regarding the internal nature of the validation. The simulation framework is deliberately constructed within the CCD family to isolate and test the rank-uniformity property as a diagnostic for different CCD variants, allowing direct assessment of whether the assigned probabilities produce the expected uniform ranks under the model. This serves as a falsifiable check on the CCD assumptions themselves, which is useful for cross-model comparison as described in the manuscript. We acknowledge that this does not constitute an external validation against an independent ground truth and that higher-order dependencies could lead to miscalibration relative to the true posterior. We will revise the simulation studies and discussion sections to explicitly state this scope and limitation, including a note that the methods are intended for use when CCD provides a reasonable approximation, and we will suggest small-tree exhaustive enumeration as a direction for future external calibration studies.
  Revision: partial
- Referee: [Methods (definition of α-credible CCD and credible level estimation)] The central construction of α-credible CCD sets and per-topology credible levels (introduced after the abstract and formalized in the methods) treats CCD marginal probabilities as stand-ins for true posterior probabilities. The manuscript reports good performance on simulated/real data but provides no direct calibration test in diffuse regimes where the CCD factorization assumption is most likely to break; this is load-bearing for the claim that the methods support meaningful credible sets and cross-model comparison.
  Authors: We thank the referee for identifying this foundational assumption. The α-credible CCD and per-topology credible levels are explicitly defined with respect to the CCD probabilities, and the real-data experiments illustrate practical application even when the true posterior is inaccessible. The rank-uniformity and ECDF diagnostics provide an indirect calibration mechanism by verifying consistency properties that should hold if the CCD marginals are well-specified. We agree that direct tests in highly diffuse regimes are difficult to perform at scale. We will revise the methods section to more prominently articulate the CCD factorization assumption, the conditions under which the credible sets are expected to be meaningful, and the role of the validation diagnostics in supporting cross-model comparisons of CCD variants.
  Revision: partial
Circularity Check
No significant circularity in derivation of α-credible CCD sets
The paper defines α-credible CCD sets and per-topology credible levels by direct application of the existing Conditional Clade Distribution factorization to the topology probability space. Credible levels are computed from the CCD marginals, and accuracy is assessed via separate simulation studies and real-data ECDF plots that compare against the generating model rather than re-using the same fitted CCD quantities as both input and output. No quoted step reduces a claimed prediction or uniqueness result to a self-citation or to a parameter fitted from the target quantity itself. The central constructions therefore retain independent content relative to the inputs.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: The Conditional Clade Distribution (CCD) provides a tractable approximation to the posterior distribution over tree topologies.
Forward citations
Cited by 1 Pith paper
- Omni-DuplexEval: Evaluating Real-time Duplex Omni-modal Interaction. Omni-DuplexEval creates a new benchmark and LLM-as-a-Judge framework for real-time duplex omni-modal interaction, revealing that current models score below 40% overall and struggle especially with proactive responses.