Central limit theorem for the homozygosity of the hierarchical Pitman-Yor process
Pith reviewed 2026-05-13 02:52 UTC · model grok-4.3
The pith
The hierarchical Pitman-Yor process obeys a central limit theorem for homozygosity and related power-sum statistics on its weights as concentration parameters tend to infinity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We prove a central limit theorem for the family of power sum symmetric polynomials in the weights of the hierarchical Pitman-Yor process when the concentration parameters tend to infinity. Explicit formulas are derived for the asymptotic variances, which display the separate influence of each level in the hierarchical construction. These results are obtained via moment calculations based on the exchangeable partition probability function of the process and are more demanding than the corresponding statements for the hierarchical Dirichlet process while making the power-law features of the model more apparent.
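For orientation, the statistics in question can be written out as follows. This is the standard definition of a power sum in the weights, with $(P_1, P_2, \dots)$ denoting the weight vector of the random measure; the symbols $H_m$ and $P_i$ are notation chosen here and may differ from the paper's.

```latex
H_m = \sum_{i \ge 1} P_i^{m}, \qquad m = 2, 3, \dots, \qquad \text{where } \sum_{i \ge 1} P_i = 1 .
```

The case $m = 2$ is the homozygosity: given the random measure, it is the probability that two independent draws from it coincide.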
What carries the argument
The exchangeable partition probability function of the hierarchical Pitman-Yor process, together with recursive moment calculations for the weight vectors at each level of the hierarchy.
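To illustrate how an exchangeable partition probability function yields moments of power sums, here is a minimal sketch for the single-level two-parameter (Pitman-Yor) case only, using the well-known Pitman sampling formula. The hierarchical EPPF used in the paper is more involved and is not reproduced here; the function names `rising`, `py_eppf`, and `expected_power_sum` are illustrative. The link to moments is that the expected power sum of order m equals the probability that m sampled items fall in a single block.

```python
def rising(x, n):
    """Rising factorial x (x + 1) ... (x + n - 1); equals 1 when n == 0."""
    out = 1.0
    for i in range(n):
        out *= x + i
    return out


def py_eppf(block_sizes, alpha, theta):
    """Pitman sampling formula for a single-level Pitman-Yor(alpha, theta) partition.

    block_sizes lists the sizes n_1, ..., n_k of the blocks of a partition of
    n = n_1 + ... + n_k items. Assumes 0 <= alpha < 1 and theta > -alpha.
    This is NOT the hierarchical EPPF from the paper.
    """
    n = sum(block_sizes)
    k = len(block_sizes)
    new_blocks = 1.0
    for i in range(1, k):
        new_blocks *= theta + i * alpha          # opening blocks 2, ..., k
    within = 1.0
    for size in block_sizes:
        within *= rising(1.0 - alpha, size - 1)  # growing each block
    return new_blocks * within / rising(theta + 1.0, n - 1)


def expected_power_sum(m, alpha, theta):
    """E[sum_i P_i^m]: the probability that m items share one block."""
    return py_eppf([m], alpha, theta)


alpha, theta = 0.5, 10.0
print(expected_power_sum(2, alpha, theta))  # equals (1 - alpha) / (1 + theta)
```

As theta grows with alpha fixed, the expected homozygosity in this single-level case shrinks like 1/theta, which is why a central limit theorem in this regime requires rescaling the statistic.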
If this is right
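- After centering and scaling, the power-sum statistics of the weights (homozygosity included) are asymptotically Gaussian, which bears on the asymptotic behavior of the associated sampling formulas.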
- The explicit asymptotic variance shows the separate contribution of each level of the hierarchy.
- The results apply directly to understanding fluctuations in clustered data models with power-law cluster sizes.
- The approach extends previous work on the hierarchical Dirichlet process to the more general Pitman-Yor setting.
Where Pith is reading between the lines
- These variance formulas might be used to construct asymptotic confidence intervals for estimated homozygosity in finite samples from hierarchical models.
- Similar CLTs could hold for other functionals of the weights beyond power sums.
- In applications to population genetics, the theorem would predict the distribution of homozygosity measures under hierarchical Pitman-Yor priors.
Load-bearing premise
The hierarchical Pitman-Yor process uses its standard two-parameter construction at each level, and the limit is taken with all concentration parameters diverging to infinity while the discount parameters remain fixed.
What would settle it
A simulation study with increasingly large concentration parameters where the normalized homozygosity statistic fails to approach a normal distribution with the predicted variance would disprove the central limit theorem.
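A simulation along these lines could start from a truncated stick-breaking sampler. The sketch below is a hedged illustration, not the paper's procedure: it assumes a two-level construction (a top-level Pitman-Yor measure and one group-level Pitman-Yor measure whose atoms are drawn from it), uses only the Python standard library, and checks the first moment of the order-2 within-group homozygosity against an elementary identity. All function names, the truncation level, and the parameter values are assumptions made for illustration. The paper's exact statistic, normalization, and asymptotic variance are not reproduced here and would have to be taken from the paper to test the Gaussian limit.

```python
import random


def stick_breaking(alpha, theta, n_sticks, rng):
    """Truncated Pitman-Yor(alpha, theta) weights in size-biased order."""
    weights, remaining = [], 1.0
    for i in range(1, n_sticks + 1):
        v = rng.betavariate(1.0 - alpha, theta + i * alpha)
        weights.append(remaining * v)
        remaining *= 1.0 - v
    return weights  # sums to slightly less than 1 because of truncation


def group_homozygosity(alpha0, theta0, alpha, theta, n_sticks, rng):
    """Order-2 homozygosity of one group-level measure (assumed two-level model)."""
    top = stick_breaking(alpha0, theta0, n_sticks, rng)
    sticks = stick_breaking(alpha, theta, n_sticks, rng)
    # Each group-level stick picks its atom from the top-level measure
    # (rng.choices treats the truncated top-level weights as relative weights).
    atoms = rng.choices(range(n_sticks), weights=top, k=n_sticks)
    mass = {}
    for atom, w in zip(atoms, sticks):
        mass[atom] = mass.get(atom, 0.0) + w
    return sum(w * w for w in mass.values())


def expected_group_homozygosity(alpha0, theta0, alpha, theta):
    """Two draws coincide if they share a stick, or else share a top-level atom."""
    same_stick = (1.0 - alpha) / (1.0 + theta)
    same_atom = (1.0 - alpha0) / (1.0 + theta0)
    return same_stick + (1.0 - same_stick) * same_atom


rng = random.Random(0)
# Illustrative parameter values only; not taken from the paper.
params = dict(alpha0=0.3, theta0=5.0, alpha=0.3, theta=5.0)
draws = [group_homozygosity(n_sticks=300, rng=rng, **params) for _ in range(500)]
print(sum(draws) / len(draws), expected_group_homozygosity(**params))
```

To probe the limit itself, one would repeat this with increasing concentration parameters, standardize the draws using the centering and variance from the paper, and compare the result with a standard normal. Larger concentration parameters spread mass over more atoms, so the truncation level has to grow with them.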
Original abstract
The hierarchical Pitman-Yor process is a discrete random measure used as a prior in Bayesian nonparametrics. It is motivated by the study of groups of clustered data exhibiting power law behavior. Our focus in this paper is on the Gaussian behavior of a family of statistics, namely the power sum symmetric polynomials for the vector of weights of the process, as the concentration parameters tend to infinity. We establish a central limit theorem and obtain explicit representations for the asymptotic variance, with the latter clearly showing the impact of each component in the hierarchical structure. These results are crucial for understanding the asymptotic behavior of the sampling formulas associated with the process. In comparison with the known results for the hierarchical Dirichlet process, the results for the hierarchical Pitman-Yor process are mathematically more challenging and structurally more revealing of power law behavior.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper establishes a central limit theorem for the power-sum symmetric polynomials (including homozygosity) of the weights of the hierarchical Pitman-Yor process as all concentration parameters tend to infinity. Explicit asymptotic variance formulas are derived that decompose the contributions from each level of the hierarchy. The argument relies on the exchangeable partition probability function of the process together with direct moment calculations, and the results are presented as more technically demanding than the corresponding statements for the hierarchical Dirichlet process due to the power-law features.
Significance. If the central limit theorem and variance expressions hold, the work is significant for Bayesian nonparametrics because it supplies Gaussian fluctuations and interpretable variances for key functionals of hierarchical random measures that exhibit power-law cluster sizes. The explicit separation of hierarchical contributions in the variance is a concrete strength that can inform the analysis of sampling formulas and statistical procedures built on these priors. The extension beyond the Dirichlet case is a natural and useful step in the literature on exchangeable random partitions.
minor comments (2)
- [Abstract] The abstract refers to 'a family of statistics, namely the power sum symmetric polynomials' but does not list the precise collection considered (e.g., which exponents p are treated); adding an explicit enumeration or reference to the relevant definition would improve clarity.
- [Introduction] The comparison with the hierarchical Dirichlet process is mentioned but would benefit from a short paragraph in the introduction that recalls the known variance formulas for the Dirichlet case and highlights the new technical obstacles introduced by the Pitman-Yor discount parameter.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our manuscript and for recommending minor revision. The referee's summary accurately captures the paper's contribution: a central limit theorem for the power-sum symmetric polynomials (including homozygosity) of the weights in the hierarchical Pitman-Yor process, together with explicit asymptotic variances that separate the hierarchical contributions and reflect the power-law features. We appreciate the recognition that these results extend the Dirichlet case in a technically more demanding setting and that the variance decomposition is useful for Bayesian nonparametrics.
Circularity Check
No significant circularity; derivation self-contained from definition and standard theorems
full rationale
The paper derives a CLT for power-sum symmetric polynomials (including homozygosity) of the hierarchical Pitman-Yor weights in the regime where concentration parameters tend to infinity. It starts from the explicit EPPF of the hierarchical process and performs direct moment calculations on the weights, invoking standard limit theorems for exchangeable partitions. No step reduces by construction to a fitted parameter renamed as prediction, a self-definitional loop, or a load-bearing self-citation whose content is itself unverified. The asymptotic variance formula separates hierarchical contributions via explicit computation rather than by ansatz or renaming. The result is independent of its own outputs and aligns with known single-level cases without internal reduction.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption: The hierarchical Pitman-Yor process admits an exchangeable partition probability function with power-law behavior controlled by discount and concentration parameters.
- standard math: Moments of the power sum polynomials admit asymptotic expansions as concentration parameters tend to infinity.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear · "We establish a central limit theorem and obtain explicit representations for the asymptotic variance..." (Theorem 1.1)
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear · "the homozygosity of the HPYP with L groups..." (Section 1.2)