arxiv: 2606.24001 · v1 · pith:HLYQYXSInew · submitted 2026-06-22 · 📊 stat.ME

Bayesian Mixture Models for Histograms: with Applications to Large Datasets

classification 📊 stat.ME
keywords data, bayesian, histograms, mixture, applications, distribution, method, mixtures

In many real-world scenarios, especially those involving privacy constraints or data summarization, data are available only in aggregated forms, such as histograms or frequency tables. This work introduces a novel Bayesian method for inferring the underlying population distribution by fitting a mixture model to binned data. While we focus on mixtures of normal distributions, the framework is flexible and can be extended to other distributional families. We place a prior distribution on the number of mixture components, accommodating both finite and countably infinite mixtures, and perform inference using reversible jump MCMC. The proposed approach demonstrates strong performance on large-scale data, showcasing the potential of nonparametric Bayesian modeling in practical applications. Furthermore, we extend the method to model multiple histograms simultaneously and cluster them using the Dirichlet process. This enables information sharing across populations and provides a principled posterior probability to assess homogeneity between groups. Some theoretical results supporting the performance of our proposed methodology are also discussed.
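
The abstract does not spell out the likelihood, so the following is only a minimal sketch of one common way to fit a normal mixture to binned data: treat the histogram counts as multinomial, with each bin probability given by the difference of the mixture CDF at the bin edges. The function names (`normal_cdf`, `bin_probabilities`, `binned_log_likelihood`) and the multinomial assumption are illustrative and not taken from the paper. The reversible jump MCMC sampler and the prior on the number of components are not shown.

```python
import math


def normal_cdf(x, mean, sd):
    """CDF of N(mean, sd^2) at x; handles infinite bin edges explicitly."""
    if x == math.inf:
        return 1.0
    if x == -math.inf:
        return 0.0
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def bin_probabilities(edges, weights, means, sds):
    """Mass that the normal mixture assigns to each bin [edges[j], edges[j+1])."""
    probs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        p = sum(
            w * (normal_cdf(hi, m, s) - normal_cdf(lo, m, s))
            for w, m, s in zip(weights, means, sds)
        )
        probs.append(p)
    return probs


def binned_log_likelihood(counts, edges, weights, means, sds):
    """Multinomial log-likelihood of the bin counts (assumed model),
    omitting the multinomial coefficient, which does not depend on the
    mixture parameters."""
    probs = bin_probabilities(edges, weights, means, sds)
    total = 0.0
    for n, p in zip(counts, probs):
        if n == 0:
            continue  # empty bins contribute nothing
        if p <= 0.0:
            return -math.inf  # observed data in a bin with zero mass
        total += n * math.log(p)
    return total
```

Because the likelihood depends on the data only through the bin counts, its cost scales with the number of bins and components rather than the number of raw observations, which is one plausible reason a binned formulation suits large datasets.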
