Learning mixtures of structured distributions over discrete domains

Ilias Diakonikolas; Rocco A. Servedio; Siu-on Chan; Xiaorui Sun

arxiv: 1210.0864 · v1 · pith:EG6E4HFBnew · submitted 2012-10-02 · 💻 cs.LG · cs.DS · math.ST · stat.TH



keywords: distributions, algorithm, bins, discrete, efficient, general, histogram


Let $\mathfrak{C}$ be a class of probability distributions over the discrete domain $[n] = \{1,\dots,n\}$. We show that if $\mathfrak{C}$ satisfies a rather general condition (essentially, that each distribution in $\mathfrak{C}$ can be well-approximated by a variable-width histogram with few bins), then there is a highly efficient algorithm, in terms of both running time and sample complexity, that can learn any mixture of $k$ unknown distributions from $\mathfrak{C}$. We analyze several natural types of distributions over $[n]$, including log-concave, monotone hazard rate, and unimodal distributions, and show that they have the required structural property of being well-approximated by a histogram with few bins. Applying our general algorithm, we obtain near-optimally efficient algorithms for all these mixture learning problems.
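The structural condition in the abstract can be illustrated with a small sketch. It is not the paper's learning algorithm: it only shows what approximating a known distribution over $[n]$ by a variable-width histogram means, and how the approximation error can be measured in total variation distance. The function names (`flatten`, `total_variation`, `equal_mass_intervals`) and the greedy equal-mass partition are assumptions made for illustration, and indices are 0-based rather than the abstract's $1,\dots,n$.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open index range [a, b), 0-based


def flatten(p: Sequence[float], intervals: Sequence[Interval]) -> List[float]:
    """Replace p on each interval by its average mass there (a histogram)."""
    q = [0.0] * len(p)
    for a, b in intervals:
        avg = sum(p[a:b]) / (b - a)
        for i in range(a, b):
            q[i] = avg
    return q


def total_variation(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance: half the L1 distance between p and q."""
    return 0.5 * sum(abs(x - y) for x, y in zip(p, q))


def equal_mass_intervals(p: Sequence[float], bins: int) -> List[Interval]:
    """Greedy partition into at most `bins` intervals of roughly equal mass.

    Hypothetical helper: an illustrative heuristic, not the paper's construction.
    """
    target = sum(p) / bins
    intervals: List[Interval] = []
    start, acc = 0, 0.0
    for i, x in enumerate(p):
        acc += x
        if acc >= target and len(intervals) < bins - 1:
            intervals.append((start, i + 1))
            start, acc = i + 1, 0.0
    if start < len(p):
        intervals.append((start, len(p)))  # remaining tail forms the last bin
    return intervals


# Example: a truncated geometric distribution, which is log-concave.
n = 64
weights = [0.9 ** i for i in range(n)]
total = sum(weights)
p = [w / total for w in weights]

coarse = flatten(p, [(0, n)])                    # one bin: the uniform distribution
fine = flatten(p, equal_mass_intervals(p, 8))    # a few variable-width bins

print("TV with 1 bin:      ", total_variation(p, coarse))
print("TV with up to 8 bins:", total_variation(p, fine))
```

In this example, a handful of variable-width bins (narrow where the mass is concentrated, wide in the tail) gives a much smaller total variation error than a single uniform bin, which is the kind of "few bins" approximation the abstract refers to.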
