Near Optimal Stratified Sampling

· 2019 · cs.LG · arXiv 1906.11289

1 Pith paper cites this work. Polarity classification is still indexing.


abstract

The performance of a machine learning system is usually evaluated using i.i.d. observations with true labels. However, acquiring ground-truth labels is expensive, while unlabeled samples are often cheaper to obtain. Stratified sampling can be beneficial in such settings and can reduce the number of true labels required without compromising evaluation accuracy. Stratified sampling exploits statistical properties (e.g., variance) across strata of the unlabeled population, though usually under the unrealistic assumption that these properties are known. We propose two new algorithms that simultaneously estimate these properties and optimize the evaluation accuracy. We construct a lower bound to show that the proposed algorithms are rate optimal up to log factors. Experiments on synthetic and real data show the reduction in label complexity enabled by our algorithms.
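To illustrate why per-stratum variance matters for label allocation, the following is a minimal sketch of the classical known-variance case that the abstract calls unrealistic. It is not either of the paper's two algorithms. The stratum weights, standard deviations, and label budget are made-up illustrative values, the function names are hypothetical, and allocations are left fractional (integer rounding is ignored).

```python
def proportional_allocation(weights, budget):
    """Split the label budget across strata in proportion to stratum weight W_h."""
    return [budget * w for w in weights]


def neyman_allocation(weights, stds, budget):
    """Split the label budget in proportion to W_h * sigma_h (Neyman allocation).

    Assumes the stratum standard deviations are known, which is the idealized
    setting; in practice they would have to be estimated.
    """
    scores = [w * s for w, s in zip(weights, stds)]
    total = sum(scores)
    if total == 0:
        # Every stratum has zero variance: any allocation works, use proportional.
        return proportional_allocation(weights, budget)
    return [budget * score / total for score in scores]


def stratified_variance(weights, stds, allocation):
    """Variance of the stratified mean estimator: sum_h W_h^2 * sigma_h^2 / n_h.

    Requires every n_h in `allocation` to be positive.
    """
    return sum(
        (w ** 2) * (s ** 2) / n for w, s, n in zip(weights, stds, allocation)
    )


# Illustrative (assumed) population: three strata with unequal variability.
weights = [0.5, 0.3, 0.2]   # W_h, fraction of the population in each stratum
stds = [0.1, 0.3, 0.5]      # sigma_h, assumed known here
budget = 100                # total number of labels to acquire

prop_alloc = proportional_allocation(weights, budget)
ney_alloc = neyman_allocation(weights, stds, budget)

var_prop = stratified_variance(weights, stds, prop_alloc)
var_ney = stratified_variance(weights, stds, ney_alloc)

print("proportional allocation:", prop_alloc, "variance:", var_prop)
print("Neyman allocation:      ", ney_alloc, "variance:", var_ney)
```

In this toy setup the Neyman allocation moves labels toward the high-variance strata and yields a lower estimator variance than proportional allocation for the same budget. The difficulty the abstract points to is that `stds` is not available in advance and must be learned while sampling.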

representative citing papers

TS-Neyman: Posterior Sampling for Adaptive Stratified Estimation

stat.ME · 2026-06-07 · conditional · novelty 7.0 · 2 refs

TS-Neyman uses posterior sampling of stratum variances to implement an adaptive Neyman allocation rule that converges almost surely to the oracle proportions and achieves near-oracle efficiency in finite-strata settings.
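For context, the "oracle proportions" mentioned above can be read as the classical Neyman allocation, which is the standard textbook result when stratum variances are known. The notation below is assumed for illustration and is not taken from the citing paper: $W_h$ is the population share of stratum $h$, $\sigma_h$ its standard deviation, $n_h$ the number of samples drawn from it, and $n$ the total budget.

```latex
\[
  \frac{n_h^{*}}{n} \;=\; \frac{W_h \sigma_h}{\sum_{k} W_k \sigma_k},
  \qquad
  \operatorname{Var}\bigl(\hat{\mu}_{\mathrm{strat}}\bigr)
  \;=\; \sum_{h} \frac{W_h^{2} \sigma_h^{2}}{n_h}
  \;\;\xrightarrow{\;n_h = n_h^{*}\;}\;\;
  \frac{1}{n}\Bigl(\sum_{h} W_h \sigma_h\Bigr)^{2}.
\]
```

An adaptive rule is then judged by how closely its realized proportions $n_h / n$ approach $n_h^{*} / n$ when the $\sigma_h$ must be estimated during sampling.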

citing papers explorer


TS-Neyman: Posterior Sampling for Adaptive Stratified Estimation

stat.ME · 2026-06-07 · conditional · none · ref 13 · 2 links · internal anchor
