Near Optimal Stratified Sampling

1 Pith paper citing it
abstract

The performance of a machine learning system is usually evaluated using i.i.d. observations with true labels. However, acquiring ground truth labels is expensive, while obtaining unlabeled samples may be cheaper. Stratified sampling can be beneficial in such settings and can reduce the number of true labels required without compromising the evaluation accuracy. Stratified sampling exploits statistical properties (e.g., variance) across strata of the unlabeled population, though usually under the unrealistic assumption that these properties are known. We propose two new algorithms that simultaneously estimate these properties and optimize the evaluation accuracy. We construct a lower bound to show that the proposed algorithms are rate optimal up to log factors. Experiments on synthetic and real data show the reduction in label complexity that is enabled by our algorithms.
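
To make the setting in the abstract concrete, the following is a minimal sketch of a plug-in baseline, not the paper's two algorithms: it estimates per-stratum standard deviations from a small pilot sample, then spends the label budget with the classical Neyman rule (labels proportional to stratum weight times standard deviation). The population, the stratum weights, the pilot size, and the function names `neyman_allocation` and `stratified_estimate` are all hypothetical and chosen only for illustration.

```python
import random
import statistics


def neyman_allocation(weights, stds, budget):
    """Split `budget` labels across strata in proportion to weight * std."""
    scores = [w * s for w, s in zip(weights, stds)]
    total = sum(scores)
    if total == 0:
        # Every stratum looks constant: fall back to proportional allocation.
        scores, total = list(weights), sum(weights)
    # Keep at least one label per stratum so each stratum mean is defined.
    # Rounding means the counts may not sum exactly to `budget`.
    return [max(1, round(budget * sc / total)) for sc in scores]


def stratified_estimate(strata, weights, allocation, rng):
    """Weighted sum of per-stratum sample means (sampling without replacement)."""
    estimate = 0.0
    for pool, w, n in zip(strata, weights, allocation):
        sample = rng.sample(pool, min(n, len(pool)))
        estimate += w * statistics.fmean(sample)
    return estimate


rng = random.Random(0)
# Hypothetical population: 0/1 correctness of a classifier in two strata.
easy = [1] * 950 + [0] * 50    # 95% accurate, low variance
hard = [1] * 500 + [0] * 500   # 50% accurate, high variance
strata = [easy, hard]
weights = [0.5, 0.5]           # each stratum holds half of the population

# Pilot labels (assumed to be paid for on top of the main budget).
pilot = [rng.sample(pool, 30) for pool in strata]
stds = [statistics.pstdev(p) for p in pilot]

allocation = neyman_allocation(weights, stds, budget=200)
estimate = stratified_estimate(strata, weights, allocation, rng)
print(allocation, estimate)
```

In this sketch most of the budget goes to the high-variance stratum, which is where the label savings over uniform sampling would come from. The abstract's point is that the standard deviations are unknown in practice, so a fixed pilot stage like this one is only a crude stand-in for estimating them while sampling.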

representative citing papers

TS-Neyman: Posterior Sampling for Adaptive Stratified Estimation

stat.ME · 2026-06-07 · conditional · novelty 7.0 · 2 refs

TS-Neyman uses posterior sampling of stratum variances to implement an adaptive Neyman allocation rule that converges almost surely to the oracle proportions and achieves near-oracle efficiency in finite-strata settings.
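
As a rough illustration of what posterior sampling of stratum variances could look like, the sketch below draws one plausible standard deviation per stratum each round and labels the stratum that is most under-sampled relative to the resulting Neyman score. It is not taken from the TS-Neyman paper. It assumes Gaussian outcomes with a scaled inverse-chi-square posterior on each variance, and the names `sample_posterior_std`, `adaptive_allocation`, `draw`, and `warmup` are hypothetical.

```python
import math
import random
import statistics


def sample_posterior_std(values, rng):
    """Draw a std from a scaled inverse-chi-square posterior (assumed Gaussian model)."""
    dof = len(values) - 1
    s2 = statistics.variance(values)
    chi2 = rng.gammavariate(dof / 2.0, 2.0)  # chi-square draw with `dof` degrees of freedom
    return math.sqrt(dof * s2 / chi2)


def adaptive_allocation(draw, weights, budget, rng, warmup=2):
    """Spend `budget` labels one at a time; draw(h) returns one label from stratum h."""
    k = len(weights)
    obs = [[draw(h) for _ in range(warmup)] for h in range(k)]
    for _ in range(budget - warmup * k):
        sigmas = [sample_posterior_std(v, rng) for v in obs]
        # Sampled Neyman score divided by the current count:
        # the largest ratio marks the most under-sampled stratum.
        h = max(range(k), key=lambda i: weights[i] * sigmas[i] / len(obs[i]))
        obs[h].append(draw(h))
    estimate = sum(w * statistics.fmean(v) for w, v in zip(weights, obs))
    return estimate, [len(v) for v in obs]


rng = random.Random(0)
means, stds = [0.0, 1.0], [0.1, 2.0]   # hypothetical strata; the second is much noisier
weights = [0.5, 0.5]


def draw(h):
    return rng.gauss(means[h], stds[h])


estimate, counts = adaptive_allocation(draw, weights, budget=200, rng=rng)
print(counts, estimate)
```

The sketch relies on continuous outcomes so that the warm-up sample variance is nonzero. With binary labels a different posterior (for example a Beta posterior on the stratum accuracy) would be the more natural assumption.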

citing papers explorer

  • TS-Neyman: Posterior Sampling for Adaptive Stratified Estimation stat.ME · 2026-06-07 · conditional · none · ref 13 · 2 links · internal anchor
