Active Hypothesis Testing under Computational Budgets with Applications to GWAS and LLM

Bowen Gang, Qi Kuang, Yin Xia

classification stat.ME

keywords values, hypothesis, under, exact, testing, active, applications, budget

In large-scale hypothesis testing, computing exact $p$-values or $e$-values is often resource-intensive, creating a need for budget-aware inferential methods. We propose a general framework for active hypothesis testing that leverages inexpensive auxiliary statistics to allocate a global computational budget. For each hypothesis, our data-adaptive procedure probabilistically decides whether to compute the exact test statistic or a transformed proxy, guaranteeing a valid $p$-value or $e$-value while satisfying the exact budget constraint. Theoretical guarantees are established for our constructions, showing that the procedure achieves optimality for $e$-values and for $p$-values under independence, and admissibility for $p$-values under general dependence. Empirical results from simulations and two real-world applications, a large-scale genome-wide association study (GWAS) and a clinical prediction task leveraging large language models (LLMs), demonstrate that our framework improves statistical efficiency under fixed resource limits.
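
To make the idea of probabilistically choosing between an exact statistic and a cheap fallback concrete, here is a minimal Python sketch. It is not the authors' construction, which the abstract does not specify. It assumes a standard importance-weighting device: hypothesis i is queried with probability pi[i], which depends only on the auxiliary scores, and the exact e-value is divided by pi[i] when computed and replaced by 0 otherwise. Because the query decision is drawn independently of the exact e-value given the auxiliary scores, the reported value has the same null expectation as the exact one, so it stays a valid e-value. The names `active_e_values`, `exact_e`, `aux`, and `pi_min` are hypothetical, and this sketch meets the budget only in expectation (roughly `pi.sum()` exact computations), not exactly as the paper claims for its procedure.

```python
import numpy as np

def active_e_values(aux, exact_e, budget, rng, pi_min=0.05):
    """Importance-weighted e-values under an expected computational budget.

    aux     : positive auxiliary scores, one per hypothesis (assumed cheap).
    exact_e : callable i -> exact e-value for hypothesis i (assumed expensive).
    budget  : target expected number of exact computations.
    """
    aux = np.asarray(aux, dtype=float)
    m = aux.size
    # Query probabilities proportional to the auxiliary score,
    # clipped to [pi_min, 1] so that every pi[i] is strictly positive.
    pi = np.clip(budget * aux / aux.sum(), pi_min, 1.0)
    # Decisions depend on aux only, never on the exact statistics.
    query = rng.random(m) < pi
    e = np.zeros(m)  # hypotheses that are not queried report 0
    for i in np.flatnonzero(query):
        e[i] = exact_e(i) / pi[i]  # reweight so the null mean is preserved
    return e, query, pi
```

Calling `active_e_values(scores, exact_e, budget=100, rng=np.random.default_rng(0))` returns the reweighted e-values, the boolean mask of hypotheses that were actually computed, and the query probabilities. Reporting 0 for skipped hypotheses is the crudest valid choice; the transformed proxy described in the abstract presumably uses the auxiliary statistic more efficiently.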

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Learning U-Statistics with Active Inference
stat.ML · 2026-05 · unverdicted · novelty 6.0

Active inference framework for U-statistics using augmented IPW to optimize label queries and minimize variance under budget constraints.