Position: Don’t Use the CLT in LLM Evals With Fewer Than a Few Hundred Datapoints
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: 2 unverdicted.
citing papers explorer
- The multiply iterated law of the iterated logarithm: game-theoretic foundations of sequential detection boundaries
  The multiply iterated LIL is derived as the minimax boundary of a sequential-detection game whose equalizer prior is the Jeffreys prior selected by the Erdős-Kolmogorov integral test, yielding a closed-form 3/2 coefficient correction.
- Don't Pass@k: A Bayesian Framework for Large Language Model Evaluation
  A Dirichlet-prior Bayesian estimator for model success probability replaces Pass@k, delivering faster-converging and more stable rankings with credible intervals on math benchmarks.