pith. sign in

arxiv: 2312.01386 · v2 · pith:QVZ3LUI6new · submitted 2023-12-03 · 💻 cs.LG · stat.ML

On the Suboptimality of GP-UCB under Polynomial Effective Optimism

Pith reviewed 2026-05-24 05:26 UTC · model grok-4.3

classification 💻 cs.LG stat.ML
keywords GP-UCBMatérn kernelregret lower boundeffective optimismGaussian process banditssuboptimalitysequential optimization
0
0 comments X

The pith

Polynomial growth of effective optimism prevents GP-UCB from reaching minimax-optimal regret rates with Matérn kernels.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proves that under a uniform confidence assumption, GP-UCB incurs a regret lower bound exceeding the known minimax rate whenever the effective optimism level grows polynomially, up to logarithmic factors. This regime includes the parameter choices in nearly all existing upper-bound analyses of the algorithm. A sympathetic reader would care because the result isolates a structural feature of standard GP-UCB that blocks optimality, rather than attributing the theory-practice gap solely to loose analysis techniques. The central object carrying the argument is the effective optimism level, the product of the exploration coefficient and the kernel ridge regression regularization parameter.

Core claim

Under a uniform confidence assumption, we prove a new regret lower bound for GP-UCB with Matérn kernels. The bound shows that polynomial growth of the effective optimism level, up to logarithmic factors, rules out the minimax-optimal regret rate. Since this is the regime covered by most existing analyses, our result identifies a concrete obstacle to proving minimax optimality for standard GP-UCB.

What carries the argument

effective optimism level (product of the exploration coefficient and the regularization parameter in kernel ridge regression), which controls the growth rate that triggers the lower bound

If this is right

  • Most published upper bounds on GP-UCB regret fall into the polynomial-growth regime and therefore cannot be minimax-tight.
  • The persistent gap between upper and lower bounds on GP-UCB regret may reflect an inherent limitation of the algorithm rather than analysis shortcomings.
  • Standard GP-UCB cannot be shown to achieve the minimax rate without changing its optimism-parameter schedule.
  • Any proof of minimax optimality for GP-UCB must operate outside the polynomial effective-optimism regime.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Keeping the effective optimism level bounded or growing only logarithmically could be necessary for any GP-UCB variant to match minimax rates.
  • The same polynomial-growth obstacle may appear in other kernel-based upper-confidence-bound algorithms that share the same effective-optimism structure.
  • Empirical tuning of the exploration coefficient to slower growth schedules offers a testable way to check whether the lower-bound regime is avoidable in practice.

Load-bearing premise

The uniform confidence assumption is required to derive the regret lower bound.

What would settle it

A direct calculation or simulation on a Matérn kernel instance showing that GP-UCB regret stays at the minimax rate despite polynomially growing effective optimism would falsify the lower bound.

read the original abstract

Gaussian process upper confidence bound (GP-UCB) is widely used for sequential optimization of expensive black-box functions. Although many upper bounds on its cumulative regret have been established in the literature, whether GP-UCB is minimax optimal remains open. We study this question through the effective optimism level, defined as the product of the exploration coefficient and the regularization parameter in kernel ridge regression. Under a uniform confidence assumption, we prove a new regret lower bound for GP-UCB with Mat\'ern kernels. The bound shows that polynomial growth of the effective optimism level, up to logarithmic factors, rules out the minimax-optimal regret rate. Since this is the regime covered by most existing analyses, our result identifies a concrete obstacle to proving minimax optimality for standard GP-UCB. More broadly, it suggests that the gap between current upper bounds and minimax lower bounds may reflect a real limitation of the algorithm, not only of the analysis.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 0 minor

Summary. The paper claims that under a uniform confidence assumption, GP-UCB with Matérn kernels admits a new regret lower bound showing that polynomial growth (up to logs) of the effective optimism level—the product of the exploration coefficient and the regularization parameter in kernel ridge regression—precludes the minimax-optimal regret rate. The authors conclude that this regime, which covers most existing upper-bound analyses, identifies a concrete obstacle to proving minimax optimality for standard GP-UCB and suggests the upper/lower-bound gap may reflect an algorithmic limitation rather than solely an analysis gap.

Significance. If the uniform confidence assumption holds for standard GP-UCB confidence sets, the result supplies a conditional explanation for why GP-UCB may fail to achieve minimax rates and supplies an independent lower-bound construction that avoids circular dependence on fitted parameters. This is a useful diagnostic for the field even if the assumption requires further verification.

major comments (1)
  1. [Abstract] Abstract: the lower bound is derived under the uniform confidence assumption, yet the manuscript supplies no argument, reference, or verification establishing that this assumption is satisfied by the standard GP-UCB confidence sets for Matérn kernels. Because the claim that the result identifies an obstacle for standard GP-UCB rests on the assumption applying to the algorithm as implemented, this link is load-bearing for the central conclusion.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their careful reading and constructive feedback. We address the major comment below.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the lower bound is derived under the uniform confidence assumption, yet the manuscript supplies no argument, reference, or verification establishing that this assumption is satisfied by the standard GP-UCB confidence sets for Matérn kernels. Because the claim that the result identifies an obstacle for standard GP-UCB rests on the assumption applying to the algorithm as implemented, this link is load-bearing for the central conclusion.

    Authors: We agree that the manuscript does not supply an argument, reference, or verification that the uniform confidence assumption holds for the standard GP-UCB confidence sets on Matérn kernels. The lower bound is derived under this assumption, and the central claim is therefore conditional on it. In the revision we will update the abstract and add a short discussion clarifying that the result identifies an obstacle only in the regime where the assumption holds, and that verifying the assumption for standard GP-UCB remains an open question. This change will be made without altering the technical contribution. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper defines the effective optimism level as the product of the exploration coefficient and regularization parameter, then derives a regret lower bound under an external uniform confidence assumption for GP-UCB with Matérn kernels. This shows that polynomial growth (up to logs) of that level precludes minimax-optimal rates. No step reduces the claimed lower bound to a quantity defined by the same fitted optimism level or by self-citation chains; the construction is presented as independent. The derivation remains self-contained against external benchmarks with no self-definitional, fitted-input, or ansatz-smuggling patterns exhibited in the abstract or described claims.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on one domain assumption extracted from the abstract; no free parameters or invented entities are mentioned.

axioms (1)
  • domain assumption uniform confidence assumption
    Invoked to prove the regret lower bound for GP-UCB.

pith-pipeline@v0.9.0 · 5687 in / 1195 out tokens · 23257 ms · 2026-05-24T05:26:29.378925+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Bayesian Optimization by Kernel Regression and Density-based Exploration

    math.OC 2025-02 unverdicted novelty 4.0

    BOKE uses kernel regression and density-based exploration in a confidence-bound acquisition function to achieve quadratic-complexity Bayesian optimization with a claimed global convergence guarantee under noise.

Reference graph

Works this paper leans on

34 extracted references · 34 canonical work pages · cited by 1 Pith paper · 1 internal anchor

  1. [1]

    , " * write output.state after.block = add.period write newline

    ENTRY address author booktitle chapter doi edition editor eid howpublished institution isbn issn journal key month note number organization pages publisher school series title type url volume year label extra.label sort.label short.list INTEGERS output.state before.all mid.sentence after.sentence after.block FUNCTION init.state.consts #0 'before.all := #1...

  2. [2]

    write newline

    " write newline "" before.all 'output.state := FUNCTION n.dashify 't := "" t empty not t #1 #1 substring "-" = t #1 #2 substring "--" = not "--" * t #2 global.max substring 't := t #1 #1 substring "-" = "-" * t #2 global.max substring 't := while if t #1 #1 substring * t #2 global.max substring 't := if while FUNCTION word.in "" FUNCTION format.date year ...

  3. [3]

    Machine Learning 47(2--3):235--256

    Auer P, Cesa-Bianchi N, Fischer P (2002) Finite-time analysis of the multiarmed bandit problem. Machine Learning 47(2--3):235--256

  4. [4]

    Advances in Neural Information Processing Systems 33, 21524--21538

    Balandat M, Karrer B, Jiang DR, Daulton S, Letham B, Wilson AG, Bakshy E (2020) BoTorch : A framework for efficient Monte-Carlo Bayesian optimization. Advances in Neural Information Processing Systems 33, 21524--21538

  5. [5]

    Proceedings of the 31st Conference on Learning Theory, 1348--1361

    Belkin M (2018) Approximation beats concentration? An approximation view on inference with smooth radial kernels. Proceedings of the 31st Conference on Learning Theory, 1348--1361

  6. [6]

    Annals of Statistics 24(6):2524--2535

    Brown LD, Low MG (1996) A constrained risk inequality with applications to nonparametric functional estimation. Annals of Statistics 24(6):2524--2535

  7. [7]

    Theoretical Computer Science 412(19):1832--1852

    Bubeck S, Munos R, Stoltz G (2011) Pure exploration in finitely-armed and continuous-armed bandits. Theoretical Computer Science 412(19):1832--1852

  8. [8]

    Proceedings of the 34th International Conference on Machine Learning, 844--853

    Chowdhury SR, Gopalan A (2017) On kernelized multi-armed bandits. Proceedings of the 34th International Conference on Machine Learning, 844--853

  9. [9]

    Forrester A, Sobester A, Keane A (2008) Engineering Design via Surrogate Modelling: A Practical Guide (John Wiley & Sons)

  10. [10]

    Gel E, Ntaimo L, eds., Recent Advances in Optimization and Modeling of Contemporary Problems, 255--278, INFORMS TutORials in Operations Research (INFORMS)

    Frazier PI (2018) Bayesian optimization. Gel E, Ntaimo L, eds., Recent Advances in Optimization and Modeling of Contemporary Problems, 255--278, INFORMS TutORials in Operations Research (INFORMS)

  11. [11]

    Lookman T, Alexander FJ, Rajan K, eds., Information Science for Materials Discovery and Design, 45--75 (Springer)

    Frazier PI, Wang J (2016) Bayesian optimization for materials design. Lookman T, Alexander FJ, Rajan K, eds., Information Science for Materials Discovery and Design, 45--75 (Springer)

  12. [12]

    Fu MC, ed., Handbook of Simulation Optimization, 105--147 (Springer)

    Fu MC (2015) Stochastic gradient estimation. Fu MC, ed., Handbook of Simulation Optimization, 105--147 (Springer)

  13. [13]

    Carlsson JG, ed., Emerging Optimization Methods and Modeling Techniques with Applications, 287--311, INFORMS TutORials in Operations Research (INFORMS)

    Hong LJ, Zhang X (2021) Surrogate-based simulation optimization. Carlsson JG, ed., Emerging Optimization Methods and Modeling Techniques with Applications, 287--311, INFORMS TutORials in Operations Research (INFORMS)

  14. [14]

    Horn RA, Johnson CR (2012) Matrix Analysis (Cambridge University Press), 2nd edition

  15. [15]

    Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2486--2495

    Janz D, Burt D, Gonzalez J (2020) Bandit optimisation of functions in the Mat \'e rn kernel RKHS . Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics, 2486--2495

  16. [16]

    Journal of Global Optimization 13(4):455--492

    Jones DR, Schonlau M, Welch WJ (1998) Efficient global optimization of expensive black-box functions. Journal of Global Optimization 13(4):455--492

  17. [17]

    Gaussian Processes and Kernel Methods: A Review on Connections and Equivalences

    Kanagawa M, Hennig P, Sejdinovic D, Sriperumbudur BK (2018) Gaussian processes and kernel methods: A review on connections and equivalences. Preprint available at https://arxiv.org/abs/1807.02582

  18. [18]

    Journal of Machine Learning Research 20(145):1--30

    Letham B, Bakshy E (2019) Bayesian optimization for policy search via online-offline experimentation. Journal of Machine Learning Research 20(145):1--30

  19. [19]

    Journal of Machine Learning Research 23(54):1--9

    Lindauer M, Eggensperger K, Feurer M, Biedenkapp A, Deng D, Benjamins C, Ruhkopf T, Sass R, Hutter F (2022) SMAC3 : A versatile Bayesian optimization package for hyperparameter optimization. Journal of Machine Learning Research 23(54):1--9

  20. [20]

    Pi Mu Epsilon Journal 14(4):269--273

    Muleshkov A, Nguyen T (2016) Easy proof of the Jacobian for the n -dimensional polar coordinates. Pi Mu Epsilon Journal 14(4):269--273

  21. [21]

    Lectures on Probability Theory and Statistics: Ecole d'Eté de Probabilités de Saint-Flour XXVIII - 1998, volume 1738 of Lecture Notes in Mathematics, 85--277 (Springer)

    Nemirovski A (2000) Topics in non-parametric statistics. Lectures on Probability Theory and Statistics: Ecole d'Eté de Probabilités de Saint-Flour XXVIII - 1998, volume 1738 of Lecture Notes in Mathematics, 85--277 (Springer)

  22. [22]

    Pollard D (1990) Empirical Processes: Theory and Applications, volume 2 of NSF-CBMS Regional Conference Series in Probability and Statistics (Institute of Mathematical Statistics)

  23. [23]

    Rasmussen CE, Williams CKI (2006) Gaussian Processes for Machine Learning (MIT Press)

  24. [24]

    Advances in Computational Mathematics 42(4):973--993

    Santin G, Schaback R (2016) Approximation of eigenfunctions in kernel-based spaces. Advances in Computational Mathematics 42(4):973--993

  25. [25]

    Proceedings of the 30th Annual Conference on Learning Theory, 1723--1742

    Scarlett J, Bogunovic I, Cevher V (2017) Lower bounds on regret for noisy G aussian process bandit optimization. Proceedings of the 30th Annual Conference on Learning Theory, 1723--1742

  26. [26]

    Proceedings of the 27th International Conference on Machine Learning, 1015--1022

    Srinivas N, Krause A, Kakade S, Seeger M (2010) Gaussian process optimization in the bandit setting: No regret and experimental design. Proceedings of the 27th International Conference on Machine Learning, 1015--1022

  27. [27]

    IEEE Transactions on Information Theory 58(5):3250--3265

    Srinivas N, Krause A, Kakade SM, Seeger MW (2012) Information-theoretic regret bounds for Gaussian process optimization in the bandit setting. IEEE Transactions on Information Theory 58(5):3250--3265

  28. [28]

    Proceedings of the NeurIPS 2020 Competition and Demonstration Track, 3--26

    Turner R, Eriksson D, McCourt M, Kiili J, Laaksonen E, Xu Z, Guyon I (2021) Bayesian optimization is superior to random search for machine learning hyperparameter tuning: Analysis of the black-box optimization challenge 2020. Proceedings of the NeurIPS 2020 Competition and Demonstration Track, 3--26

  29. [29]

    Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 82--90

    Vakili S, Khezeli K, Picheny V (2021) On information gain and regret bounds in Gaussian process bandits. Proceedings of the 24th International Conference on Artificial Intelligence and Statistics, 82--90

  30. [30]

    van de Geer SA (2000) Empirical Processes in M-estimation (Cambridge University Press)

  31. [31]

    Journal of the American Statistical Association 115(530):920--930

    Wang W, Tuo R, Wu CFJ (2020) On prediction properties of kriging: Uniform error bounds and robustness. Journal of the American Statistical Association 115(530):920--930

  32. [32]

    ACM Computing Surveys 55(13s):Article 287, 36 pages

    Wang X, Jin Y, Schmitt S, Olhofer M (2023) Recent advances in Bayesian optimization. ACM Computing Surveys 55(13s):Article 287, 36 pages

  33. [33]

    Wendland H (2004) Scattered Data Approximation (Cambridge University Press)

  34. [34]

    Preprint available at https://arxiv.org/abs/2307.07539

    Whitehouse J, Wu ZS, Ramdas A (2023) On the sublinear regret of GP-UCB . Preprint available at https://arxiv.org/abs/2307.07539