pith. machine review for the scientific record.

arxiv: 2605.13524 · v1 · submitted 2026-05-13 · 📡 eess.SP

Recognition: unknown

Manifold-Aware Information Gain and Lower Bounds for Gaussian-Process Bandits on Riemannian Quotient Spaces

Authors on Pith: no claims yet

Pith reviewed 2026-05-14 18:07 UTC · model grok-4.3

classification 📡 eess.SP
keywords gaussian process bandits · riemannian manifold · regret lower bound · matérn kernel · information gain · quotient space · volume factor

The pith

A regret lower bound for Gaussian-process bandits on Riemannian manifolds includes an explicit factor of the manifold volume raised to the power ν/(2ν+d).

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that any algorithm for Gaussian-process bandits on a compact Riemannian manifold of dimension d using the intrinsic Matérn-ν kernel must incur worst-case expected regret that grows at least as fast as the manifold volume to the power ν/(2ν+d) times T to the power (ν+d)/(2ν+d), with an extra logarithmic factor, once the horizon exceeds a threshold. This holds uniformly over all functions inside the RKHS ball of radius B. A sympathetic reader cares because the bound makes the geometric size of the decision space appear directly in the scaling constant, and the T-exponent already matches the best known upper bounds. The work also supplies five extensions that cover alternative proof styles, symmetry reduction on quotient spaces, explicit constants, curvature corrections, and passage to Bayesian regret.

Core claim

For any algorithm and time horizon T exceeding an explicit threshold, the worst-case expected regret over the RKHS-ball ||f||_{H_{k_ν}} ≤ B satisfies E[R_T(f)] ≥ c_*(d,ν) B^{d/(2ν+d)} σ_n^{2ν/(2ν+d)} · vol_g(M)^{ν/(2ν+d)} T^{(ν+d)/(2ν+d)} (log T)^{ν/(2ν+d)}. The volume factor vol_g(M)^{ν/(2ν+d)} is the first explicit geometric constant appearing in a manifold GP-bandit lower bound, and the exponent matches the Vakili–Khezeli–Picheny upper bound.
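A quick arithmetic check (editorial; the helper function below is ours, not the paper's) confirms that the claimed T-exponent reproduces the T^{9/14} floor plotted in Figure 1 for the Matérn-5/2 kernel on S²:

```python
from fractions import Fraction

def regret_exponent(nu: Fraction, d: int) -> Fraction:
    """T-exponent (nu + d) / (2*nu + d) in the claimed lower bound."""
    return (nu + d) / (2 * nu + d)

# Figure 1 setting: intrinsic Matérn-5/2 kernel on S^2, so nu = 5/2, d = 2.
print(regret_exponent(Fraction(5, 2), 2))  # -> 9/14, the T^{9/14} floor in Figure 1
```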

What carries the argument

The Riemannian volume vol_g(M) of the compact manifold M, which scales the lower bound as a multiplicative constant raised to ν/(2ν+d) and enters through the geometry-aware information gain of the intrinsic Matérn-ν kernel.

If this is right

  • The T-exponent matches the Vakili–Khezeli–Picheny upper bound, implying that the lower bound is tight in the polynomial rate.
  • A companion Assouad-style argument produces a second lower bound whose T-exponent (2ν+3d)/(4(ν+d)) is strictly smaller, offset by a 1/(log log T)^{(2ν+d)/(4(ν+d))} polylog factor.
  • On a quotient space M = M~/G an extrinsic-kernel GP-UCB algorithm obeys an upper bound of |G|^{1/2} times the base regret, with a conjectured gauge-modulated constant.
  • The bound admits a curvature correction of the form 1 + O(K ε_T^2) obtained from the Bishop–Gromov comparison theorem.
  • The same lower bound transfers directly to the Bayesian regret setting by the Yang–Barron and Castillo et al. transfer arguments.
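The first two bullets can be compared numerically (an editorial check; the values are chosen to match the paper's S² experiments and are not the authors' code):

```python
from fractions import Fraction

nu, d = Fraction(5, 2), 2  # illustrative values: Matérn-5/2 on a 2-manifold

fano    = (nu + d) / (2 * nu + d)            # Fano-route T-exponent
assouad = (2 * nu + 3 * d) / (4 * (nu + d))  # Assouad-route T-exponent

# Cross-multiplying shows fano - assouad has the sign of d^2 > 0, so the
# Assouad exponent is strictly smaller for every d >= 1.
print(fano, assouad)  # 9/14 vs 11/18
```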

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the volume scaling is correct, uniformly enlarging the manifold while keeping local geometry fixed should increase the leading constant exactly by the predicted volume power, giving a direct experimental test.
  • The quotient-space upper bound suggests that symmetry reduction can improve regret by a factor of sqrt(|G|), which may apply to optimization problems with known group symmetries.
  • The curvature dependence 1 + O(K ε_T^2) implies that the bound remains valid on manifolds with bounded sectional curvature provided the discretization scale ε_T is chosen sufficiently small.
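The volume-scaling test in the first bullet is quantitative. A minimal sketch of the predicted effect (editorial; the function and the scenario are hypothetical, not from the paper):

```python
def predicted_constant_ratio(vol_ratio: float, nu: float, d: int) -> float:
    """If the vol_g(M)^{nu/(2nu+d)} factor is right, multiplying the manifold
    volume by vol_ratio (local geometry fixed) should multiply the leading
    regret constant by exactly this amount."""
    return vol_ratio ** (nu / (2 * nu + d))

# Hypothetical experiment: doubling the radius of S^2 quadruples its volume
# (d = 2), so with nu = 5/2 the constant should grow by 4^(5/14) ≈ 1.64.
print(predicted_constant_ratio(4.0, 2.5, 2))
```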

Load-bearing premise

The unknown function lies inside the reproducing-kernel Hilbert space ball defined by the intrinsic Matérn-ν kernel whose geometry exactly matches the Riemannian manifold.

What would settle it

An algorithm that achieves regret growing slower than T to the power (ν+d)/(2ν+d) times (log T) to the power ν/(2ν+d), or whose leading constant is independent of the manifold volume across a family of compact Riemannian manifolds of increasing volume, would falsify the claim.

Figures

Figures reproduced from arXiv: 2605.13524 by Changsheng Chen, Ning Xie, Yuriy Dorn.

Figure 1
Figure 1: Numerical validation of Theorem 2 on S² with intrinsic Matérn-5/2 kernel, N_cand = 64 Fibonacci points, σ_n = 0.1, B = 1. Empirical cumulative regret of GP-UCB (median over M = 8 seeds, IQR shaded) and the predicted lower bound ∝ T^{9/14} on log–log axes. The empirical regret stays above the predicted floor across all tested horizons, consistent with the lower bound; the empirical rate is roughly T^{0.4} (sl…
Figure 2
Figure 2: Empirical support for Conjecture 31 on SO(3) = Spin(3)/Z_2. The regret ratio R^ext_T / R^int_T rises smoothly from 1.000 ± 0.000 at κ/r_inj = 0.13 (no gauge effect: the two kernels are essentially identical) to 1.326 ± 0.089 at κ/r_inj = 0.89, staying below the naive |G|^{1/2} = √2 ≈ 1.414 ceiling, the qualitative trajectory predicted by the modulated-separation conjecture. This figure is not a proof of the conje…
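The conjectured constant behind this figure can be bracketed with a two-line check (editorial; the modulation profile h is left unspecified by the paper, so it is a free input here):

```python
import math

def modulated_constant(group_order: int, h_val: float) -> float:
    """Conjectured gauge-modulated constant (1 + (|G| - 1) h)^{1/2}, where
    h = h(r_inj / kappa) in [0, 1] is the unspecified modulation profile."""
    return math.sqrt(1.0 + (group_order - 1) * h_val)

# For SO(3) = Spin(3)/Z_2 (|G| = 2) the constant runs from 1 (no gauge effect)
# up to the naive sqrt(2) ≈ 1.414 ceiling, bracketing the empirical ratios
# 1.000 and 1.326 reported in Figure 2.
print(modulated_constant(2, 0.0), modulated_constant(2, 1.0))
```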
Figure 3
Figure 3: Switching-budget tradeoff on S² with Matérn-5/2. Left: total cost R_T + λS_T vs. λ on log–log axes for vanilla GP-UCB and switch-aware GP-UCB. The dashed line is the predicted slope λ^{ν/(ν+d)} = λ^{5/9} ≈ λ^{0.56} from Theorem 33. Right: average number of switches S_T as a function of λ for the switch-aware policy. Experimental parameters: T = 200 rounds; λ grid {0, 0.01, 0.1, 0.5, 1, 2.5, 5, 10} spanning nois…
Figure 4
Figure 4: Empirical matching upper bound: total cost…
Original abstract

We prove a regret lower bound for Gaussian-process bandits on a smooth compact Riemannian manifold $\M$ of dimension $d$ with intrinsic Mat\'ern-$\nu$ kernel ($\nu>d/2$) that exposes how the geometry of the arm space enters the constant. For any algorithm and time horizon $T$ exceeding an explicit threshold, the worst-case expected regret over the RKHS-ball $\|f\|_{\Hil_{k_\nu}}\!\le\!B$ satisfies \begin{multline*} \E[R_T(f)]\;\ge\;c_*(d,\nu)\,B^{d/(2\nu+d)}\,\sigma_n^{2\nu/(2\nu+d)} \\ \cdot\,\vol_g(\M)^{\nu/(2\nu+d)}\,T^{(\nu+d)/(2\nu+d)}(\log T)^{\nu/(2\nu+d)}. \end{multline*} The exponent matches the Vakili--Khezeli--Picheny upper bound \cite{vakili2021information}; the $\vol_g(\M)^{\nu/(2\nu+d)}$ factor is, to our knowledge, the first explicit volume-dependent geometric constant in a manifold GP-bandit lower bound. We extend the analysis in five directions: (i)~a companion Assouad-style proof gives a different lower bound with a strictly smaller $T$-exponent $(2\nu+3d)/(4(\nu+d))$ but with a polylog factor of the form $1/(\log\log T)^{(2\nu+d)/(4(\nu+d))}$, sharpening the $(\log T)^{\nu/(2\nu+d)}$ Fano polylog of Theorem~\ref{thm:main}; (ii)~we prove a $|G|^{1/2}$ upper bound on the regret of an extrinsic-kernel GP-UCB algorithm on a quotient space $\M=\Mt/G$, plus a bracketing theorem (Theorem~\ref{thm:gauge-bracket}); the precise constant is conjectured to take the modulated form $(1+(|G|-1)h(\rinj/\kappa))^{1/2}$ (Conjecture~\ref{conj:gauge-modulated}), validated numerically on $\SO(3)$; (iii)~we write the leading constant $c_*(d,\nu)$ out fully; (iv)~we extract a curvature dependence $1+O(K\eps_T^2)$ via Bishop--Gromov; (v)~we transfer the bound to the Bayesian regret framework via the Yang--Barron / Castillo et al.\ Bayesian-Fano transfer.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proves a minimax lower bound on the expected regret of any algorithm for Gaussian-process bandits on a compact smooth Riemannian manifold M of dimension d, when the unknown function lies in the unit ball of the RKHS induced by the intrinsic Matérn-ν kernel (ν > d/2). The bound takes the explicit form E[R_T(f)] ≥ c_*(d,ν) B^{d/(2ν+d)} σ_n^{2ν/(2ν+d)} vol_g(M)^{ν/(2ν+d)} T^{(ν+d)/(2ν+d)} (log T)^{ν/(2ν+d)} for T large enough, matching the exponent of the Vakili–Khezeli–Picheny upper bound. Five extensions are developed: an Assouad-style lower bound with a different exponent, a regret upper bound for extrinsic-kernel GP-UCB on quotient spaces M = M̃/G together with a conjectured gauge-modulated constant, an explicit expression for c_*(d,ν), a curvature correction 1+O(K ε_T²) via Bishop–Gromov, and a transfer to Bayesian regret via Yang–Barron/Fano arguments.

Significance. If the central lower bound and the volume prefactor hold, the result supplies the first explicit geometric constant (vol_g(M)) in the literature on manifold GP bandits and confirms that the information-gain exponent is tight. The quotient-space analysis and the numerical check of the gauge conjecture on SO(3) are practically relevant. The explicit c_*(d,ν) and the curvature expansion further strengthen the geometric interpretation. These contributions would be of clear interest to the GP-bandit and manifold-learning communities.

major comments (3)
  1. [Theorem 1] Theorem 1 (main lower bound): the derivation of the precise vol_g(M)^{ν/(2ν+d)} prefactor is obtained by packing M with geodesic balls and applying Fano’s inequality to the resulting function class. The manuscript must exhibit the explicit entropy-number calculation for the RKHS ball of the intrinsic Matérn kernel (eigenvalues of the Laplace–Beltrami operator) that isolates this volume scaling without residual dependence on the injectivity radius or embedding dimension.
  2. [Extension (iii)] Section on extension (iii): the leading constant c_*(d,ν) is stated to be written out fully, yet its dependence on the kernel hyperparameters and on the manifold volume must be cross-checked against the information-gain upper bound used in the proof; any hidden constants that absorb manifold-dependent factors would undermine the claimed explicit geometric dependence.
  3. [Quotient-space section] Quotient-space analysis (Theorem on |G|^{1/2} bound and Conjecture 1): the conjectured modulated constant (1+(|G|−1)h(r_inj/κ))^{1/2} is validated only numerically on SO(3). A rigorous proof or a counter-example on a different quotient (e.g., RP^2) is required before the claim can be considered load-bearing for the quotient-space extension.
minor comments (2)
  1. [Abstract / Theorem 1] Notation: the symbol ε_T appearing in the curvature correction 1+O(K ε_T²) is not defined in the abstract; its precise scaling with T should be stated explicitly in the main theorem statement.
  2. [References] Reference list: the Vakili–Khezeli–Picheny upper bound is cited as [vakili2021information]; the full bibliographic entry and the exact theorem number used for the matching exponent should be supplied.

Simulated Author's Rebuttal

3 responses · 1 unresolved

We thank the referee for the careful reading and constructive major comments. We address each point below and indicate planned revisions to strengthen the geometric claims.

Point-by-point responses
  1. Referee: [Theorem 1] Theorem 1 (main lower bound): the derivation of the precise vol_g(M)^{ν/(2ν+d)} prefactor is obtained by packing M with geodesic balls and applying Fano’s inequality to the resulting function class. The manuscript must exhibit the explicit entropy-number calculation for the RKHS ball of the intrinsic Matérn kernel (eigenvalues of the Laplace–Beltrami operator) that isolates this volume scaling without residual dependence on the injectivity radius or embedding dimension.

    Authors: The vol_g(M) scaling follows directly from Weyl's law on the Laplace-Beltrami eigenvalues, which controls the entropy numbers of the unit ball in the intrinsic Matérn RKHS. The packing argument uses geodesic balls of radius chosen below the injectivity radius so that no residual dependence remains. We will add an explicit appendix deriving the entropy numbers N(ε) ∼ C vol_g(M) ε^{-d/ν} and showing how this produces the precise volume exponent after Fano's inequality. revision: yes

  2. Referee: [Extension (iii)] Section on extension (iii): the leading constant c_*(d,ν) is stated to be written out fully, yet its dependence on the kernel hyperparameters and on the manifold volume must be cross-checked against the information-gain upper bound used in the proof; any hidden constants that absorb manifold-dependent factors would undermine the claimed explicit geometric dependence.

    Authors: We have verified that c_*(d,ν) isolates the volume factor exactly as it appears in the information-gain upper bound of Vakili et al.; no manifold-dependent terms are absorbed into universal constants. The kernel hyperparameters enter only through the Matérn smoothness and length-scale parameters. We will insert a short cross-check paragraph (with the explicit functional form) in the revised extension (iii) section. revision: yes

  3. Referee: [Quotient-space section] Quotient-space analysis (Theorem on |G|^{1/2} bound and Conjecture 1): the conjectured modulated constant (1+(|G|−1)h(r_inj/κ))^{1/2} is validated only numerically on SO(3). A rigorous proof or a counter-example on a different quotient (e.g., RP^2) is required before the claim can be considered load-bearing for the quotient-space extension.

    Authors: The modulated constant is explicitly labeled as Conjecture 1 and is supported only by numerical evidence on SO(3). We agree a rigorous proof or counter-example on RP^2 would be valuable; deriving it appears to require additional gauge-theoretic machinery beyond the current scope. We will revise the section to emphasize the conjectural status and add numerical checks on RP^2 where feasible. revision: partial

standing simulated objections not resolved
  • Rigorous proof of Conjecture 1 (gauge-modulated constant) or a counter-example on a quotient other than SO(3)

Circularity Check

0 steps flagged

No circularity: lower bound derived from information gain and Fano on intrinsic kernel geometry

full rationale

The claimed lower bound is obtained by applying standard information-gain arguments and Fano's inequality to the RKHS ball defined by the intrinsic Matérn-ν kernel on the Riemannian manifold. The volume prefactor vol_g(M)^{ν/(2ν+d)} follows directly from the packing number of geodesic balls on (M,g), which is controlled by the manifold volume under the given kernel assumption; this is an explicit geometric consequence rather than a fitted or self-defined quantity. The leading constant c_*(d,ν) is written out fully in the paper, the T-exponent is shown to match an independent prior upper bound, and curvature corrections are obtained via Bishop-Gromov comparison. No step reduces by construction to a fitted parameter, self-citation chain, or ansatz smuggled from the authors' prior work. The derivation remains self-contained against external benchmarks.
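The packing-plus-Fano route described above can be made concrete with a back-of-envelope reconstruction (editorial; constants and log factors suppressed, so this is a sketch of where the volume power comes from, not the paper's proof):

```latex
% Pack M with N disjoint geodesic balls of radius \varepsilon, each carrying a
% bump of height h; the RKHS-norm budget of the Matérn-\nu kernel forces
% h \asymp B\,\varepsilon^{\nu}, while the Riemannian volume controls
% N \asymp \mathrm{vol}_g(M)\,\varepsilon^{-d}.
\begin{align*}
  \text{Fano: with } \sim T/N \text{ samples per ball, the bumps stay
  indistinguishable while } (T/N)\,h^{2}/\sigma_n^{2} \lesssim 1,\\
  B^{2}\varepsilon^{2\nu}
    \asymp \frac{\sigma_n^{2}\,\mathrm{vol}_g(M)\,\varepsilon^{-d}}{T}
  \;\Longrightarrow\;
  \varepsilon
    \asymp \Bigl(\frac{\sigma_n^{2}\,\mathrm{vol}_g(M)}{B^{2}T}\Bigr)^{1/(2\nu+d)},\\
  R_T \gtrsim T\,h \asymp T B\,\varepsilon^{\nu}
    = B^{d/(2\nu+d)}\,\sigma_n^{2\nu/(2\nu+d)}\,
      \mathrm{vol}_g(M)^{\nu/(2\nu+d)}\,T^{(\nu+d)/(2\nu+d)}.
\end{align*}
```

Every exponent in the displayed bound, including the volume power, falls out of this two-constraint balance.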

Axiom & Free-Parameter Ledger

1 free parameters · 2 axioms · 0 invented entities

The central claim rests on standard domain assumptions from Riemannian geometry and Gaussian-process theory; no new entities are postulated and the leading constant is left symbolic.

free parameters (1)
  • c_*(d,ν)
    Leading multiplicative constant whose explicit form is promised in the manuscript but not displayed in the abstract.
axioms (2)
  • domain assumption The arm space is a smooth compact Riemannian manifold M of dimension d
    Required to define the intrinsic Matérn kernel and the Riemannian volume measure vol_g.
  • domain assumption The kernel is the intrinsic Matérn-ν kernel with ν > d/2
    Ensures the RKHS is well-defined and the stated regret scaling applies.

pith-pipeline@v0.9.0 · 5781 in / 1508 out tokens · 96508 ms · 2026-05-14T18:07:02.440379+00:00 · methodology


Reference graph

Works this paper leans on

33 extracted references · 33 canonical work pages

  1. S. Vakili, K. Khezeli, and V. Picheny, "On information gain and regret bounds in Gaussian process bandits," in Proc. Int. Conf. on Artificial Intelligence and Statistics (AISTATS), 2021.
  2. O. Besbes, Y. Gur, and A. Zeevi, "Stochastic multi-armed-bandit problem with non-stationary rewards," in Advances in Neural Information Processing Systems (NeurIPS), 2014.
  3. S. Salgia, S. Vakili, and Q. Zhao, "A domain-shrinking based Bayesian optimization algorithm with order-optimal regret performance," in Advances in Neural Information Processing Systems (NeurIPS), 2021.
  4. N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, "Gaussian process optimization in the bandit setting: no regret and experimental design," in Proc. Int. Conf. on Machine Learning (ICML), 2010.
  5. V. Borovitskiy, A. Terenin, P. Mostowsky, and M. P. Deisenroth, "Matérn Gaussian processes on Riemannian manifolds," in Advances in Neural Information Processing Systems (NeurIPS), 2020.
  6. C. Berenfeld, P. Rosa, and J. Rousseau, "Estimating a density near an unknown manifold: a Bayesian nonparametric approach," The Annals of Statistics, vol. 52, no. 5, pp. 2081–2111, Oct. 2024.
  7. Y. Dorn, "Geometry-aware multi-armed bandits for antenna beam selection on spheres, tori, SO(3), and reconfigurable intelligent surfaces," manuscript submitted to IEEE Trans. on Wireless Communications, 2026; companion empirical paper.
  8. J. Scarlett, I. Bogunovic, and V. Cevher, "Lower bounds on regret for noisy Gaussian process bandit optimization," in Proc. Conf. on Learning Theory (COLT), 2017.
  9. S. Iwazaki, "Tighter regret lower bound for Gaussian process bandits with squared exponential kernel in hypersphere," arXiv preprint arXiv:2602.17940, 2026.
  10. X. Cai and J. Scarlett, "On lower bounds for standard and robust Gaussian process bandit optimization," in Proc. Int. Conf. on Machine Learning (ICML), 2021.
  11. I. Castillo, G. Kerkyacharian, and D. Picard, "Thomas Bayes' walk on manifolds," Probability Theory and Related Fields, vol. 158, no. 3–4, pp. 665–710, 2014.
  12. Y. Yang and D. B. Dunson, "Bayesian manifold regression," The Annals of Statistics, vol. 44, no. 2, pp. 876–905, 2016.
  13. H. Wendland, Scattered Data Approximation. Cambridge University Press, 2004.
  14. R. J. Adler and J. E. Taylor, Random Fields and Geometry. Springer, 2007.
  15. A. B. Tsybakov, Introduction to Nonparametric Estimation. Springer, 2008.
  16. P. Petersen, Riemannian Geometry, 3rd ed. Springer, 2016.
  17. R. A. Adams and J. J. F. Fournier, Sobolev Spaces, 2nd ed., ser. Pure and Applied Mathematics. Academic Press, 2003, vol. 140.
  18. T. Aubin, Some Nonlinear Problems in Riemannian Geometry. Springer, 1998.
  19. S. R. Chowdhury and A. Gopalan, "On kernelized multi-armed bandits," in Proc. Int. Conf. on Machine Learning (ICML), 2017.
  20. S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, "X-armed bandits," Journal of Machine Learning Research, vol. 12, pp. 1655–1695, 2011.
  21. R. Camilleri, K. Jamieson, and J. Katz-Samuels, "High-dimensional experimental design and kernel bandits," in Proc. Int. Conf. on Machine Learning (ICML), 2021.
  22. Z. Li and J. Scarlett, "Gaussian process bandit optimization with few batches," in Proc. Int. Conf. on Artificial Intelligence and Statistics (AISTATS), 2022.
  23. S. Salgia, S. Vakili, and Q. Zhao, "Random exploration in Bayesian optimization: order-optimal regret and computational efficiency," in Proc. Int. Conf. on Machine Learning (ICML), 2024.
  24. S. Iwazaki and S. Takeno, "Improved regret analysis in Gaussian process bandits: optimality for noiseless reward, RKHS norm, and non-stationary variance," in Proc. Int. Conf. on Machine Learning (ICML), ser. Proceedings of Machine Learning Research. PMLR, Oct. 2025, pp. 26642–26672.
  25. J. A. Tropp, "An introduction to matrix concentration inequalities," Foundations and Trends in Machine Learning, vol. 8, no. 1–2, pp. 1–230, 2015.
  26. C. J. Stone, "Optimal rates of convergence for nonparametric estimators," The Annals of Statistics, vol. 8, no. 6, pp. 1348–1360, 1980.
  27. I. Chavel, Riemannian Geometry: A Modern Introduction, 2nd ed. Cambridge University Press, 2006.
  28. P. Rosa, V. Borovitskiy, A. Terenin, and J. Rousseau, "Posterior contraction rates for Matérn Gaussian processes on Riemannian manifolds," in Advances in Neural Information Processing Systems 36 (NeurIPS), 2023.
  29. H. Esfandiari, A. Karbasi, A. Mehrabian, and V. Mirrokni, "Regret bounds for batched bandits," in Proc. AAAI Conference on Artificial Intelligence, vol. 35, 2021, pp. 7340–7348.
  30. P. Auer, P. Gajane, and R. Ortner, "Adaptively tracking the best bandit arm with an unknown number of distribution changes," in Conference on Learning Theory (COLT), 2019, pp. 138–158.
  31. A. W. van der Vaart and J. H. van Zanten, "Information rates of nonparametric Gaussian process methods," Journal of Machine Learning Research, vol. 12, pp. 2095–2119, 2011.
  32. A. W. van der Vaart and J. H. van Zanten, "Reproducing kernel Hilbert spaces of Gaussian priors," in Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, ser. IMS Collections, B. Clarke and S. Ghosal, Eds. Institute of Mathematical Statistics, 2008, vol. 3, pp. 200–222.
  33. Y. Yang and A. Barron, "Information-theoretic determination of minimax rates of convergence," Annals of Statistics, vol. 27, no. 5, pp. 1564–1599, 1999.