Manifold-Aware Information Gain and Lower Bounds for Gaussian-Process Bandits on Riemannian Quotient Spaces
Pith reviewed 2026-05-14 18:07 UTC · model grok-4.3
The pith
A regret lower bound for Gaussian-process bandits on Riemannian manifolds includes an explicit factor of the manifold volume raised to the power ν/(2ν+d).
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For any algorithm and time horizon T exceeding an explicit threshold, the worst-case expected regret over the RKHS-ball ||f||_{H_{k_ν}} ≤ B satisfies E[R_T(f)] ≥ c_*(d,ν) B^{d/(2ν+d)} σ_n^{2ν/(2ν+d)} · vol_g(M)^{ν/(2ν+d)} T^{(ν+d)/(2ν+d)} (log T)^{ν/(2ν+d)}. The volume factor vol_g(M)^{ν/(2ν+d)} is the first explicit geometric constant appearing in a manifold GP-bandit lower bound, and the exponent matches the Vakili–Khezeli–Picheny upper bound.
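As a concrete reading of the exponents, here is a minimal sketch that evaluates the claimed bound; `c_star` is a placeholder, since the explicit form of c_*(d,ν) is stated only in the paper itself.

```python
# Sketch: evaluate the claimed lower bound's scaling.
# c_star is a PLACEHOLDER, not the paper's explicit constant c_*(d, nu).
import math

def regret_lower_bound(T, d, nu, B=1.0, sigma_n=0.1, vol=1.0, c_star=1.0):
    """Claimed worst-case regret lower bound, up to the unspecified constant."""
    a = 2 * nu + d  # common denominator in every exponent
    return (c_star
            * B ** (d / a)
            * sigma_n ** (2 * nu / a)
            * vol ** (nu / a)
            * T ** ((nu + d) / a)
            * math.log(T) ** (nu / a))

# Doubling the volume should inflate the bound by exactly 2**(nu/(2*nu+d)).
d, nu = 3, 2.5
r = regret_lower_bound(1e6, d, nu, vol=2.0) / regret_lower_bound(1e6, d, nu, vol=1.0)
assert abs(r - 2 ** (nu / (2 * nu + d))) < 1e-12
```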
What carries the argument
The Riemannian volume vol_g(M) of the compact manifold M, which enters the lower bound as a multiplicative factor raised to the power ν/(2ν+d) and arises through the geometry-aware information gain of the intrinsic Matérn-ν kernel.
If this is right
- The T-exponent matches the Vakili–Khezeli–Picheny upper bound, implying that the lower bound is tight in the polynomial rate.
- A companion Assouad-style argument produces a lower bound with a strictly smaller T-exponent, (2ν+3d)/(4(ν+d)), at the price of a polylog factor of the form 1/(log log T)^{(2ν+d)/(4(ν+d))} (a numerical exponent check follows this list).
- On a quotient space M = M̃/G, an extrinsic-kernel GP-UCB algorithm obeys an upper bound of |G|^{1/2} times the base regret, with a conjectured gauge-modulated constant.
- The bound admits a curvature correction of the form 1 + O(K ε_T^2) obtained from the Bishop–Gromov comparison theorem.
- The same lower bound transfers directly to the Bayesian regret setting by the Yang–Barron and Castillo et al. transfer arguments.
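The claim that the Assouad exponent is strictly smaller can be checked mechanically: cross-multiplying gives (2ν+3d)(2ν+d) = 4ν² + 8νd + 3d² < 4ν² + 8νd + 4d² = 4(ν+d)² for every d ≥ 1, independent of ν. A minimal numerical confirmation (the two exponents are the only inputs):

```python
# Compare the Fano T-exponent with the Assouad-style exponent quoted above.
def fano_exp(d, nu):
    return (nu + d) / (2 * nu + d)

def assouad_exp(d, nu):
    return (2 * nu + 3 * d) / (4 * (nu + d))

for d in (1, 2, 3, 5):
    for nu in (d / 2 + 0.1, d / 2 + 1.0, 5.0):  # respecting nu > d/2
        assert assouad_exp(d, nu) < fano_exp(d, nu)  # strictly smaller, as claimed
```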
Where Pith is reading between the lines
- If the volume scaling is correct, uniformly enlarging the manifold while keeping local geometry fixed should increase the leading constant by exactly the predicted volume power, giving a direct experimental test (a sketch follows this list).
- The quotient-space upper bound suggests that symmetry reduction can improve regret by a factor of sqrt(|G|), which may apply to optimization problems with known group symmetries.
- The curvature dependence 1 + O(K ε_T^2) implies that the bound remains valid on manifolds with bounded sectional curvature provided the discretization scale ε_T is chosen sufficiently small.
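A minimal sketch of the two tests above. Under uniform metric scaling g → λ²g the volume scales as λ^d, so the predicted constant ratio is λ^{dν/(2ν+d)}. The ε_T scaling used for the curvature term is an assumption read off from the packing construction (ε_T ≍ T^{-1/(2ν+d)} up to logarithms), not a quantity this page defines.

```python
def predicted_constant_ratio(lam, d, nu):
    """Predicted ratio of leading constants when the metric is scaled
    g -> lam**2 * g, so vol_g(M) -> lam**d * vol_g(M) with local geometry fixed."""
    return lam ** (d * nu / (2 * nu + d))

def curvature_correction(K, T, d, nu):
    """Heuristic size of the 1 + O(K * eps_T**2) term, ASSUMING the
    discretization scale eps_T ~ T**(-1/(2*nu+d)) from the packing argument."""
    eps_T = T ** (-1.0 / (2 * nu + d))
    return 1.0 + K * eps_T ** 2

# Example: doubling all lengths on a 2-dimensional arm space with nu = 1.5.
print(predicted_constant_ratio(2.0, d=2, nu=1.5))       # 2**0.6 ≈ 1.516
print(curvature_correction(K=1.0, T=1e6, d=2, nu=1.5))  # ≈ 1.004
```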
Load-bearing premise
The unknown function lies inside the reproducing-kernel Hilbert space ball defined by the intrinsic Matérn-ν kernel whose geometry exactly matches the Riemannian manifold.
What would settle it
An algorithm achieving regret that grows more slowly than T^{(ν+d)/(2ν+d)} (log T)^{ν/(2ν+d)}, or whose leading constant is independent of the manifold volume across a family of compact Riemannian manifolds of increasing volume, would falsify the claim.
Original abstract
We prove a regret lower bound for Gaussian-process bandits on a smooth compact Riemannian manifold $\mathcal{M}$ of dimension $d$ with intrinsic Mat\'ern-$\nu$ kernel ($\nu>d/2$) that exposes how the geometry of the arm space enters the constant. For any algorithm and time horizon $T$ exceeding an explicit threshold, the worst-case expected regret over the RKHS ball $\|f\|_{\mathcal{H}_{k_\nu}}\le B$ satisfies \begin{multline*} \mathbb{E}[R_T(f)]\;\ge\;c_*(d,\nu)\,B^{d/(2\nu+d)}\,\sigma_n^{2\nu/(2\nu+d)} \\ \cdot\,\mathrm{vol}_g(\mathcal{M})^{\nu/(2\nu+d)}\,T^{(\nu+d)/(2\nu+d)}(\log T)^{\nu/(2\nu+d)}. \end{multline*} The exponent matches the Vakili--Khezeli--Picheny upper bound \cite{vakili2021information}; the $\mathrm{vol}_g(\mathcal{M})^{\nu/(2\nu+d)}$ factor is, to our knowledge, the first explicit volume-dependent geometric constant in a manifold GP-bandit lower bound. We extend the analysis in five directions: (i) a companion Assouad-style proof gives a different lower bound with a strictly smaller $T$-exponent $(2\nu+3d)/(4(\nu+d))$ but with a polylog factor of the form $1/(\log\log T)^{(2\nu+d)/(4(\nu+d))}$, sharpening the $(\log T)^{\nu/(2\nu+d)}$ Fano polylog of Theorem 1; (ii) we prove a $|G|^{1/2}$ upper bound on the regret of an extrinsic-kernel GP-UCB algorithm on a quotient space $\mathcal{M}=\widetilde{\mathcal{M}}/G$, plus a bracketing theorem; the precise constant is conjectured to take the modulated form $(1+(|G|-1)h(r_{\mathrm{inj}}/\kappa))^{1/2}$ (Conjecture 1), validated numerically on $\mathrm{SO}(3)$; (iii) we write the leading constant $c_*(d,\nu)$ out fully; (iv) we extract a curvature dependence $1+O(K\varepsilon_T^2)$ via Bishop--Gromov; (v) we transfer the bound to the Bayesian regret framework via the Yang--Barron / Castillo et al. Bayesian-Fano transfer.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper proves a minimax lower bound on the expected regret of any algorithm for Gaussian-process bandits on a compact smooth Riemannian manifold M of dimension d, when the unknown function lies in the unit ball of the RKHS induced by the intrinsic Matérn-ν kernel (ν > d/2). The bound takes the explicit form E[R_T(f)] ≥ c_*(d,ν) B^{d/(2ν+d)} σ_n^{2ν/(2ν+d)} vol_g(M)^{ν/(2ν+d)} T^{(ν+d)/(2ν+d)} (log T)^{ν/(2ν+d)} for T large enough, matching the exponent of the Vakili–Khezeli–Picheny upper bound. Five extensions are developed: an Assouad-style lower bound with a different exponent, a regret upper bound for extrinsic-kernel GP-UCB on quotient spaces M = M̃/G together with a conjectured gauge-modulated constant, an explicit expression for c_*(d,ν), a curvature correction 1+O(K ε_T²) via Bishop–Gromov, and a transfer to Bayesian regret via Yang–Barron/Fano arguments.
Significance. If the central lower bound and the volume prefactor hold, the result supplies the first explicit geometric constant (vol_g(M)) in the literature on manifold GP bandits and confirms that the information-gain exponent is tight. The quotient-space analysis and the numerical check of the gauge conjecture on SO(3) are practically relevant. The explicit c_*(d,ν) and the curvature expansion further strengthen the geometric interpretation. These contributions would be of clear interest to the GP-bandit and manifold-learning communities.
major comments (3)
- [Theorem 1] Theorem 1 (main lower bound): the derivation of the precise vol_g(M)^{ν/(2ν+d)} prefactor is obtained by packing M with geodesic balls and applying Fano’s inequality to the resulting function class. The manuscript must exhibit the explicit entropy-number calculation for the RKHS ball of the intrinsic Matérn kernel (eigenvalues of the Laplace–Beltrami operator) that isolates this volume scaling without residual dependence on the injectivity radius or embedding dimension.
- [Extension (iii)] Section on extension (iii): the leading constant c_*(d,ν) is stated to be written out fully, yet its dependence on the kernel hyperparameters and on the manifold volume must be cross-checked against the information-gain upper bound used in the proof; any hidden constants that absorb manifold-dependent factors would undermine the claimed explicit geometric dependence.
- [Quotient-space section] Quotient-space analysis (Theorem on |G|^{1/2} bound and Conjecture 1): the conjectured modulated constant (1+(|G|−1)h(r_inj/κ))^{1/2} is validated only numerically on SO(3). A rigorous proof or a counter-example on a different quotient (e.g., RP^2) is required before the claim can be considered load-bearing for the quotient-space extension.
minor comments (2)
- [Abstract / Theorem 1] Notation: the symbol ε_T appearing in the curvature correction 1+O(K ε_T²) is not defined in the abstract; its precise scaling with T should be stated explicitly in the main theorem statement.
- [References] Reference list: the Vakili–Khezeli–Picheny upper bound is cited as [vakili2021information]; the full bibliographic entry and the exact theorem number used for the matching exponent should be supplied.
Simulated Author's Rebuttal
We thank the referee for the careful reading and constructive major comments. We address each point below and indicate planned revisions to strengthen the geometric claims.
Point-by-point responses
- Referee: [Theorem 1] Theorem 1 (main lower bound): the derivation of the precise vol_g(M)^{ν/(2ν+d)} prefactor is obtained by packing M with geodesic balls and applying Fano's inequality to the resulting function class. The manuscript must exhibit the explicit entropy-number calculation for the RKHS ball of the intrinsic Matérn kernel (eigenvalues of the Laplace–Beltrami operator) that isolates this volume scaling without residual dependence on the injectivity radius or embedding dimension.
Authors: The vol_g(M) scaling follows directly from Weyl's law for the Laplace–Beltrami eigenvalues, which controls the entropy numbers of the unit ball of the intrinsic Matérn RKHS. The packing argument uses geodesic balls of radius chosen below the injectivity radius, so no residual dependence remains. We will add an explicit appendix deriving the entropy numbers N(ε) ∼ C vol_g(M) ε^{-d/ν} and showing how this produces the precise volume exponent after Fano's inequality. Revision: yes.
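As a reconstruction of the calculation this response promises (a sketch, not a quotation of the manuscript): packing $\mathcal{M}$ with geodesic balls of radius $r$ below the injectivity radius yields $\asymp \mathrm{vol}_g(\mathcal{M})\,r^{-d}$ centers; a bump supported on each ball with RKHS norm at most $B$ has height $h \asymp B r^{\nu}$, so setting the separation scale $\varepsilon \asymp h$ and eliminating $r$ gives
\[
  N(\varepsilon) \;\asymp\; \mathrm{vol}_g(\mathcal{M})\, r^{-d}
  \;\asymp\; \mathrm{vol}_g(\mathcal{M})\,\bigl(B/\varepsilon\bigr)^{d/\nu},
\]
matching the stated $N(\varepsilon) \sim C\,\mathrm{vol}_g(\mathcal{M})\,\varepsilon^{-d/\nu}$; Weyl's law $\lambda_j \asymp (j/\mathrm{vol}_g(\mathcal{M}))^{2/d}$ is what certifies the RKHS-norm bound on the bumps.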
- Referee: [Extension (iii)] Section on extension (iii): the leading constant c_*(d,ν) is stated to be written out fully, yet its dependence on the kernel hyperparameters and on the manifold volume must be cross-checked against the information-gain upper bound used in the proof; any hidden constants that absorb manifold-dependent factors would undermine the claimed explicit geometric dependence.
Authors: We have verified that c_*(d,ν) isolates the volume factor exactly as it appears in the information-gain upper bound of Vakili et al. [1]; no manifold-dependent terms are absorbed into universal constants. The kernel hyperparameters enter only through the Matérn smoothness and length-scale parameters. We will insert a short cross-check paragraph (with the explicit functional form) in the revised extension (iii) section. Revision: yes.
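One exponent-level sanity check is possible from quantities already on this page: for the Euclidean Matérn-ν kernel, Vakili et al. [1] give maximum information gain $\gamma_T = \widetilde{O}(T^{d/(2\nu+d)})$, and plugging this into the generic $B\sqrt{T\gamma_T}$ regret bound recovers the lower bound's $T$-exponent (a sketch; it assumes the intrinsic kernel's information gain obeys the same rate, which is exactly what the requested cross-check must confirm):
\[
  B\sqrt{T\,\gamma_T}
  \;=\;\widetilde{O}\Bigl(T^{\frac{1}{2}\left(1+\frac{d}{2\nu+d}\right)}\Bigr)
  \;=\;\widetilde{O}\Bigl(T^{\frac{\nu+d}{2\nu+d}}\Bigr).
\]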
- Referee: [Quotient-space section] Quotient-space analysis (theorem on the |G|^{1/2} bound and Conjecture 1): the conjectured modulated constant (1+(|G|−1)h(r_inj/κ))^{1/2} is validated only numerically on SO(3). A rigorous proof or a counter-example on a different quotient (e.g., RP^2) is required before the claim can be considered load-bearing for the quotient-space extension.
Authors: The modulated constant is explicitly labeled as Conjecture 1 and is supported only by numerical evidence on SO(3). We agree that a rigorous proof or a counter-example on RP^2 would be valuable; deriving it appears to require additional gauge-theoretic machinery beyond the current scope. We will revise the section to emphasize the conjectural status and add numerical checks on RP^2 where feasible. Revision: partial.
- Open item: rigorous proof of Conjecture 1 (the gauge-modulated constant), or a counter-example on a quotient other than SO(3).
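For readers who want to see what the conjecture would predict, the sketch below evaluates the modulated constant against the proven |G|^{1/2} cap. The modulation function h is not specified anywhere on this page, so the decreasing h(x) = exp(-x) used here is purely a hypothetical stand-in; any decreasing h with values in [0, 1] keeps the constant below the cap.

```python
import math

def modulated_constant(G_order, r_inj, kappa, h=lambda x: math.exp(-x)):
    """Conjectured constant (1 + (|G|-1) * h(r_inj/kappa))**0.5.
    h is HYPOTHETICAL here: the page never defines it."""
    return math.sqrt(1.0 + (G_order - 1) * h(r_inj / kappa))

def proven_cap(G_order):
    """Proven upper-bound factor |G|**(1/2) on the quotient-space regret."""
    return math.sqrt(G_order)

# Example on a quotient by a group of order 4 (e.g., a finite subgroup of SO(3)).
for ratio in (0.1, 1.0, 10.0):
    c = modulated_constant(4, r_inj=ratio, kappa=1.0)
    assert c <= proven_cap(4)  # modulation can only tighten the |G|^(1/2) factor
    print(ratio, round(c, 4))
```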
Circularity Check
No circularity: lower bound derived from information gain and Fano on intrinsic kernel geometry
Full rationale
The claimed lower bound is obtained by applying standard information-gain arguments and Fano's inequality to the RKHS ball defined by the intrinsic Matérn-ν kernel on the Riemannian manifold. The volume prefactor vol_g(M)^{ν/(2ν+d)} follows directly from the packing number of geodesic balls on (M,g), which is controlled by the manifold volume under the given kernel assumption; this is an explicit geometric consequence rather than a fitted or self-defined quantity. The leading constant c_*(d,ν) is written out fully in the paper, the T-exponent is shown to match an independent prior upper bound, and curvature corrections are obtained via Bishop–Gromov comparison. No step reduces to a fitted parameter, a self-citation chain, or an ansatz imported from the authors' prior work. The derivation is self-contained and checked against external benchmarks.
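To make "explicit geometric consequence" concrete, the rate calculation implied by this chain runs as follows (our reconstruction under standard bandit-Fano bookkeeping, in which a query informs only the bump it lands in, diluting the per-round information by $1/M$):
\begin{align*}
  M \asymp \mathrm{vol}_g(\mathcal{M})\, r^{-d}, \quad h \asymp B r^{\nu}, \quad
  \frac{T h^2}{\sigma_n^2 M} \asymp \log M
  \;&\Longrightarrow\;
  r \asymp \Bigl(\frac{\sigma_n^2\,\mathrm{vol}_g(\mathcal{M})\,\log T}{T B^2}\Bigr)^{1/(2\nu+d)},\\
  \mathbb{E}[R_T] \,\gtrsim\, T B r^{\nu}
  \;&\asymp\; B^{d/(2\nu+d)}\,\sigma_n^{2\nu/(2\nu+d)}\,
  \mathrm{vol}_g(\mathcal{M})^{\nu/(2\nu+d)}\,
  T^{(\nu+d)/(2\nu+d)}\,(\log T)^{\nu/(2\nu+d)},
\end{align*}
which is exactly the claimed bound, volume factor included.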
Axiom & Free-Parameter Ledger
free parameters (1)
- c_*(d,ν)
axioms (2)
- domain assumption: the arm space is a smooth compact Riemannian manifold M of dimension d
- domain assumption: the kernel is the intrinsic Matérn-ν kernel with ν > d/2
Reference graph
Works this paper leans on
[1] S. Vakili, K. Khezeli, and V. Picheny, "On information gain and regret bounds in Gaussian process bandits," in Proc. Int. Conf. on Artificial Intelligence and Statistics (AISTATS), 2021.
[2] O. Besbes, Y. Gur, and A. Zeevi, "Stochastic multi-armed-bandit problem with non-stationary rewards," in Advances in Neural Information Processing Systems (NeurIPS), 2014.
[3] S. Salgia, S. Vakili, and Q. Zhao, "A domain-shrinking based Bayesian optimization algorithm with order-optimal regret performance," in Advances in Neural Information Processing Systems (NeurIPS), 2021.
[4] N. Srinivas, A. Krause, S. M. Kakade, and M. Seeger, "Gaussian process optimization in the bandit setting: no regret and experimental design," in Proc. Int. Conf. on Machine Learning (ICML), 2010.
[5] V. Borovitskiy, A. Terenin, P. Mostowsky, and M. P. Deisenroth, "Matérn Gaussian processes on Riemannian manifolds," in Advances in Neural Information Processing Systems (NeurIPS), 2020.
[6] C. Berenfeld, P. Rosa, and J. Rousseau, "Estimating a density near an unknown manifold: a Bayesian nonparametric approach," The Annals of Statistics, vol. 52, no. 5, pp. 2081–2111, Oct. 2024.
[7] Y. Dorn, "Geometry-aware multi-armed bandits for antenna beam selection on spheres, tori, SO(3), and reconfigurable intelligent surfaces," manuscript submitted to IEEE Trans. on Wireless Communications, 2026; companion empirical paper.
[8] J. Scarlett, I. Bogunovic, and V. Cevher, "Lower bounds on regret for noisy Gaussian process bandit optimization," in Proc. Conf. on Learning Theory (COLT), 2017.
[9] S. Iwazaki, "Tighter regret lower bound for Gaussian process bandits with squared exponential kernel in hypersphere," arXiv preprint arXiv:2602.17940, 2026.
[10] X. Cai and J. Scarlett, "On lower bounds for standard and robust Gaussian process bandit optimization," in Proc. Int. Conf. on Machine Learning (ICML), 2021.
[11] I. Castillo, G. Kerkyacharian, and D. Picard, "Thomas Bayes' walk on manifolds," Probability Theory and Related Fields, vol. 158, no. 3–4, pp. 665–710, 2014.
[12] Y. Yang and D. B. Dunson, "Bayesian manifold regression," The Annals of Statistics, vol. 44, no. 2, pp. 876–905, 2016.
[13] H. Wendland, Scattered Data Approximation. Cambridge University Press, 2004.
[14] R. J. Adler and J. E. Taylor, Random Fields and Geometry. Springer, 2007.
[15] A. B. Tsybakov, Introduction to Nonparametric Estimation. Springer, 2008.
[16] P. Petersen, Riemannian Geometry, 3rd ed. Springer, 2016.
[17] R. A. Adams and J. J. F. Fournier, Sobolev Spaces, 2nd ed., ser. Pure and Applied Mathematics, vol. 140. Academic Press, 2003.
[18] T. Aubin, Some Nonlinear Problems in Riemannian Geometry. Springer, 1998.
[19] S. R. Chowdhury and A. Gopalan, "On kernelized multi-armed bandits," in Proc. Int. Conf. on Machine Learning (ICML), 2017.
[20] S. Bubeck, R. Munos, G. Stoltz, and C. Szepesvári, "X-armed bandits," Journal of Machine Learning Research, vol. 12, pp. 1655–1695, 2011.
[21] R. Camilleri, K. Jamieson, and J. Katz-Samuels, "High-dimensional experimental design and kernel bandits," in Proc. Int. Conf. on Machine Learning (ICML), 2021.
[22] Z. Li and J. Scarlett, "Gaussian process bandit optimization with few batches," in Proc. Int. Conf. on Artificial Intelligence and Statistics (AISTATS), 2022.
[23] S. Salgia, S. Vakili, and Q. Zhao, "Random exploration in Bayesian optimization: order-optimal regret and computational efficiency," in Proc. Int. Conf. on Machine Learning (ICML), 2024.
[24] S. Iwazaki and S. Takeno, "Improved regret analysis in Gaussian process bandits: optimality for noiseless reward, RKHS norm, and non-stationary variance," in Proc. Int. Conf. on Machine Learning (ICML), ser. Proceedings of Machine Learning Research. PMLR, Oct. 2025, pp. 26642–26672.
[25] J. A. Tropp, "An introduction to matrix concentration inequalities," Foundations and Trends in Machine Learning, vol. 8, no. 1–2, pp. 1–230, 2015.
[26] C. J. Stone, "Optimal rates of convergence for nonparametric estimators," The Annals of Statistics, vol. 8, no. 6, pp. 1348–1360, 1980.
[27] I. Chavel, Riemannian Geometry: A Modern Introduction, 2nd ed. Cambridge University Press, 2006.
[28] P. Rosa, V. Borovitskiy, A. Terenin, and J. Rousseau, "Posterior contraction rates for Matérn Gaussian processes on Riemannian manifolds," in Advances in Neural Information Processing Systems 36 (NeurIPS), 2023.
[29] H. Esfandiari, A. Karbasi, A. Mehrabian, and V. Mirrokni, "Regret bounds for batched bandits," in Proc. AAAI Conference on Artificial Intelligence, vol. 35, 2021, pp. 7340–7348.
[30] P. Auer, P. Gajane, and R. Ortner, "Adaptively tracking the best bandit arm with an unknown number of distribution changes," in Conference on Learning Theory (COLT), 2019, pp. 138–158.
[31] A. W. van der Vaart and J. H. van Zanten, "Information rates of nonparametric Gaussian process methods," Journal of Machine Learning Research, vol. 12, pp. 2095–2119, 2011.
[32] A. W. van der Vaart and J. H. van Zanten, "Reproducing kernel Hilbert spaces of Gaussian priors," in Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh, ser. IMS Collections, vol. 3, B. Clarke and S. Ghosal, Eds. Institute of Mathematical Statistics, 2008, pp. 200–222.
[33] Y. Yang and A. Barron, "Information-theoretic determination of minimax rates of convergence," Annals of Statistics, vol. 27, no. 5, pp. 1564–1599, 1999.