Optimal Stochastic Krylov based Techniques for Large- Scale Log-Determinant Estimation
Pith reviewed 2026-06-27 21:31 UTC · model grok-4.3
The pith
OSA-IOP and OSLQ estimate log-determinants of large sparse positive definite matrices accurately at lower computational cost by extending Krylov techniques with stochastic sampling.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The OSA-IOP approach extends the Incomplete Orthogonalization Procedure to compute the action of the matrix logarithm on a vector and combines it with the randomized Hutch++ algorithm to significantly reduce computational cost while maintaining high accuracy. The OSLQ method estimates log-determinants by coupling Lanczos quadrature with Hutch++ and controlled orthogonalization, leveraging Krylov subspaces as efficient quadrature mechanisms to approximate quadratic forms involving the matrix logarithm. Error bounds are derived for both methods.
What carries the argument
The combination of the Incomplete Orthogonalization Procedure or Lanczos quadrature with the randomized Hutch++ algorithm to approximate the matrix logarithm action on vectors for stochastic trace estimation.
If this is right
- Error bounds are derived that quantify the approximation accuracy for both OSA-IOP and OSLQ.
- Numerical experiments on large-scale sparse matrices from real-world applications show maintained accuracy, robustness, and scalability.
- Both methods achieve significant reduction in computational cost relative to standard approaches for log-determinant estimation.
- The techniques apply directly to symmetric positive definite matrices arising in Gaussian processes and uncertainty quantification.
Where Pith is reading between the lines
- The reduced cost could enable log-determinant computations inside iterative solvers for larger Gaussian process models than previously feasible.
- The same Krylov-plus-stochastic-sampling pattern may extend to estimation of other matrix functions such as traces of matrix powers.
- Controlled orthogonalization offers a parameter that could be tuned for accuracy-speed trade-offs in related Krylov applications beyond log-determinants.
Load-bearing premise
That combining the Incomplete Orthogonalization Procedure with Hutch++ sampling (and similarly Lanczos quadrature with controlled orthogonalization) preserves high accuracy for the matrix logarithm on large sparse positive definite matrices without introducing unacceptable approximation errors.
What would settle it
A concrete test on one of the paper's real-world sparse matrices where the computed log-determinant deviates from a high-accuracy reference value by more than the stated error bound for either method.
Figures
read the original abstract
Estimating the logarithm of the determinant of large sparse positive definite symmetric matrices is an important task in numerical linear algebra, machine learning, Gaussian processes, and uncertainty quantification. In this work, we introduce two scalable and efficient methods for large-scale log-determinant termed the Optimal Stochastic Arnoldi with Incomplete Orthogonalization Procedure (OSA-IOP) and the Optimal Stochastic Lanczos Quadrature (OSLQ). The OSA-IOP approach extends the Incomplete Orthogonalization Procedure (IOP), originally developed for matrix exponential functions for exponential time stepping integrators, to compute the action of the matrix algorithm on a vector. We observe that combining IOP with a randomized Hutch++ algorithm, the OSA-IOP significantly reduces computational cost while maintaining high accuracy. The OSLQ method estimates log-determinants by coupling Lanczos quadrature with Hutch++ and controlled orthogonalization, leveraging Krylov subspaces as efficient quadrature mechanisms to approximate quadratic forms involving the matrix logarithm. We derive error bounds for both methods. Extensive numerical experiments on large-scale sparse matrices from real-world applications demonstrate the accuracy, robustness, and scalability of the proposed approaches.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces OSA-IOP (IOP combined with Hutch++) and OSLQ (Lanczos quadrature with Hutch++ and controlled orthogonalization) for estimating log-determinants of large sparse symmetric positive definite matrices. It extends prior Krylov techniques originally developed for the matrix exponential, derives error bounds for both new methods, and reports numerical experiments on real-world sparse matrices from applications to demonstrate accuracy, robustness, and reduced computational cost relative to existing approaches.
Significance. If the derived error bounds hold and control the approximation quality for the logarithm (including on matrices with eigenvalues near zero), the work would provide practically useful scalable estimators for a core task in Gaussian processes, uncertainty quantification, and machine learning. The explicit combination of deterministic Krylov truncation with variance-reduced stochastic estimation, together with the provision of error bounds, is a constructive contribution that could be adopted in large-scale settings.
major comments (2)
- [Error analysis for OSA-IOP] The central error analysis (presumably the section deriving bounds for OSA-IOP) must explicitly address whether the IOP truncation error, originally analyzed for the entire function exp(tA), remains controlled when the target is log(A) with its branch point at zero. The interaction between the IOP tolerance and the Hutch++ variance reduction term is load-bearing for the claimed bounds; without a specific estimate of this interaction, the guarantee for near-singular or ill-conditioned matrices is not yet established.
- [Numerical experiments] Numerical experiments section: the reported test matrices should include at least one family with smallest eigenvalue ≤ 10^{-8} (or explicitly state the spectral gap) to verify that the combined OSA-IOP and OSLQ errors remain within the derived bounds; the current description of “real-world sparse matrices” does not confirm coverage of the regime where the skeptic concern is most acute.
minor comments (2)
- [Abstract and method definitions] Notation for the controlled orthogonalization parameter in OSLQ should be introduced once and used consistently; the abstract refers to “controlled orthogonalization” without a symbol that later sections can reference.
- [Complexity discussion] The manuscript would benefit from a short table comparing the leading-term flop counts of OSA-IOP, OSLQ, plain Hutch++, and Lanczos quadrature on an n imes n matrix with k-step Krylov subspaces.
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments on our manuscript. We address each major comment below and will make the indicated revisions to strengthen the presentation.
read point-by-point responses
-
Referee: [Error analysis for OSA-IOP] The central error analysis (presumably the section deriving bounds for OSA-IOP) must explicitly address whether the IOP truncation error, originally analyzed for the entire function exp(tA), remains controlled when the target is log(A) with its branch point at zero. The interaction between the IOP tolerance and the Hutch++ variance reduction term is load-bearing for the claimed bounds; without a specific estimate of this interaction, the guarantee for near-singular or ill-conditioned matrices is not yet established.
Authors: We appreciate this observation. The error bounds for OSA-IOP in the manuscript are derived by adapting the IOP truncation analysis specifically to the logarithm, using the spectral properties of positive definite matrices and bounding the remainder via the Krylov subspace approximation for f(A) = log(A). The branch point is controlled by the assumption that the matrices are positive definite with eigenvalues bounded away from zero. To make the interaction with Hutch++ explicit, we will add a dedicated paragraph or remark in the error analysis section providing a combined error estimate that quantifies how the IOP tolerance affects the variance-reduced stochastic term, including a note on the ill-conditioned regime. This clarification will be included in the revised manuscript. revision: yes
-
Referee: [Numerical experiments] Numerical experiments section: the reported test matrices should include at least one family with smallest eigenvalue ≤ 10^{-8} (or explicitly state the spectral gap) to verify that the combined OSA-IOP and OSLQ errors remain within the derived bounds; the current description of “real-world sparse matrices” does not confirm coverage of the regime where the skeptic concern is most acute.
Authors: We agree that explicit coverage of the near-singular regime is valuable for validating the bounds. Our current experiments use real-world sparse matrices from applications, some of which have high condition numbers, but we will revise the numerical section to report the smallest eigenvalues (or spectral gaps) for all tested matrices and add or emphasize at least one matrix family with λ_min ≤ 10^{-8}. We will also include a short verification that the observed approximation errors remain consistent with the derived theoretical bounds in this regime. revision: yes
Circularity Check
No circularity: extensions and bounds are independent of inputs
full rationale
The paper extends IOP (originally for exp) and Lanczos quadrature to log-det via combination with Hutch++, then derives separate error bounds and validates on external sparse matrices. No equation reduces a claimed prediction or bound to a fitted parameter or self-citation by construction; the central claims rest on the new combinations and numerical evidence rather than tautological renaming or load-bearing self-reference.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Target matrices are large sparse positive definite symmetric
Reference graph
Works this paper leans on
-
[1]
Iterative numerical methods for sampling from high dimensional gaussian distributions.Statistics and Computing, 23:501–521, 2013
Erlend Aune, Jo Eidsvik, and Yngve Pokern. Iterative numerical methods for sampling from high dimensional gaussian distributions.Statistics and Computing, 23:501–521, 2013
2013
-
[2]
Parameter estimation in high dimensional gaussian distributions.Statistics and Computing, 24:247–263, 2014
Erlend Aune, Daniel P Simpson, and Jo Eidsvik. Parameter estimation in high dimensional gaussian distributions.Statistics and Computing, 24:247–263, 2014
2014
-
[3]
Some large-scale matrix computation problems
Zhaojun Bai, Gark Fahey, and Gene Golub. Some large-scale matrix computation problems. Journal of Computational and Applied Mathematics, 74(1-2):71–89, 1996
1996
-
[4]
Krylov-aware stochastic trace estimation.SIAM Journal on Matrix Analysis and Applications, 44(3):1218–1244, 2023
Tyler Chen and Eric Hallman. Krylov-aware stochastic trace estimation.SIAM Journal on Matrix Analysis and Applications, 44(3):1218–1244, 2023
2023
-
[5]
On randomized trace estimates for indefinite matrices with an application to determinants.Foundations of Computational Mathematics, pages 1–29, 2022
Alice Cortinovis and Daniel Kressner. On randomized trace estimates for indefinite matrices with an application to determinants.Foundations of Computational Mathematics, pages 1–29, 2022
2022
-
[6]
The university of florida sparse matrix collection.ACM Transactions on Mathematical Software (TOMS), 38(1):1–25, 2011
Timothy A Davis and Yifan Hu. The university of florida sparse matrix collection.ACM Transactions on Mathematical Software (TOMS), 38(1):1–25, 2011
2011
-
[7]
Scalable log determinants for gaussian process kernel learning
Ke Dong, David Eriksson, Hannes Nickisch, David Bindel, and Andrew Gordon Wilson. Scalable log determinants for gaussian process kernel learning. InAdvances in Neural Information Processing Systems, volume 30, 2017
2017
-
[8]
Kiops: A fast adaptive krylov subspace solver for exponential integrators.Journal of Computational Physics, 372:236–255, 2018
Sophie Gaudreault, Mayya Tokman, and Gregory Rainwater. Kiops: A fast adaptive krylov subspace solver for exponential integrators.Journal of Computational Physics, 372:236–255, 2018
2018
-
[9]
Golub and G´erard Meurant.Matrices, moments and quadrature with applications
Gene H. Golub and G´erard Meurant.Matrices, moments and quadrature with applications. Springer, Berlin, 1994
1994
-
[10]
Approximating the spectral sums of large-scale matrices using stochastic chebyshev approximations.SIAM Journal on Scientific Computing, 39(4):A1558–A1585, 2017
Insu Han, Dmitry Malioutov, Haim Avron, and Jinwoo Shin. Approximating the spectral sums of large-scale matrices using stochastic chebyshev approximations.SIAM Journal on Scientific Computing, 39(4):A1558–A1585, 2017
2017
-
[11]
Large-scale log-determinant computation through stochastic chebyshev expansions
Insu Han, Dmitry Malioutov, and Jinwoo Shin. Large-scale log-determinant computation through stochastic chebyshev expansions. InInternational Conference on Machine Learning, pages 908–917. PMLR, 2015. 25
2015
-
[12]
Zongyuan Han, Wenhao Li, Yixuan Huang, and Shengxin Zhu. Suboptimal subspace con- struction for log-determinant approximation.arXiv preprint arXiv:2307.02152, 2023
-
[13]
Deep gaussian process priors for bayesian image reconstruction.Inverse Problems, 41(6):065016, 2025
Jonas Latz, Aretha L Teckentrup, and Simon Urbainczyk. Deep gaussian process priors for bayesian image reconstruction.Inverse Problems, 41(6):065016, 2025
2025
-
[14]
David J. C. MacKay.Information Theory, Inference, and Learning Algorithms. Cambridge University Press, Cambridge, UK, 2003
2003
-
[15]
Verlon Roel Mbingui, Antoine Tambue, and Issa Karambal. Novel technique based on l\’eja points approximation for log-determinant estimation of large matrices.arXiv preprint arXiv:2603.02207, 2026
-
[16]
SIAM, 2006
G´erard Meurant.The Lanczos and conjugate gradient algorithms: from theory to finite precision computations. SIAM, 2006
2006
-
[17]
Hutch++: Optimal stochastic trace estimation
Raphael A Meyer, Cameron Musco, Christopher Musco, and David P Woodruff. Hutch++: Optimal stochastic trace estimation. InSymposium on Simplicity in Algorithms (SOSA), pages 142–155. SIAM, 2021
2021
-
[18]
Carl Edward Rasmussen and Christopher K. I. Williams.Gaussian Processes for Machine Learning. MIT Press, Cambridge, MA, 2005
2005
-
[19]
Chapman and Hall/CRC, New York, 1st edition, 2005
H˚avard Rue and Leonhard Held.Gaussian Markov Random Fields: Theory and Applications. Chapman and Hall/CRC, New York, 1st edition, 2005
2005
-
[20]
Analysis of some krylov subspace approximations to the matrix exponential operator.SIAM Journal on Numerical Analysis, 29(1):209–228, 1992
Yousef Saad. Analysis of some krylov subspace approximations to the matrix exponential operator.SIAM Journal on Numerical Analysis, 29(1):209–228, 1992
1992
-
[21]
SIAM, 2003
Yousef Saad.Iterative methods for sparse linear systems. SIAM, 2003
2003
-
[22]
Springer Science & Business Media, 2013
James William Thomas.Numerical partial differential equations: finite difference methods, volume 22. Springer Science & Business Media, 2013
2013
-
[23]
Fast estimation of tr(f(a)) via stochastic lanczos quadrature.SIAM Journal on Matrix Analysis and Applications, 38(4):1075–1099, 2017
Shashanka Ubaru, Jie Chen, and Yousef Saad. Fast estimation of tr(f(a)) via stochastic lanczos quadrature.SIAM Journal on Matrix Analysis and Applications, 38(4):1075–1099, 2017
2017
-
[24]
Huy D V o and Roger B Sidje. Approximating the large sparse matrix exponential using incomplete orthogonalization and krylov subspaces of variable dimension.Numerical linear algebra with applications, 24(3):e2090, 2017
2017
-
[25]
Log-determinant relaxation for approximate infer- ence in discrete markov random fields.IEEE transactions on signal processing, 54(6):2099– 2109, 2006
Martin J Wainwright and Michael I Jordan. Log-determinant relaxation for approximate infer- ence in discrete markov random fields.IEEE transactions on signal processing, 54(6):2099– 2109, 2006
2099
-
[26]
Kingsley Yeon, Promit Ghosal, and Mihai Anitescu. Bolt: Block-orthonormal lanczos for trace estimation of matrix functions.arXiv preprint arXiv:2505.12289, 2025. 26
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.