pith. sign in

arxiv: 2411.05869 · v3 · submitted 2024-11-07 · 📊 stat.ML · cs.LG· stat.AP· stat.CO· stat.ME

Compactly-supported nonstationary kernels for computing exact Gaussian processes on big data

Pith reviewed 2026-05-23 17:53 UTC · model grok-4.3

classification 📊 stat.ML cs.LGstat.APstat.COstat.ME
keywords Gaussian processesnonstationary kernelscompactly supported kernelsexact inferencebig datasparsityspace-time predictionBayesian modeling
0
0 comments X

The pith

A derived kernel encodes both sparsity and nonstationarity to permit exact Gaussian process inference on data sets larger than one million points.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper derives an alternative kernel that simultaneously induces sparsity through compact support and captures nonstationary behavior. This kernel is placed inside a fully Bayesian Gaussian process model and paired with high-performance computing to perform exact inference on massive data without approximating the likelihood. The method is shown to work on synthetic examples and on more than one million daily maximum temperature observations, where it produces better predictions than existing exact and approximate approaches. A sympathetic reader would care because standard Gaussian processes have been limited to roughly ten thousand points by the combination of stationary kernels and the cost of exact matrix operations.

Core claim

We explicitly derive an alternative kernel that can discover and encode both sparsity and nonstationarity. We embed the kernel within a fully Bayesian GP model and leverage high-performance computing resources to enable the analysis of massive data sets. We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods across a variety of synthetic data examples. Furthermore, we conduct space-time prediction based on more than one million measurements of daily maximum temperature and verify that our results outperform state-of-the-art methods in the Earth sciences.

What carries the argument

The compactly-supported nonstationary kernel that remains positive definite while producing a sparse covariance matrix suitable for exact Cholesky factorization.

If this is right

  • Exact Cholesky or similar factorizations become feasible at million-point scale without any approximation to the likelihood.
  • The kernel produces better predictions than both exact and approximate GP baselines on synthetic nonstationary data.
  • Space-time predictions on more than one million temperature measurements outperform current state-of-the-art Earth science methods.
  • Gaussian processes can be applied directly to big data problems while retaining exact inference and full uncertainty quantification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same kernel construction might extend to other covariance families or to problems outside space-time settings.
  • Domains that already use sensor networks or climate archives could adopt the approach to obtain exact posterior uncertainty at scales previously requiring approximations.
  • Further work could test whether the sparsity pattern remains favorable when the number of observations grows by another order of magnitude.

Load-bearing premise

The derived kernel must stay positive definite for all parameter values and must create enough zeros in the covariance matrix for exact factorizations to succeed at million-point scale.

What would settle it

A test on one million points in which the kernel matrix ceases to be positive definite for admissible parameter values or in which the exact factorization cannot complete because the induced sparsity is insufficient.

read the original abstract

The Gaussian process (GP) is a widely used probabilistic machine learning method with implicit uncertainty characterization for stochastic function approximation, stochastic modeling, and analyzing real-world measurements of nonlinear processes. Traditional implementations of GPs involve stationary kernels (also termed covariance functions) that limit their flexibility, and exact methods for inference that prevent application to data sets with more than about ten thousand points. Modern approaches to address stationarity assumptions generally fail to accommodate large data sets, while all attempts to address scalability focus on approximating the Gaussian likelihood, which can involve subjectivity and lead to inaccuracies. In this work, we explicitly derive an alternative kernel that can discover and encode both sparsity and nonstationarity. We embed the kernel within a fully Bayesian GP model and leverage high-performance computing resources to enable the analysis of massive data sets. We demonstrate the favorable performance of our novel kernel relative to existing exact and approximate GP methods across a variety of synthetic data examples. Furthermore, we conduct space-time prediction based on more than one million measurements of daily maximum temperature and verify that our results outperform state-of-the-art methods in the Earth sciences. More broadly, having access to exact GPs that use ultra-scalable, sparsity-discovering, nonstationary kernels allows GP methods to truly compete with a wide variety of machine learning methods.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to derive a compactly-supported nonstationary kernel that simultaneously encodes sparsity and nonstationarity, embed it in a fully Bayesian GP model, and thereby enable exact (non-approximate) GP inference and prediction on datasets exceeding one million points, with favorable performance shown on synthetic examples and a space-time temperature dataset of >1M observations.

Significance. If the kernel is provably positive definite for the claimed parameter ranges and the induced sparsity pattern permits exact Cholesky factorization at million-point scale without approximation, the result would meaningfully extend the reach of exact GPs to big-data regimes where only approximate methods have been feasible, with direct relevance to spatial statistics and Earth-science applications.

major comments (2)
  1. [§3] §3 (kernel derivation): the manuscript does not supply a general proof that the nonstationary modification of the compactly-supported base kernel remains positive definite for arbitrary (nonstationary) length-scale or amplitude functions; verification appears limited to restricted parameter grids or empirical checks, which is insufficient to guarantee the exact-inference claim across all regimes asserted in the abstract and §4.
  2. [§4.3] §4.3 (scaling experiments): the reported wall-clock times and memory usage for the 1M-point temperature dataset presuppose that the sparsity pattern remains sufficiently sparse for exact factorization under the fitted nonstationary parameters, yet no explicit bound on the number of non-zero entries (or fill-in during Cholesky) as a function of the nonstationarity parameters is provided; this leaves open whether the method scales for data configurations outside the demonstrated examples.
minor comments (2)
  1. Notation for the nonstationary parameters (e.g., the functions modulating length scale) is introduced without a consolidated table of symbols, making cross-references between the derivation and the experiments harder to follow.
  2. Figure 2 caption does not state the exact number of inducing points or the precise sparsity threshold used for the baseline methods, complicating direct comparison.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive report and the opportunity to clarify the positive-definiteness and scaling aspects of the work. We address each major comment below, indicating planned revisions where the manuscript can be strengthened without misrepresenting its current content.

read point-by-point responses
  1. Referee: [§3] §3 (kernel derivation): the manuscript does not supply a general proof that the nonstationary modification of the compactly-supported base kernel remains positive definite for arbitrary (nonstationary) length-scale or amplitude functions; verification appears limited to restricted parameter grids or empirical checks, which is insufficient to guarantee the exact-inference claim across all regimes asserted in the abstract and §4.

    Authors: We agree that the manuscript does not contain a general theorem establishing positive definiteness for completely arbitrary positive continuous length-scale and amplitude functions. Section 3 constructs the kernel by pointwise multiplication of a compactly-supported positive-definite base kernel with positive scalar functions; this construction preserves positive definiteness whenever the base kernel is positive definite and the modulating functions are positive and continuous, but the paper presents this only as a derivation rather than a formal proof covering all regimes. Empirical checks and restricted grids are indeed the primary verification supplied. We will revise §3 to state the precise conditions under which positive definiteness is guaranteed by the construction and to expand the numerical verification to a broader grid of length-scale and amplitude functions. This revision will better support the exact-inference claims. revision: yes

  2. Referee: [§4.3] §4.3 (scaling experiments): the reported wall-clock times and memory usage for the 1M-point temperature dataset presuppose that the sparsity pattern remains sufficiently sparse for exact factorization under the fitted nonstationary parameters, yet no explicit bound on the number of non-zero entries (or fill-in during Cholesky) as a function of the nonstationarity parameters is provided; this leaves open whether the method scales for data configurations outside the demonstrated examples.

    Authors: We concur that an explicit, parameter-dependent bound on non-zero entries or Cholesky fill-in is absent. Sparsity is controlled by the local length-scale function through the compact support radius of the base kernel; in the temperature example the fitted length-scales produce a sparsity pattern permitting exact factorization at the reported scale. The manuscript does not derive a general bound that would guarantee the same behavior for arbitrary nonstationary functions or data geometries. In revision we will add a paragraph in §4.3 describing how the support radius is determined by the learned length-scale field, report the observed number of non-zeros for the fitted model, and include a brief sensitivity plot showing sparsity as a function of length-scale magnitude. This supplies concrete information for the demonstrated regime while acknowledging that a universal bound remains future work. revision: partial

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper presents an explicit derivation of a compactly-supported nonstationary kernel, which is then embedded in a fully Bayesian GP model for exact inference on large datasets. No load-bearing step reduces by construction to a fitted quantity from the same data, a self-citation chain, or a renaming of an input; the kernel form and its claimed positive-definiteness/sparsity properties are asserted via the derivation itself rather than via parameter fitting or prior self-referential results. Empirical demonstrations on synthetic examples and >1M-point temperature data serve as independent verification rather than tautological confirmation. The central claim therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review prevents identification of specific free parameters, axioms, or invented entities; the kernel itself is the central new object but its construction details are unavailable.

pith-pipeline@v0.9.0 · 5778 in / 1000 out tokens · 17512 ms · 2026-05-23T17:53:24.630637+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

70 extracted references · 70 canonical work pages · 1 internal anchor

  1. [1]

    John Wiley & Sons, New York (1991) 24

    Cressie, N.A.C.: Statistics for Spatial Data. John Wiley & Sons, New York (1991) 24

  2. [2]

    Probability and Statistics

    Cressie, N., Wikle, C.K.: Statistics for Spatio-Temporal Data. Probability and Statistics. John Wiley & Sons, New York (2011)

  3. [3]

    Deisenroth, M.P.: Efficient Reinforcement Learning Using Gaussian Processes vol. 9. KIT Scientific Publishing, Print on Demand (2010)

  4. [4]

    In: International Conference on Machine Learning, pp

    Vinogradska, J., Bischoff, B., Nguyen-Tuong, D., Romer, A., Schmidt, H., Peters, J.: Stability of con- trollers for gaussian process forward models. In: International Conference on Machine Learning, pp. 545–554 (2016). PMLR

  5. [5]

    Journal of Computational and Graphical Statistics33(3), 855–868 (2024)

    Luo, H., Cho, Y., Demmel, J.W., Li, X.S., Liu, Y.: Hybrid parameter search and dynamic model selection for mixed-variable bayesian optimization. Journal of Computational and Graphical Statistics33(3), 855–868 (2024)

  6. [6]

    In: International Conference on Artificial Intelligence and Statistics, pp

    Tuo, R., Wang, W.: Uncertainty quantification for bayesian optimization. In: International Conference on Artificial Intelligence and Statistics, pp. 2862–2884 (2022). PMLR

  7. [7]

    Electronic Journal of Statistics15(2), 5014–5066 (2021)

    Wang, W.: On the inference of applying gaussian process modeling to a deterministic function. Electronic Journal of Statistics15(2), 5014–5066 (2021)

  8. [8]

    Scientific reports9(1), 11809 (2019)

    Noack, M.M., Yager, K.G., Fukuto, M., Doerk, G.S., Li, R., Sethian, J.A.: A kriging-based approach to autonomous experimentation with applications to x-ray scattering. Scientific reports9(1), 11809 (2019)

  9. [9]

    Matter4(9), 2702–2726 (2021)

    Stach, E., DeCost, B., Kusne, A.G., Hattrick-Simpers, J., Brown, K.A., Reyes, K.G., Schrier, J., Billinge, S., Buonassisi, T., Foster, I.,et al.: Autonomous experimentation systems for materials development: A community perspective. Matter4(9), 2702–2726 (2021)

  10. [10]

    Nature Reviews Physics3(10), 685–697 (2021)

    Noack, M.M., Zwart, P.H., Ushizima, D.M., Fukuto, M., Yager, K.G., Elbert, K.C., Murray, C.B., Stein, A., Doerk, G.S., Tsai, E.H.,et al.: Gaussian processes for autonomous data acquisition at large-scale synchrotron and neutron facilities. Nature Reviews Physics3(10), 685–697 (2021)

  11. [11]

    npj Computational Materials8(1), 99 (2022)

    Thomas, J.C., Rossi, A., Smalley, D., Francaviglia, L., Yu, Z., Zhang, T., Kumari, S., Robinson, J.A., Terrones, M., Ishigami, M.,et al.: Autonomous scanning probe microscopy investigations over ws2 and au{111}. npj Computational Materials8(1), 99 (2022)

  12. [12]

    Processes8(1), 24 (2020)

    Pilario, K.E., Shafiee, M., Cao, Y., Lao, L., Yang, S.-H.: A review of kernel methods for feature extraction in nonlinear process monitoring. Processes8(1), 24 (2020)

  13. [13]

    Mixed-Stationary Gaussian Process for Flexible Non-Stationary Modeling of Spatial Outcomes

    Duan, L.L., Wang, X., Szczesniak, R.D.: Mixed-stationary gaussian process for flexible non-stationary modeling of spatial outcomes. arXiv preprint arXiv:1807.06656 (2018)

  14. [14]

    Sampson, P.D., Guttorp, P.: Nonparametric estimation of nonstationary spatial covariance structure 87(417), 108–119 (1992)

  15. [15]

    Environmental and Ecological Statistics5(2), 173–190 (1998) 25

    Higdon, D.: A process-convolution approach to modelling temperatures in the North Atlantic Ocean. Environmental and Ecological Statistics5(2), 173–190 (1998) 25

  16. [16]

    Environ- metrics12(5), 469–483 (2001)

    Fuentes, M.: A high frequency kriging approach for non-stationary environmental processes. Environ- metrics12(5), 469–483 (2001)

  17. [17]

    Environmetrics17, 483–506 (2006)

    Paciorek, C.J., Schervish, M.J.: Spatial modeling using a new class of nonstationary covariance functions. Environmetrics17, 483–506 (2006)

  18. [18]

    In: Artificial Intelligence and Statistics, pp

    Damianou, A., Lawrence, N.D.: Deep gaussian processes. In: Artificial Intelligence and Statistics, pp. 207–215 (2013). PMLR

  19. [19]

    In: Artificial Intelligence and Statistics, pp

    Wilson, A.G., Hu, Z., Salakhutdinov, R., Xing, E.P.: Deep kernel learning. In: Artificial Intelligence and Statistics, pp. 370–378 (2016). PMLR

  20. [20]

    Geographical Analysis48(1), 82–111 (2016)

    Dearmon, J., Smith, T.E.: Gaussian Process Regression and Bayesian Model Averaging: An Alternative Approach to Modeling Spatial Phenomena. Geographical Analysis48(1), 82–111 (2016)

  21. [21]

    Machine Learning with Applications13, 100487 (2023)

    Manzhos, S., Ihara, M.: Rectangularization of gaussian process regression for optimization of hyperpa- rameters. Machine Learning with Applications13, 100487 (2023)

  22. [22]

    Williams, C.K., Rasmussen, C.E.: Gaussian Processes for Machine Learning vol. 2. MIT press Cambridge, Boston, MA (2006)

  23. [23]

    arXiv:2306.00361, 1–46 (2022)

    Luo, H., Pratola, M.T.: Sharded Bayesian Additive Regression Trees. arXiv:2306.00361, 1–46 (2022)

  24. [24]

    SIAM/ASA Journal on Uncertainty Quantification12(4), 1192–1212 (2024)

    Luo, H., Strait, J.D.: Multiple closed curve modeling with uncertainty quantification for shape analysis. SIAM/ASA Journal on Uncertainty Quantification12(4), 1192–1212 (2024)

  25. [25]

    In: International Conference on Machine Learning, pp

    Cohen, S., Mbuvha, R., Marwala, T., Deisenroth, M.: Healing products of gaussian process experts. In: International Conference on Machine Learning, pp. 2068–2077 (2020). PMLR

  26. [26]

    Journal of the Royal Statistical Society: Series B (Statistical Methodology)70(4), 825–848 (2008)

    Banerjee, S., Gelfand, A.E., Finley, A.O., Sang, H.: Gaussian predictive process models for large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology)70(4), 825–848 (2008)

  27. [27]

    Journal of the Royal Statistical Society: Series B (Statistical Methodology)70(1), 209–226 (2008)

    Cressie, N., Johannesson, G.: Fixed rank kriging for very large spatial data sets. Journal of the Royal Statistical Society: Series B (Statistical Methodology)70(1), 209–226 (2008)

  28. [28]

    Journal of Machine Learning Research23(61), 1–34 (2022)

    Luo, H., Nattino, G., Pratola, M.T.: Sparse Additive Gaussian Process Regression. Journal of Machine Learning Research23(61), 1–34 (2022)

  29. [29]

    Journal of Computational and Graphical Statistics15(3), 502–523 (2006) https://doi.org/10.1198/ 106186006x132178

    Furrer, R., Genton, M.G., Nychka, D.: Covariance tapering for interpolation of large spatial datasets. Journal of Computational and Graphical Statistics15(3), 502–523 (2006) https://doi.org/10.1198/ 106186006x132178

  30. [30]

    Journal of the American Statistical Association103(484), 1545–1555 (2008) https://doi.org/10.1198/016214508000000959

    Kaufman, C.G., Schervish, M.J., Nychka, D.W.: Covariance tapering for likelihood-based estimation in large spatial data sets. Journal of the American Statistical Association103(484), 1545–1555 (2008) https://doi.org/10.1198/016214508000000959

  31. [31]

    In: 26 International Conference on Machine Learning, pp

    Wilson, A., Nickisch, H.: Kernel interpolation for scalable structured gaussian processes (kiss-gp). In: 26 International Conference on Machine Learning, pp. 1775–1784 (2015). PMLR

  32. [32]

    Journal of the Royal Statistical Society

    Vecchia, A.V.: Estimation and model identification for continuous spatial processes. Journal of the Royal Statistical Society. Series B (Methodological)50(2), 297–312 (1988)

  33. [33]

    Statistical Science36(1) (2021)

    Katzfuss, M., Guinness, J.: A General Framework for Vecchia Approximations of Gaussian Processes. Statistical Science36(1) (2021)

  34. [34]

    arXiv preprint arXiv:2410.10649 (2024)

    Szabo, B., Zhu, Y.: Vecchia gaussian processes: Probabilistic properties, minimax rates and method- ological developments. arXiv preprint arXiv:2410.10649 (2024)

  35. [35]

    Journal of Agricultural, Biological and Environmental Statistics24, 398–425 (2019)

    Heaton, M.J., Datta, A., Finley, A.O., Furrer, R., Guinness, J., Guhaniyogi, R., Gerber, F., Gramacy, R.B., Hammerling, D., Katzfuss, M.,et al.: A case study competition among methods for analyzing large spatial data. Journal of Agricultural, Biological and Environmental Statistics24, 398–425 (2019)

  36. [36]

    Advances in Computational Mathematics4(1), 389–396 (1995)

    Wendland, H.: Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in Computational Mathematics4(1), 389–396 (1995)

  37. [37]

    Mathematics of Computation 70(233), 307–318 (2001)

    Buhmann, M.: A new class of radial basis functions with compact support. Mathematics of Computation 70(233), 307–318 (2001)

  38. [38]

    Journal of Machine Learning Research2(Dec), 299–312 (2001)

    Genton, M.G.: Classes of kernels for machine learning: a statistics perspective. Journal of Machine Learning Research2(Dec), 299–312 (2001)

  39. [39]

    Journal of Multivariate Analysis83(2), 493– 508 (2002)

    Gneiting, T.: Compactly supported correlation functions. Journal of Multivariate Analysis83(2), 493– 508 (2002)

  40. [40]

    In: Proceedings of the 21st International Joint Conference on Artificial Intelligence

    Melkumyan, A., Ramos, F.T.: A sparse covariance function for exact gaussian process inference in large datasets. In: Proceedings of the 21st International Joint Conference on Artificial Intelligence. IJCAI’09, pp. 1936–1942 (2009)

  41. [41]

    Cambridge university press, Cambridge, UK (2012)

    Horn, R.A., Johnson, C.R.: Matrix Analysis. Cambridge university press, Cambridge, UK (2012)

  42. [42]

    Journal of Computational and Applied Mathematics325, 97–112 (2017)

    Noack, M.M., Funke, S.W.: Hybrid genetic deflated newton method for global optimisation. Journal of Computational and Applied Mathematics325, 97–112 (2017)

  43. [43]

    Journal of the Atmospheric Sciences42(23), 2652–2662 (1985) https://doi.org/10.1175/1520-0469(1985)042⟨2652:ENALN⟩2.0.CO;2

    Philander, S.G.H.: El Ni˜ no and La Ni˜ na. Journal of the Atmospheric Sciences42(23), 2652–2662 (1985) https://doi.org/10.1175/1520-0469(1985)042⟨2652:ENALN⟩2.0.CO;2

  44. [44]

    Journal of Machine Learning Research6, 1939–1959 (2005)

    Quinonero-Candela, J., Rasmussen, C.E.: A unifying view of sparse approximate gaussian process regression. Journal of Machine Learning Research6, 1939–1959 (2005)

  45. [45]

    Environmetrics26(4), 284–297 (2015)

    Risser, M.D., Calder, C.A.: Regression-based covariance functions for nonstationary spatial modeling. Environmetrics26(4), 284–297 (2015)

  46. [46]

    Journal of Statistical Computation and Simulation90(16), 2902–2928 (2020) 27

    Risser, M.D., Turek, D.: Bayesian inference for high-dimensional nonstationary gaussian processes. Journal of Statistical Computation and Simulation90(16), 2902–2928 (2020) 27

  47. [47]

    APL Machine Learning2(1) (2024)

    Noack, M.M., Luo, H., Risser, M.D.: A unifying perspective on non-stationary kernels for deeper gaussian processes. APL Machine Learning2(1) (2024)

  48. [48]

    Vaart, A.W., Van Zanten, J.H.: Rates of contraction of posterior distributions based on gaussian process priors (2008)

  49. [49]

    CRC press, B.V

    Gilks, W.R., Richardson, S., Spiegelhalter, D.: Markov Chain Monte Carlo in Practice. CRC press, B.V. (1995)

  50. [50]

    Scientific Reports13(1) (2023)

    Noack, M.M., Krishnan, H., Risser, M.D., Reyes, K.G.: Exact Gaussian processes for massive datasets via non-stationary sparsity-discovering kernels. Scientific Reports13(1) (2023)

  51. [51]

    Journal of the American Statistical Association111(514), 800– 812 (2016)

    Datta, A., Banerjee, S., Finley, A.O., Gelfand, A.E.: Hierarchical nearest-neighbor gaussian process models for large geostatistical datasets. Journal of the American Statistical Association111(514), 800– 812 (2016)

  52. [52]

    Journal of the American Statistical Association102(477), 359–378 (2007)

    Gneiting, T., Raftery, A.E.: Strictly proper scoring rules, prediction, and estimation. Journal of the American Statistical Association102(477), 359–378 (2007)

  53. [53]

    Turek, D., Risser, M.: BayesNSGP: Bayesian Analysis of Non-Stationary Gaussian Process Models. (2022). R package version 0.1.2. https://CRAN.R-project.org/package=BayesNSGP

  54. [54]

    Journal of Statistical Software19(4), 1 (2007)

    Finley, A.O., Banerjee, S., Carlin, B.P.: spbayes: an r package for univariate and multivariate hierarchical point-referenced spatial models. Journal of Statistical Software19(4), 1 (2007)

  55. [55]

    Scientific data2(1), 1–12 (2015)

    Livneh, B., Bohn, T.J., Pierce, D.W., Munoz-Arriola, F., Nijssen, B., Vose, R., Cayan, D.R., Brekke, L.: A spatially comprehensive, hydrometeorological data set for Mexico, the US, and Southern Canada 1950–2013. Scientific data2(1), 1–12 (2015)

  56. [56]

    NOAA National Centers for Environmental Information

    Livneh, B., Bohn, T.J., Pierce, D.W., Munoz-Arriola, F., Nijssen, B., Vose, R., Cayan, D.R., Brekke, L.: A spatially comprehensive, hydrometeorological data set for Mexico, the US, and Southern Canada (NCEI Accession 0129374). NOAA National Centers for Environmental Information. Dataset. (Daily precipitation). (2015) https://doi.org/10.7289/v5x34vf6 . Acc...

  57. [57]

    Spatial statistics and models, 133–145 (1984)

    Shepard, D.S.: Computer mapping: The symap interpolation algorithm. Spatial statistics and models, 133–145 (1984)

  58. [58]

    Journal of Atmospheric and Oceanic Technology29(7), 897–910 (2012)

    Menne, M.J., Durre, I., Vose, R.S., Gleason, B.E., Houston, T.G.: An overview of the Global Historical Climatology Network-Daily database. Journal of Atmospheric and Oceanic Technology29(7), 897–910 (2012)

  59. [59]

    SIAM journal on numerical analysis12(4), 617–629 (1975)

    Paige, C.C., Saunders, M.A.: Solution of sparse indefinite systems of linear equations. SIAM journal on numerical analysis12(4), 617–629 (1975)

  60. [60]

    Advances in neural information processing systems32(2019)

    Wang, K., Pleiss, G., Gardner, J., Tyree, S., Weinberger, K.Q., Wilson, A.G.: Exact gaussian processes on a million data points. Advances in neural information processing systems32(2019)

  61. [61]

    SIAM 28 Journal on Matrix Analysis and Applications38(4), 1075–1099 (2017)

    Ubaru, S., Chen, J., Saad, Y.: Fast estimation oftr(f(a)) via stochastic lanczos quadrature. SIAM 28 Journal on Matrix Analysis and Applications38(4), 1075–1099 (2017)

  62. [62]

    Van Den Berg, C., Christensen, J.P.R., Ressel, P.: Harmonic Analysis on Semigroups: Theory of Positive Definite and Related Functions vol. 100. Springer, ??? (2012)

  63. [63]

    Bernoulli7, 223–242 (2001)

    Haario, H., Saksman, E., Tamminen, J.: An adaptive metropolis algorithm. Bernoulli7, 223–242 (2001)

  64. [64]

    Currently under review1(1), 17 (2010)

    Shaby, B., Wells, M.T.: Exploring an adaptive metropolis algorithm. Currently under review1(1), 17 (2010)

  65. [65]

    Hijmans, R.J.: Raster: Geographic Data Analysis and Modeling. (2022). R package version 3.5-15. https://CRAN.R-project.org/package=raster

  66. [66]

    https://oceancolor.gsfc.nasa.gov/docs/distfromcoast/

    NASA’s Ocean Biology Processing Group: Distance to the Nearest Coast gridded data set (2009). https://oceancolor.gsfc.nasa.gov/docs/distfromcoast/

  67. [67]

    parametric

    Wessel, P., Smith, W.H.: New, improved version of generic mapping tools released. Eos, Transactions American Geophysical Union79(47), 579–579 (1998) 29 A Proof of Propositions Proposition 1.The kernelC sparse is strictly positive definite. Furthermore,C y is strictly positive definite wheneverC core is. More formally, for every admissible parameter vector...

  68. [68]

    The prior mean function coefficientsβ(note that a closed-form Gibbs update is available forβunder the likelihood and prior choice described above; we found that an adaptive Metropolis update worked just as well and obviated additional computational steps)

  69. [69]

    The core kernel hyperparametersθ core, the error variance hyperparametersθ z, and Wendland hyperpa- rameterss 0 andr 0

  70. [70]

    λ1(s) 0 0λ 2(s) # ,Γ(s) =

    The bump function positions{h ij}and radii{r ij}. A fourth block updates all of the amplitudes{a ij}in a single step. Since these are binary variables, the proposal distribution matches the prior, i.e., a proposeda ∗ ij is drawn from a Bernoulli distribution with success probabilityπ curr ij (the current value of the prior probabilities). Lastly, we can u...