
arxiv: 2606.10409 · v1 · pith:LWPKGERBnew · submitted 2026-06-09 · 📊 stat.ME

Robust Bayesian Predictive Model Selection using Bregman Divergence

Pith reviewed 2026-06-27 12:41 UTC · model grok-4.3

classification 📊 stat.ME
keywords Bayesian model selection · Bregman divergence · generalized ELPD · leave-one-out cross-validation · model misspecification · robust predictive comparison · beta-divergence · generalized posterior

The pith

Replacing the log score with a Bregman divergence in leave-one-out cross-validation yields a predictive model selector that asymptotically picks the closest distribution under misspecification.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a generalized expected log predictive density that substitutes a Bregman scoring rule for the usual log score when updating parameters via a generalized posterior and when scoring out-of-sample predictive utility. Candidate models are then ranked directly by this proper-score utility, with the beta-divergence family singled out because its tuning parameter reduces the influence of low-density observations. Under misspecification the ranking is shown to converge to the model whose predictive distribution minimizes the chosen Bregman divergence to the data-generating process. The change in scoring rule is motivated by the known sensitivity of the log score to outliers and tail behavior in standard LOO cross-validation.

Core claim

A score-matched generalized ELPD framework replaces the log score by a Bregman scoring rule both to form the generalized posterior and to evaluate leave-one-out predictive utility; under model misspecification this procedure asymptotically selects the model whose predictive distribution is closest to the data-generating process under the chosen Bregman divergence.
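
For orientation, the objects named in the core claim can be written out as follows. This is a sketch in assumed notation, not the paper's own equations: the β-score below is the standard density power form from the robust-statistics literature, written as a utility (larger is better), and the learning rate $w$, the prior $\pi$, and the symbols $\pi_S$, $p_S$ are illustrative names that do not appear in the summary above.

```latex
% Assumed notation: p_\theta is a model density, \pi(\theta) a prior,
% y_{-i} the data with observation i removed, w > 0 a learning rate.
\begin{align*}
S_\beta(y, p) &= \frac{1}{\beta}\, p(y)^{\beta}
  - \frac{1}{1+\beta} \int p(z)^{1+\beta}\, dz, \qquad \beta > 0, \\
\pi_S(\theta \mid y_{1:n}) &\propto \pi(\theta)\,
  \exp\Big\{ w \sum_{i=1}^{n} S(y_i, p_\theta) \Big\}, \\
p_S(\tilde{y} \mid y_{-i}) &= \int p_\theta(\tilde{y})\,
  \pi_S(\theta \mid y_{-i})\, d\theta, \\
\widehat{\mathrm{ELPD}}_S &= \sum_{i=1}^{n}
  S\big(y_i,\, p_S(\cdot \mid y_{-i})\big).
\end{align*}
% Limit check: S_\beta(y, p) - 1/\beta \to \log p(y) - 1 as \beta \to 0,
% so, up to constants that do not affect rankings, the log score and
% the ordinary LOO ELPD are recovered in the limit.
```

Taking $S$ to be the log score with $w = 1$ gives the usual Bayesian posterior in the second line and the standard leave-one-out ELPD in the last, which is the sense in which the construction generalizes ELPD.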

What carries the argument

The Bregman scoring rule and its associated generalized posterior, which together define the generalized ELPD used for predictive utility ranking.

If this is right

  • Model rankings become tunable for outlier sensitivity by choice of the beta parameter in the beta-divergence family.
  • In microbial and forensic data examples the selected model can differ from the one chosen by ordinary ELPD because low-density observations exert less influence.
  • The framework supplies a direct proper-score generalization of standard leave-one-out cross-validation.
  • Asymptotic consistency targets the predictive distribution that minimizes the chosen divergence rather than the Kullback-Leibler divergence.
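
The effect of β on rankings can be illustrated at the population level with the contaminated normal example from Figure 1. The sketch below is only an illustration under stated assumptions: the candidates are fixed densities (no posterior updating and no leave-one-out step), the contamination weight `EPS = 0.1` and `BETA = 0.5` are hypothetical values not taken from the paper, and the β-score is the standard density power form written as a utility.

```python
import math

EPS = 0.1    # hypothetical contamination weight (not taken from the paper)
BETA = 0.5   # hypothetical tuning value for the beta-score

LOG_2PI = math.log(2.0 * math.pi)


def log_m1(x):
    """Log density of M1 = N(0, 1)."""
    return -0.5 * x * x - 0.5 * LOG_2PI


def log_m2(x):
    """Log density of M2 = Student t with 2 df, location 3, scale 1."""
    z = x - 3.0
    return -math.log(2.0 * math.sqrt(2.0)) - 1.5 * math.log1p(0.5 * z * z)


def q(x):
    """Data-generating density (1 - EPS) N(0, 1) + EPS N(0, 10^2)."""
    narrow = math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)
    wide = math.exp(-0.5 * (x / 10.0) ** 2) / (10.0 * math.sqrt(2.0 * math.pi))
    return (1.0 - EPS) * narrow + EPS * wide


def integrate(f, lo=-150.0, hi=150.0, n=60000):
    """Midpoint-rule quadrature on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def expected_log_score(log_p):
    """E_q[log p(Y)], the population target of the ordinary log score."""
    return integrate(lambda x: q(x) * log_p(x))


def expected_beta_score(log_p, beta=BETA):
    """E_q[p(Y)^beta / beta] - integral(p^(1 + beta)) / (1 + beta)."""
    fit = integrate(lambda x: q(x) * math.exp(beta * log_p(x))) / beta
    penalty = integrate(lambda x: math.exp((1.0 + beta) * log_p(x))) / (1.0 + beta)
    return fit - penalty


def preferred(score):
    """Name of the candidate with the larger expected score."""
    return "M1" if score(log_m1) > score(log_m2) else "M2"


print("log score prefers: ", preferred(expected_log_score))
print("beta score prefers:", preferred(expected_beta_score))
```

Under these assumed settings the log score favors the heavy-tailed but miscentered M2, because the wide contamination component is heavily penalized under N(0, 1), while the β-score favors the correctly centered M1. The direction of the reversal depends on the assumed `EPS` and `BETA`.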

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same generalized-posterior construction could be applied with other proper scoring rules to achieve robustness properties not limited to the Bregman family.
  • In settings with heavy tails or contamination the method offers a concrete way to trade bias for reduced variance in model selection.
  • The divergence-minimizing property suggests that model averaging weights derived from the generalized ELPD would also converge to weights concentrated on the closest predictive distributions.

Load-bearing premise

The Bregman scoring rule and generalized posterior produce an out-of-sample utility ranking that is asymptotically consistent for the divergence-minimizing model, without explicit conditions stated on the model class or data-generating process.

What would settle it

A Monte Carlo experiment in which the procedure repeatedly selects a model whose predictive distribution does not minimize the target Bregman divergence to the known data-generating process would falsify the asymptotic selection claim.
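
A stripped-down version of such an experiment might look like the sketch below. It is not the paper's procedure: the two candidates are the fixed densities from Figure 1 rather than posterior predictive distributions, the score is averaged in-sample rather than by leave-one-out, and `EPS`, `BETA`, the sample size, and the replication count are all hypothetical choices. The target model is found by numerically integrating the population β-score, and the function reports how often the empirical score selects it.

```python
import math
import random

EPS, BETA = 0.1, 0.5   # hypothetical contamination weight and beta value


def m1_pdf(x):
    """M1 = N(0, 1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def m2_pdf(x):
    """M2 = Student t with 2 df, location 3, scale 1."""
    z = x - 3.0
    return (1.0 + 0.5 * z * z) ** -1.5 / (2.0 * math.sqrt(2.0))


def q_pdf(x):
    """Known data-generating density (1 - EPS) N(0, 1) + EPS N(0, 10^2)."""
    wide = math.exp(-0.5 * (x / 10.0) ** 2) / (10.0 * math.sqrt(2.0 * math.pi))
    return (1.0 - EPS) * m1_pdf(x) + EPS * wide


def integrate(f, lo=-150.0, hi=150.0, n=30000):
    """Midpoint-rule quadrature on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))


def power_term(pdf):
    """integral(p^(1 + BETA)) / (1 + BETA), the data-free part of the score."""
    return integrate(lambda x: pdf(x) ** (1.0 + BETA)) / (1.0 + BETA)


def population_score(pdf, pterm):
    """Expected beta-score of a fixed density under q."""
    return integrate(lambda x: q_pdf(x) * pdf(x) ** BETA) / BETA - pterm


def draw(rng):
    """One observation from the contaminated normal."""
    sd = 10.0 if rng.random() < EPS else 1.0
    return rng.gauss(0.0, sd)


def run(n_rep=20, n_obs=2000, seed=0):
    """Return (target model, fraction of replications that select it)."""
    rng = random.Random(seed)
    models = {"M1": m1_pdf, "M2": m2_pdf}
    pterms = {name: power_term(pdf) for name, pdf in models.items()}
    target = max(models, key=lambda k: population_score(models[k], pterms[k]))
    hits = 0
    for _ in range(n_rep):
        ys = [draw(rng) for _ in range(n_obs)]
        picked = max(
            models,
            key=lambda k: sum(models[k](y) ** BETA / BETA - pterms[k] for y in ys),
        )
        hits += picked == target
    return target, hits / n_rep


target, rate = run()
print(f"target={target}, selection rate={rate:.2f}")
```

A selection rate that stayed well below one as `n_obs` grew would be evidence against the claim in this simplified setting. A full check would replace the fixed densities with generalized-posterior predictive distributions and the in-sample average with the leave-one-out utility.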

Figures

Figures reproduced from arXiv: 2606.10409 by Dipak K. Dey, Jongwoo Choi, Neil A. Spencer.

Figure 1: Histogram of n = 1000 simulated observations from q(x) = (1−ϵ) N(x; 0, 1) + ϵ N(x; 0, 10^2), with overlays of M1: N(0, 1) (center-correct, light-tailed) and M2: t_2(3, 1) (miscentered, heavy-tailed).
  Accompanying text: Despite their broad success, ELPD-based criteria can behave undesirably in the presence of outliers or heavy-tailed observations. To make this concrete, consider the contaminated normal DGP q(x) = (1 − ϵ) N(x; …
Figure 2: Robust predictive model selection in the contaminated normal simulation.
Figure 3: Robust predictive model selection for the …
Figure 4: The binarized contact grid and RAC locations for the five shoe treads. Orange tiles …
Figure 5: Robust predictive model selection for the synthetic JESA dataset.
Original abstract

Predictive Bayesian model comparison often relies on leave-one-out (LOO) cross-validation criteria such as the expected log predictive density (ELPD). However, model rankings can be overly sensitive to outliers and tail mismatch because ELPD is based on the log score. We propose a score-matched generalized ELPD framework that replaces the log score by a Bregman scoring rule to update model parameters through a generalized posterior and to evaluate LOO predictive utility. Candidate posterior predictive distributions are ranked by out-of-sample utility under the chosen scoring rule, yielding a direct proper-score generalization of standard ELPD. We focus especially on the $\beta$-divergence family, where $\beta$ controls the sensitivity of predictive comparison to low-density observations. Under model misspecification, the procedure asymptotically selects the model whose predictive distribution is closest to the data-generating process under the chosen Bregman divergence. A simulation study and applications to microbial and forensic data show that the generalized ELPD can change the selected model through reduced sensitivity to low-density observations.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper proposes a generalized ELPD framework that replaces the log score with a Bregman scoring rule (focusing on the β-divergence family) both to form a generalized posterior and to compute LOO predictive utility for model ranking. Under misspecification the procedure is claimed to asymptotically select the predictive distribution minimizing the chosen divergence to the DGP. Simulations and applications to microbial and forensic data are reported to produce different model rankings than standard ELPD due to reduced sensitivity to low-density observations.

Significance. If the asymptotic selection property can be rigorously established, the framework would supply a tunable robust alternative to ELPD-based predictive model comparison. The empirical illustrations already show that altering the scoring rule can change selected models, which is of practical interest in misspecified settings. However, the absence of any derivation, regularity conditions, or quantitative verification of the generalized posterior concentration undermines the central claim and therefore the current significance of the contribution.

major comments (2)
  1. [Abstract] The asymptotic selection claim (“the procedure asymptotically selects the model whose predictive distribution is closest to the data-generating process under the chosen Bregman divergence”) is stated without any derivation, reference to a theorem, or list of regularity conditions (compactness of parameter space, uniform integrability of the score, uniqueness of the minimizer, ergodicity of the data process). This is the load-bearing theoretical result; its absence prevents assessment of whether the generalized posterior and LOO utility ranking are consistent for the divergence minimizer.
  2. [Abstract / Method description] The construction of the generalized posterior via replacement of the log score by the Bregman scoring rule is described only at a high level; no explicit form of the generalized posterior, no proof that it concentrates at the expected-score minimizer, and no discussion of how the β parameter enters the posterior are supplied. These steps are required for the subsequent LOO ranking argument.
minor comments (1)
  1. [Abstract] The phrase “score-matched generalized ELPD framework” is introduced without a precise definition or equation linking the Bregman score to the leave-one-out utility; a short clarifying equation would improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thorough review and for highlighting the need for explicit theoretical support. We agree that the current manuscript presents the asymptotic selection property and the generalized posterior construction at a high level. Below we address each major comment and commit to adding the required derivations and explicit forms in a revised version.

Point-by-point responses
  1. Referee: [Abstract] The asymptotic selection claim (“the procedure asymptotically selects the model whose predictive distribution is closest to the data-generating process under the chosen Bregman divergence”) is stated without any derivation, reference to a theorem, or list of regularity conditions (compactness of parameter space, uniform integrability of the score, uniqueness of the minimizer, ergodicity of the data process). This is the load-bearing theoretical result; its absence prevents assessment of whether the generalized posterior and LOO utility ranking are consistent for the divergence minimizer.

    Authors: We acknowledge that the asymptotic selection property is asserted in the abstract without a derivation or a list of regularity conditions in the main text. Although the claim is a direct consequence of standard consistency results for generalized posteriors defined by proper scoring rules, we agree that a self-contained argument is required. In the revision we will add a dedicated theoretical section that derives the asymptotic selection result under explicit regularity conditions (compact parameter space, uniform integrability of the Bregman score, uniqueness of the minimizer, and ergodicity of the data-generating process).
    Revision: yes

  2. Referee: [Abstract / Method description] The construction of the generalized posterior via replacement of the log score by the Bregman scoring rule is described only at a high level; no explicit form of the generalized posterior, no proof that it concentrates at the expected-score minimizer, and no discussion of how the β parameter enters the posterior are supplied. These steps are required for the subsequent LOO ranking argument.

    Authors: We accept the criticism that the generalized posterior is introduced only conceptually. The revised manuscript will supply the explicit functional form of the generalized posterior, prove its concentration at the minimizer of the expected Bregman score (under the regularity conditions listed in the response to the first comment), and detail how the tuning parameter β enters both the posterior and the LOO utility through the β-divergence scoring rule.
    Revision: yes

Circularity Check

0 steps flagged

No circularity: asymptotic claim rests on external properties of proper scoring rules


The paper's central claim, that the procedure asymptotically selects the Bregman-divergence-minimizing predictive distribution under misspecification, is presented as a direct consequence of the general theory of proper scoring rules and generalized posteriors. No equation or derivation step within the abstract or described framework reduces this result to a fitted parameter, self-defined quantity, or load-bearing self-citation internal to the paper. The consistency argument is invoked from established scoring-rule properties rather than constructed tautologically inside the manuscript, so the claim can be checked against external results rather than against quantities the paper defines for itself.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 0 invented entities

The framework rests on the standard mathematical fact that Bregman divergences induce proper scoring rules and on the domain assumption that a generalized posterior defined via the same score yields asymptotically consistent model ranking under misspecification.

free parameters (1)
  • β
    Tuning parameter that controls sensitivity to low-density observations; chosen by the user.
axioms (1)
  • [standard math] Bregman divergences define proper scoring rules whose expected value is minimized by the true predictive distribution.
    Invoked to justify both the generalized posterior and the out-of-sample utility ranking.
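
The axiom can be made concrete for separable Bregman divergences. The derivation below is a standard textbook identity, written in assumed notation ($\phi$ a differentiable convex function, $q$ the data-generating density, $p$ a candidate density) and in utility orientation, so the true distribution maximizes the expected score; in loss orientation, as in the ledger entry above, the sign flips and it minimizes the expected score.

```latex
% Separable Bregman score generated by a convex function \phi (assumed notation).
\begin{align*}
S_\phi(y, p) &= \phi'\big(p(y)\big)
  - \int \big[\, p(z)\, \phi'\big(p(z)\big) - \phi\big(p(z)\big) \big]\, dz, \\
\mathbb{E}_q\, S_\phi(Y, p) &= \int \big[\, \phi(p) + \phi'(p)\,(q - p) \big]\, dz, \\
\mathbb{E}_q\, S_\phi(Y, q) - \mathbb{E}_q\, S_\phi(Y, p)
  &= \int \big[\, \phi(q) - \phi(p) - \phi'(p)\,(q - p) \big]\, dz
  \;=\; D_\phi(q, p) \;\ge\; 0.
\end{align*}
% Example: \phi(t) = t^{1+\beta} / (\beta (1+\beta)) gives \phi'(t) = t^{\beta} / \beta and
% p \phi'(p) - \phi(p) = p^{1+\beta} / (1+\beta), i.e. the density power (\beta) score.
```

The final line shows why ranking candidates by expected score is the same as ranking them by Bregman divergence to $q$: the first term on the left does not depend on the candidate $p$.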


