
arxiv: 2606.11617 · v1 · pith:452X4GVEnew · submitted 2026-06-10 · ✦ hep-lat

Conditional Model-Adequacy Tests for Spectral Uncertainty Claims in Lattice QCD

Pith reviewed 2026-06-27 07:59 UTC · model grok-4.3

classification ✦ hep-lat
keywords lattice QCD · spectral functions · uncertainty quantification · model adequacy test · shear correlator · Euclidean correlators · inverse problem · finite temperature

The pith

A conditional test shows that nominal uncertainty bands on reconstructed lattice QCD spectra need not cover physical summaries such as peak heights.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper formulates a target-wise adequacy test that checks whether nominal uncertainty intervals on reconstructed spectral functions actually cover chosen physical summaries. It applies the test to Euclidean-admissible mock correlators whose true spectra are known, using empirical coverage, calibration ranks, and stress diagnostics. In a generic benchmark, peak locations calibrate substantially better than peak heights or low-frequency weights, which the paper attributes to differing functional identifiability under the Euclidean kernel. The same test is then run on a finite-temperature shear correlator, for which a family of BG-style reconstructions is compatible with the data at χ²/N_τ ≃ 1.3. Within that family a W_low-calibrated representative can be identified, but pointwise peak-height intervals are not certified. The paper concludes that Euclidean compatibility is necessary but not sufficient for validating spectral uncertainty claims.

Core claim

Within the scanned grid and the stated observable-matched mock extension, a W_low-calibrated representative can be identified for the finite-temperature shear correlator, whereas pointwise peak-height intervals are not certified for the tested BG-style uncertainty law; peak locations are substantially better calibrated than peak heights or low-frequency weights, showing that different summaries possess different degrees of identifiability under the Euclidean smoothing kernel.

What carries the argument

The target-wise adequacy test that evaluates reported intervals on mock correlators with known truth using empirical coverage and physical diagnostics for a chosen summary T[rho].
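
The coverage part of such a test can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the function name `empirical_coverage`, the toy Gaussian stand-in for a reconstruction, and the 1-sigma nominal level are all assumptions made for this example.

```python
import numpy as np

def empirical_coverage(truth, lower, upper):
    """Fraction of mocks whose true summary T[rho] lies inside the reported interval."""
    truth, lower, upper = map(np.asarray, (truth, lower, upper))
    inside = (lower <= truth) & (truth <= upper)
    return inside.mean()

# Toy stand-in for a mock ensemble (hypothetical numbers, not from the paper).
rng = np.random.default_rng(0)
n_mocks = 2000
truth = rng.normal(1.0, 0.2, n_mocks)       # known true summary for each mock
sigma_true = 0.10                            # actual scatter of the toy estimator
estimate = truth + rng.normal(0.0, sigma_true, n_mocks)

# A correctly sized 1-sigma band versus one that understates the error by a factor of 2.
cov_good = empirical_coverage(truth, estimate - sigma_true, estimate + sigma_true)
cov_bad = empirical_coverage(truth, estimate - 0.5 * sigma_true, estimate + 0.5 * sigma_true)
print(f"correctly sized band: {cov_good:.3f}, overconfident band: {cov_bad:.3f}")
```

In this toy setting the correctly sized band should land near the nominal 68% level and the overconfident band well below it, which is the kind of gap the test is designed to expose for a chosen summary.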

If this is right

  • Peak locations receive substantially better calibration than peak heights under the Euclidean kernel.
  • Low-frequency weights exhibit identifiability properties distinct from both peak locations and heights.
  • Euclidean compatibility at χ²/N_τ ≃ 1.3 does not rule out a W_low-calibrated representative: one can be identified within the scanned grid.
  • Pointwise peak-height intervals fail certification under the tested BG-style uncertainty prescription.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same conditional test could be applied to other reconstruction families or to different temperature regimes to map which summaries remain uncertified.
  • Failure on peak heights suggests that alternative uncertainty constructions, such as those that incorporate global constraints, may be needed for height-sensitive observables.
  • The distinction between location and height calibration points to a general feature of ill-posed inverse problems where the kernel smooths features at different rates.
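
The smoothing action of the kernel can be made concrete with a singular value decomposition of a discretized kernel matrix. The sketch below assumes the standard finite-temperature form K(τ, ω) = cosh[ω(τ − β/2)] / sinh(ωβ/2); the grid sizes, the frequency cutoff, and the name `thermal_kernel` are illustrative choices and are not taken from the paper.

```python
import numpy as np

def thermal_kernel(tau, omega, beta=1.0):
    """Finite-temperature kernel cosh(omega*(tau - beta/2)) / sinh(omega*beta/2)."""
    t = tau[:, None]      # column of Euclidean times
    w = omega[None, :]    # row of frequencies
    return np.cosh(w * (t - beta / 2.0)) / np.sinh(w * beta / 2.0)

# Illustrative grids (assumed): N_tau = 16 with beta = 1, keeping only
# tau <= beta/2 because the kernel is symmetric about beta/2.
n_tau = 16
tau = np.arange(1, n_tau // 2 + 1) / n_tau
omega = np.linspace(0.05, 20.0, 400)      # omega = 0 excluded: the kernel diverges there
d_omega = omega[1] - omega[0]

K = thermal_kernel(tau, omega) * d_omega  # simple Riemann-sum discretization
singular_values = np.linalg.svd(K, compute_uv=False)

# Relative singular values: how strongly each independent combination of rho
# shows up in G(tau). A steep fall-off means few combinations are well constrained.
relative = singular_values / singular_values[0]
for k, s in enumerate(relative):
    print(f"mode {k}: relative singular value {s:.2e}")
```

A rapid decay of the relative singular values indicates that only a handful of linear functionals of the spectrum are strongly constrained by the correlator. Whether a given summary such as a peak height falls among them depends on the summary, which is consistent with the target-wise framing of the test.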

Load-bearing premise

The particular mock extension chosen to produce Euclidean-admissible correlators with known truth spans the relevant space of physical spectra.

What would settle it

A set of mock correlators generated from a different spectral family that produces coverage fractions for peak heights far from the nominal level under the same reconstruction procedure would falsify the adequacy claim for that uncertainty law.

Figures

Figures reproduced from arXiv: 2606.11617 by Haozheng Li.

Figure 1. Protocol workflow. The left dashed box is the added mock-calibration branch: spectra are sampled from the mock …
Figure 2. Representative spectra and Euclidean correlators …
Figure 3. Summary of the primary calibration stress matrix over the 3 …
Figure 4. Observable-matched finite-temperature shear bench…
Figure 5. Euclidean compatibility versus target-wise calibration for the 8001 fixed BG-style shear reconstruction settings. Left: …
Figure 6. Supporting checks for the three low-…
Figure 7. Full primary calibration stress matrix over …
Original abstract

Euclidean lattice correlators determine spectral functions only through a smoothing integral transform, so a nominal uncertainty band on a reconstructed spectrum need not have a coverage interpretation for a physical summary. We formulate this as a target-wise adequacy test for reported spectral uncertainties. For a chosen summary \(T[\rho]\), the reported interval is tested on Euclidean-admissible mock correlators with known truth using empirical coverage, simulation-based calibration ranks, physical diagnostics, and stress tests. The test is conditional, but it is a useful falsification tool: passing it does not prove that a reconstruction is the QCD truth, while failing it shows that the reported uncertainty law is not adequate for the chosen functional under the stated mock extension. In a generic benchmark, peak locations are substantially better calibrated than peak heights or low-frequency weights, reflecting different degrees of functional identifiability under the Euclidean kernel. We then apply the same logic to a finite-temperature shear correlator. A family of BG-style reconstructions is compatible with the Euclidean data at \(\chi^2/N_\tau\simeq 1.3\). Within the scanned grid and stated observable-matched mock extension, a \(W_{\rm low}\)-calibrated representative can be identified, whereas pointwise peak-height intervals are not certified for the tested BG-style uncertainty law. Thus Euclidean compatibility is a necessary consistency check, but not a sufficient adequacy criterion for spectral uncertainty claims.
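
For orientation, the objects named in the abstract can be written out explicitly. The kernel below is the standard finite-temperature form, and its normalization varies between papers. The symbols \(\beta\) (inverse temperature), \(I_\alpha\), \(N_{\rm mock}\), and \(\hat c_\alpha\) are notation introduced here for illustration and are not taken from the paper.

\[
G(\tau) = \int_0^\infty d\omega\, K(\tau,\omega)\,\rho(\omega),
\qquad
K(\tau,\omega) = \frac{\cosh\!\left[\omega\left(\tau - \beta/2\right)\right]}{\sinh\!\left(\omega\beta/2\right)},
\]

\[
\hat c_\alpha = \frac{1}{N_{\rm mock}} \sum_{i=1}^{N_{\rm mock}}
\mathbf{1}\!\left\{\, T[\rho_i^{\rm true}] \in I_\alpha(G_i) \,\right\}.
\]

Here \(G_i\) is the mock correlator generated from the known spectrum \(\rho_i^{\rm true}\), and \(I_\alpha(G_i)\) is the interval that the uncertainty law under test reports for \(T[\rho]\) at nominal level \(1-\alpha\). In this notation the reported law passes the coverage part of the test for \(T\) when \(\hat c_\alpha\) is statistically consistent with \(1-\alpha\) over the mock ensemble.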

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper formulates conditional model-adequacy tests for uncertainty bands on spectral functions reconstructed from Euclidean lattice QCD correlators. For a chosen summary T[ρ], reported intervals are evaluated via empirical coverage, simulation-based calibration ranks, physical diagnostics, and stress tests on Euclidean-admissible mock correlators with known truth. A generic benchmark shows peak locations are better calibrated than peak heights or low-frequency weights. Applied to a finite-temperature shear correlator, a family of BG-style reconstructions compatible with the data at χ²/N_τ ≃ 1.3 is tested; within the scanned grid and observable-matched mock extension, a W_low-calibrated representative passes while pointwise peak-height intervals fail coverage under the tested BG-style law. The test is explicitly conditional and serves as a falsification tool.
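
The simulation-based calibration ranks mentioned in the summary can be illustrated with a toy model. The sketch below is not the paper's pipeline: it uses a conjugate Gaussian model, where the exact posterior is known, in place of a spectral reconstruction, and the names `sbc_ranks` and `uniformity_chi2` are hypothetical.

```python
import numpy as np

def sbc_ranks(draw_truth, draw_data, draw_posterior, n_sims, n_post, rng):
    """Rank of each true value among n_post posterior draws for its own mock data."""
    ranks = np.empty(n_sims, dtype=int)
    for i in range(n_sims):
        theta = draw_truth(rng)                 # known truth
        y = draw_data(theta, rng)               # mock data generated from it
        post = draw_posterior(y, n_post, rng)   # draws from the law under test
        ranks[i] = np.sum(post < theta)
    return ranks

def uniformity_chi2(ranks, n_post):
    """Chi-squared distance of the rank histogram from uniform on 0..n_post."""
    counts = np.bincount(ranks, minlength=n_post + 1)
    expected = len(ranks) / (n_post + 1)
    return np.sum((counts - expected) ** 2 / expected)

# Toy model (assumed): theta ~ N(0, 1), y ~ N(theta, sigma^2).
sigma = 0.5
post_mean = lambda y: y / (1.0 + sigma**2)
post_sd = np.sqrt(sigma**2 / (1.0 + sigma**2))

draw_truth = lambda rng: rng.normal(0.0, 1.0)
draw_data = lambda theta, rng: rng.normal(theta, sigma)
exact = lambda y, n, rng: rng.normal(post_mean(y), post_sd, n)
overconfident = lambda y, n, rng: rng.normal(post_mean(y), 0.5 * post_sd, n)

rng = np.random.default_rng(1)
n_sims, n_post = 2000, 9
chi2_exact = uniformity_chi2(sbc_ranks(draw_truth, draw_data, exact, n_sims, n_post, rng), n_post)
chi2_over = uniformity_chi2(sbc_ranks(draw_truth, draw_data, overconfident, n_sims, n_post, rng), n_post)
print(f"exact posterior: chi2 = {chi2_exact:.1f}; overconfident: chi2 = {chi2_over:.1f}")
```

With the exact posterior the ranks should be close to uniform, so the statistic stays of the order of its nine degrees of freedom. Halving the posterior width pushes the ranks toward the extremes and inflates the statistic, which is the signature of an interval that is too narrow.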

Significance. If the central claims hold, the work supplies a practical, scoped falsification procedure for spectral uncertainty statements in an ill-posed inverse problem. It correctly separates Euclidean data compatibility from adequacy for physical functionals and demonstrates the distinction on both generic mocks and a real shear-correlator application. Credit is due for the explicit conditional framing, use of multiple diagnostics (coverage, ranks, stress tests), and avoidance of universal claims about the mock extension.

major comments (2)
  1. [Abstract / shear application section] Abstract and § on the shear-correlator application: the statement that 'pointwise peak-height intervals are not certified' is load-bearing for the main result; the manuscript must report the precise empirical coverage fraction (and its uncertainty) for the peak-height summary under the chosen BG-style law and mock extension so that readers can judge the magnitude of the failure.
  2. [Benchmark section] Benchmark section: the claim that 'peak locations are substantially better calibrated than peak heights or low-frequency weights' requires the tabulated coverage values or rank distributions for each functional; without these numbers the comparative statement cannot be verified and is central to the generic benchmark result.
minor comments (2)
  1. [Methods] Notation: the definition of the observable-matched mock extension should be given an explicit equation or algorithm box so that the conditional scope is reproducible.
  2. [Figures] Figure captions: ensure every coverage plot states the exact number of mock realizations and the precise definition of the W_low calibration used.
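
One common way to attach an uncertainty to a coverage fraction, as the first major comment requests, is a Wilson score interval for a binomial proportion. The sketch below is a generic illustration: the function name `wilson_interval` is hypothetical, and the counts in the example are invented placeholders, not values from the paper.

```python
import math

def wilson_interval(n_covered, n_mocks, z=1.96):
    """Wilson score interval for a binomial coverage fraction (z=1.96 is about 95%)."""
    if n_mocks <= 0 or not 0 <= n_covered <= n_mocks:
        raise ValueError("need 0 <= n_covered <= n_mocks and n_mocks > 0")
    p_hat = n_covered / n_mocks
    denom = 1.0 + z**2 / n_mocks
    centre = (p_hat + z**2 / (2 * n_mocks)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n_mocks + z**2 / (4 * n_mocks**2)) / denom
    return centre - half, centre + half

# Placeholder counts (hypothetical): 410 of 1000 mocks covered at a nominal 68% level.
lo, hi = wilson_interval(410, 1000)
nominal = 0.68
print(f"coverage 0.410, interval [{lo:.3f}, {hi:.3f}], nominal inside: {lo <= nominal <= hi}")
```

Reporting the interval together with the number of mock realizations lets a reader judge whether a shortfall from the nominal level is statistically significant or within ensemble noise.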

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and the recommendation of minor revision. The comments correctly identify places where explicit numerical reporting will improve verifiability of the central claims without altering the conditional nature of the tests.

  1. Referee: [Abstract / shear application section] Abstract and § on the shear-correlator application: the statement that 'pointwise peak-height intervals are not certified' is load-bearing for the main result; the manuscript must report the precise empirical coverage fraction (and its uncertainty) for the peak-height summary under the chosen BG-style law and mock extension so that readers can judge the magnitude of the failure.

    Authors: We agree that the precise empirical coverage fraction (with uncertainty) for the peak-height summary must be stated explicitly so readers can assess the magnitude of the failure. In the revised manuscript we will report these values, computed from the mock ensemble under the stated BG-style law and observable-matched extension, both in the shear-correlator section and, for completeness, in the abstract.

    Revision: yes

  2. Referee: [Benchmark section] Benchmark section: the claim that 'peak locations are substantially better calibrated than peak heights or low-frequency weights' requires the tabulated coverage values or rank distributions for each functional; without these numbers the comparative statement cannot be verified and is central to the generic benchmark result.

    Authors: We accept that the comparative claim requires explicit tabulated coverage fractions and rank distributions. The revised manuscript will include a table in the benchmark section listing these quantities for peak locations, peak heights, and low-frequency weights, allowing direct verification of the relative calibration performance.

    Revision: yes

Circularity Check

0 steps flagged

No significant circularity


The paper formulates a conditional adequacy test that applies reported uncertainty intervals to Euclidean-admissible mock correlators generated with known truth values. Coverage, calibration ranks, and diagnostics are computed directly against these external mocks rather than against any fitted parameter or quantity defined inside the paper. No self-definitional loops, fitted-input predictions, load-bearing self-citations, or ansatz smuggling appear in the derivation; the central claim is scoped explicitly to the chosen mock extension and therefore remains falsifiable by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

The abstract alone supplies no explicit free parameters, axioms, or invented entities; the approach rests on standard statistical notions of coverage and mock-data generation.

pith-pipeline@v0.9.1-grok · 5774 in / 1098 out tokens · 18364 ms · 2026-06-27T07:59:22.329717+00:00 · methodology

