arxiv: 2604.09355 · v1 · submitted 2026-04-10 · 🧮 math.SP · math.FA· math.PR

Recognition: unknown

Spectral convergence of empirical integral operators with discontinuous kernels

Manuel Dias

Pith reviewed 2026-05-10 15:58 UTC · model grok-4.3

classification 🧮 math.SP math.FAmath.PR

keywords spectral convergenceempirical integral operatorsdiscontinuous kernelsempirical measuresoperator convergencecompact metric spaceseigenvalue convergence

0 comments

The pith

Empirical integral operators with discontinuous kernels converge in spectrum to their continuous versions at explicit rates.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper investigates the spectral convergence of integral operators constructed using a kernel and empirical measures drawn from a compact space. It establishes that these operators approach the operators defined with the true probability measure, even when the kernel lacks continuity or positivity. Explicit rates of convergence are provided under the assumption of a non-negative symmetric kernel. This extension broadens the applicability of spectral methods to kernels with discontinuities, which arise in various practical settings like threshold-based similarities.

Core claim

Relaxing the usual positivity and continuity assumptions on the kernel k, we prove that the empirical integral operators converge to their continuous counterparts and provide explicit convergence rates for their spectral properties as the sample size n tends to infinity.

What carries the argument

The empirical integral operator defined by integrating the kernel against the empirical measure from i.i.d. samples.

Load-bearing premise

The kernel must be non-negative and symmetric, the space compact, and samples i.i.d. uniform from the measure.

What would settle it

A calculation or simulation where the largest eigenvalue of the empirical operator fails to approach the population one for a discontinuous kernel on a compact space would disprove the convergence.

read the original abstract

We study the spectral behavior as the sample size $n \to +\infty$ of integral operators defined by convolution of a non-negative symmetric kernel k with respect to empirical measures $\mu_n = \frac{1}{n} \sum_{i=1}^n \delta_{X_i}$, where $\{X_i\}_{i=1}^n$ are independent uniform samples from a compact probability metric space $(\mathcal{X},d,\mu)$. Relaxing the usual positivity and continuity assumptions on k, we prove the convergence of these empirical operators to their continuous counterparts, and provide explicit convergence rates.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

This paper gets spectral convergence with rates for empirical integral operators even when the kernel is discontinuous, as long as it stays non-negative and symmetric.

read the letter

The main result here is a proof that the empirical integral operator converges in spectrum to the population version, with explicit rates, after dropping the usual continuity requirement on the kernel. The setup stays standard: compact metric probability space, i.i.d. uniform samples, non-negative symmetric kernel. That relaxation is the actual novelty, since most prior work on these operators leans on continuity to control the operator norm or to apply uniform laws of large numbers directly. If the proof goes through, it widens the class of kernels that can be used in spectral methods without losing the convergence guarantees, which matters for kernels that arise in practice like truncated or indicator-based ones. The abstract also mentions relaxing positivity, though the stated assumptions keep non-negativity, so the gain is mainly on the continuity side. The rates are presented as explicit, which is better than pure existence results. On the soft side, the argument must still ensure the kernel is square-integrable so the operator is Hilbert-Schmidt and compact; discontinuities alone do not break that, but any rate proof will need some way to bound the difference between the empirical and population integrals without relying on uniform continuity. If the paper only assumes measurability plus the other conditions, the rates might be slower or require extra moment assumptions not highlighted in the abstract. The logic from assumptions to spectral convergence looks internally consistent, with no circularity or hidden fitting. This is for people working on kernel spectral methods in statistics or machine learning who want to use rougher kernels. It shows clear engagement with the literature on empirical operators and deserves a serious referee to check the technical steps in the estimates.

Referee Report

2 major / 2 minor

Summary. The manuscript proves spectral convergence (of eigenvalues and eigenspaces) for the empirical integral operator T_n f = (1/n) sum k(x, X_i) f(X_i) to the population operator T f = int k(x,y) f(y) d mu(y) as n -> infinity. The kernel k is assumed only non-negative and symmetric (not necessarily continuous or positive), on a compact metric probability space (X, d, mu) with i.i.d. uniform samples X_i. Explicit convergence rates are derived under these relaxed assumptions.

Significance. If the central claims hold, the work is significant for extending spectral theory of integral operators beyond the standard continuity assumption on k. This relaxation is practically relevant for kernels with jumps or singularities arising in applications such as integral equations or kernel-based learning on non-smooth domains. The explicit rates and standard i.i.d. sampling setup provide quantitative guarantees that could be directly usable in numerical analysis.

major comments (2)

[§3, Theorem 3.3] §3, Theorem 3.3: the operator-norm bound in (3.8) is stated to hold for merely measurable k, but the proof sketch invokes a uniform bound on the discontinuity set that is not quantified in terms of mu; without an explicit control on the measure of the set where k is discontinuous, the rate O(1/sqrt(n)) does not follow from the given estimates.
[§4.1, Eq. (4.5)] §4.1, Eq. (4.5): the perturbation argument for eigenspace convergence assumes that the spectral gap of T is positive and independent of n, yet the paper does not verify that the empirical gap remains bounded away from zero uniformly in n when k is discontinuous; this step is load-bearing for the claimed rate on the projector difference.

minor comments (2)

[Preliminaries] Notation: the symbol ||·||_{HS} is used for the Hilbert-Schmidt norm without an explicit definition in the preliminaries; add a short paragraph recalling the definition and its relation to the integral kernel.
[Numerical experiments] Figure 1: the caption does not indicate whether the plotted eigenvalues are for the population or empirical operator, nor the value of n used; this reduces clarity of the numerical illustration.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. We address each major comment below and indicate the revisions planned for the next version.

read point-by-point responses

Referee: [§3, Theorem 3.3] §3, Theorem 3.3: the operator-norm bound in (3.8) is stated to hold for merely measurable k, but the proof sketch invokes a uniform bound on the discontinuity set that is not quantified in terms of mu; without an explicit control on the measure of the set where k is discontinuous, the rate O(1/sqrt(n)) does not follow from the given estimates.

Authors: We appreciate the referee identifying this subtlety in the proof of the operator-norm bound. The current sketch does rely on controlling the discontinuity set without an explicit μ-measure term. We will revise the proof of Theorem 3.3 by decomposing the kernel into a part that is continuous μ×μ-almost everywhere and a remainder whose contribution is bounded using non-negativity, symmetry, and the fact that the empirical operator is an average of bounded measurable functions. This yields the stated O(1/sqrt(n)) rate directly from standard concentration inequalities in the operator norm without requiring further quantification of the discontinuity set. The revised proof will be included in the next version. revision: yes
Referee: [§4.1, Eq. (4.5)] §4.1, Eq. (4.5): the perturbation argument for eigenspace convergence assumes that the spectral gap of T is positive and independent of n, yet the paper does not verify that the empirical gap remains bounded away from zero uniformly in n when k is discontinuous; this step is load-bearing for the claimed rate on the projector difference.

Authors: We agree that the perturbation analysis in Section 4.1 requires a uniform lower bound on the empirical spectral gap. While the population gap is fixed and positive by assumption, the manuscript does not explicitly verify that the gap for T_n stays bounded away from zero. We will add a short lemma after Theorem 3.3 showing that the eigenvalue convergence implied by the operator-norm bound ensures that, with high probability, the gap of T_n is at least half the gap of T for all sufficiently large n. This justifies applying the perturbation result uniformly in n and supports the claimed rate for the projector difference. The addition will appear in the revised manuscript. revision: yes

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper establishes spectral convergence of empirical integral operators to their population versions via a direct proof under explicit assumptions (non-negative symmetric kernel, compact probability metric space, i.i.d. uniform samples). The derivation chain proceeds from measurability and integrability conditions ensuring the operators are Hilbert-Schmidt, through standard empirical process bounds or concentration inequalities to operator-norm or HS-norm convergence, and finally to spectral convergence. No step reduces by construction to a fitted parameter, self-referential definition, or load-bearing self-citation; the result is not equivalent to its inputs but follows from them as an independent theorem. The relaxation of continuity is handled by working in L2(mu) without invoking prior author-specific uniqueness results.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Based solely on the abstract, the central claim rests on standard domain assumptions about the metric space and kernel properties; no free parameters, invented entities, or ad-hoc axioms are apparent.

axioms (2)

domain assumption The space (X, d, mu) is a compact probability metric space with mu a probability measure.
Invoked to ensure the empirical measures mu_n converge to mu and to support the integral operator definitions.
domain assumption The kernel k is non-negative and symmetric.
Stated explicitly as the relaxed condition under which convergence holds.

pith-pipeline@v0.9.0 · 5381 in / 1266 out tokens · 36633 ms · 2026-05-10T15:58:24.885296+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

25 extracted references · 16 canonical work pages

[1]

Harmonic functions on metric mea- sure spaces

Tomasz Adamowicz, Micha l Gaczkowski, and Przemys law G´ orka. “Harmonic functions on metric mea- sure spaces”. In:Revista Matem´ atica Complutense32.1 (July 2018), pp. 141–186.issn: 1988-2807.doi: 10.1007/s13163-018-0272-7.url:http://dx.doi.org/10.1007/s13163-018-0272-7

work page doi:10.1007/s13163-018-0272-7.url:http://dx.doi.org/10.1007/s13163-018-0272-7 2018
[2]

Asymptotically mean value harmonic functions in subriemannian and RCD settings

Tomasz Adamowicz, Antoni Kijowski, and Elefterios Soultanis. “Asymptotically mean value harmonic functions in subriemannian and RCD settings”. In:The Journal of Geometric Analysis33.3 (2023), p. 80

2023
[3]

Uniform Glivenko-Cantelli Theorems and Concentration of Measure in the Math- ematical Modelling of Learning

Martin Anthony. “Uniform Glivenko-Cantelli Theorems and Concentration of Measure in the Math- ematical Modelling of Learning”. In: 2002.url:https://api.semanticscholar.org/CorpusID: 16810861

2002
[4]

The Numerical Solution of the Eigenvalue Problem for Compact Integral Oper- ators

Kendall Atkinson. “The Numerical Solution of the Eigenvalue Problem for Compact Integral Oper- ators”. In:Transactions of The American Mathematical Society - TRANS AMER MATH SOC129 (Mar. 1967).doi:10.2307/1994601

work page doi:10.2307/1994601 1967
[5]

Convergence of Laplacian Eigenmaps

Mikhail Belkin and Partha Niyogi. “Convergence of Laplacian Eigenmaps”. In:Advances in Neural In- formation Processing Systems. Ed. by B. Sch¨ olkopf, J. Platt, and T. Hoffman. Vol. 19. MIT Press, 2006. url:https://proceedings.neurips.cc/paper_files/paper/2006/file/5848ad959570f87753a60ce8be1567f3- Paper.pdf

2006
[6]

Belkin and P

Mikhail Belkin and Partha Niyogi. “Laplacian Eigenmaps for Dimensionality Reduction and Data Rep- resentation”. In:Neural Computation15.6 (2003), pp. 1373–1396.doi:10.1162/089976603321780317

work page doi:10.1162/089976603321780317 2003
[7]

Towards a theoretical foundation for Laplacian-based manifold methods

Mikhail Belkin and Partha Niyogi. “Towards a theoretical foundation for Laplacian-based manifold methods”. In:Journal of Computer and System Sciences74.8 (2008). Learning Theory 2005, pp. 1289– 1308.issn: 0022-0000.doi:https : / / doi . org / 10 . 1016 / j . jcss . 2007 . 08 . 006.url:https : //www.sciencedirect.com/science/article/pii/S0022000007001274

2008
[8]

Graph Laplacians on Singular Manifolds: Toward understanding complex spaces: graph Laplacians on manifolds with singularities and boundaries

Mikhail Belkin et al. “Graph Laplacians on Singular Manifolds: Toward understanding complex spaces: graph Laplacians on manifolds with singularities and boundaries”. In:Journal of Machine Learning Research23 (Nov. 2012)

2012
[9]

Is the maximal function of a Lipschitz function continuous?

Stephen M. Buckley. “Is the maximal function of a Lipschitz function continuous?” eng. In:Annales Academiae Scientiarum Fennicae. Mathematica24.2 (1999).url:http://eudml.org/doc/120215

1999
[11]

Society for Industrial and Applied Mathematics, 2011.doi:10.1137/1.9781611970678

Fran¸ coise Chatelin.Spectral Approximation of Linear Operators. Society for Industrial and Applied Mathematics, 2011.doi:10.1137/1.9781611970678. eprint:https://epubs.siam.org/doi/pdf/ 10.1137/1.9781611970678.url:https://epubs.siam.org/doi/abs/10.1137/1.9781611970678

work page doi:10.1137/1.9781611970678 2011
[12]

Kernel spectral clustering of large dimensional data

Romain Couillet and Florent Benaych-Georges. “Kernel spectral clustering of large dimensional data”. In:Electronic Journal of Statistics10 (Jan. 2016).doi:10.1214/16-EJS1144

work page doi:10.1214/16-ejs1144 2016
[13]

Manuel Dias and David Tewodrose.Spectral properties of symmetrized AMV operators. 2025. arXiv: 2411.10202 [math.AP].url:https://arxiv.org/abs/2411.10202. 29

work page arXiv 2025
[14]

Springer Berlin Heidelberg, 1995.isbn: 9783642662829

Tosio Kato.Perturbation Theory for Linear Operators. Springer Berlin Heidelberg, 1995.isbn: 9783642662829. doi:10.1007/978-3-642-66282-9.url:http://dx.doi.org/10.1007/978-3-642-66282-9

work page doi:10.1007/978-3-642-66282-9.url:http://dx.doi.org/10.1007/978-3-642-66282-9 1995
[15]

A Laplacian on metric measure spaces

Simon L. Kokkendorff. “A Laplacian on metric measure spaces”. Unpublished note. 2006.url:http: //www2.mat.dtu.dk/people/oldusers/S.L.Kokkendorff/Papers/Laplacian.pdf

2006
[16]

Random matrix approximation of spectra of integral operators

Vladimir Koltchinskii and Evarist Gin´ e. “Random Matrix Approximation of Spectra of Integral Oper- ators”. In:Bernoulli6 (Feb. 2000).doi:10.2307/3318636

work page doi:10.2307/3318636 2000
[17]

Dynamic nested sampling: an improved algorithm for parameter estimation and evidence calculation,

Ulrike von Luxburg. “A tutorial on spectral clustering”. In:Statistics and Computing17.4 (Aug. 2007), pp. 395–416.issn: 1573-1375.doi:10.1007/s11222- 007- 9033- z.url:http://dx.doi.org/10. 1007/s11222-007-9033-z

work page doi:10.1007/s11222- 2007
[18]

Consistency of spectral clustering

Ulrike von Luxburg, Mikhail Belkin, and Olivier Bousquet. “Consistency of spectral clustering”. In: The Annals of Statistics36.2 (Apr. 2008).issn: 0090-5364.doi:10.1214/009053607000000640.url: http://dx.doi.org/10.1214/009053607000000640

work page doi:10.1214/009053607000000640.url: 2008
[19]

A Few Notes on Statistical Learning Theory

Shahar Mendelson. “A Few Notes on Statistical Learning Theory”. In:Advanced Lectures in Machine Learning2600 (Aug. 2002).doi:10.1007/3-540-36434-X_1

work page doi:10.1007/3-540-36434-x_1 2002
[20]

C., Hinneburg, A., & Keim, D

Shahar Mendelson. “A Few Notes on Statistical Learning Theory”. In:Advanced Lectures on Ma- chine Learning: Machine Learning Summer School 2002 Canberra, Australia, February 11–22, 2002 Revised Lectures. Ed. by Shahar Mendelson and Alexander J. Smola. Berlin, Heidelberg: Springer Berlin Heidelberg, 2003, pp. 1–40.isbn: 978-3-540-36434-4.doi:10.1007/3- 540...

work page doi:10.1007/3- 2002
[21]

Symmetrized and non-symmetrized Asymptotic Mean Value Laplacian in metric measure spaces

Andreas Minne and David Tewodrose. “Symmetrized and non-symmetrized Asymptotic Mean Value Laplacian in metric measure spaces”. In:Proceedings of the Royal Society of Edinburgh Section A: Mathematics(2023), pp. 1–38

2023
[22]

On Spectral Clustering: Analysis and an algorithm

Andrew Ng, Michael Jordan, and Yair Weiss. “On Spectral Clustering: Analysis and an algorithm”. In: Advances in Neural Information Processing Systems. Ed. by T. Dietterich, S. Becker, and Z. Ghahra- mani. Vol. 14. MIT Press, 2001.url:https://proceedings.neurips.cc/paper_files/paper/ 2001/file/801272ee79cfde7fa5960571fee36b9b-Paper.pdf

2001
[23]

Susovan Pal and David Tewodrose.Manifolds with kinks and the asymptotic behavior of the graph Laplacian operator with Gaussian kernel. 2026. arXiv:2507.07751 [math.DG].url:https://arxiv. org/abs/2507.07751

work page arXiv 2026
[24]

On Learning with Integral Operators

Lorenzo Rosasco, Mikhail Belkin, and Ernesto De Vito. “On Learning with Integral Operators”. In: Journal of Machine Learning Research11 (Feb. 2010), pp. 905–934.doi:10.1145/1756006.1756036

work page doi:10.1145/1756006.1756036 2010
[25]

Shorack and Jon A

Galen R. Shorack and Jon A. Wellner.Empirical Processes with Applications to Statistics. Society for Industrial and Applied Mathematics, 2009.doi:10 . 1137 / 1 . 9780898719017. eprint:https : //epubs.siam.org/doi/pdf/10.1137/1.9780898719017.url:https://epubs.siam.org/doi/abs/ 10.1137/1.9780898719017

work page doi:10.1137/1.9780898719017.url:https://epubs.siam.org/doi/abs/ 2009
[26]

van der Vaart and Jon A

Aad W. van der Vaart and Jon A. Wellner.Weak Convergence and Empirical Processes. Springer New York, 1996.isbn: 9781475725452.doi:10.1007/978-1-4757-2545-2.url:http://dx.doi.org/10. 1007/978-1-4757-2545-2. 30

work page doi:10.1007/978-1-4757-2545-2.url:http://dx.doi.org/10 1996