pith. machine review for the scientific record.

arxiv: 2605.14343 · v1 · submitted 2026-05-14 · 💻 cs.LG · math.ST · stat.ML · stat.TH

Recognition: 2 theorem links · Lean Theorem

Nearest-Neighbor Radii under Dependent Sampling


Pith reviewed 2026-05-15 01:36 UTC · model grok-4.3

classification 💻 cs.LG · math.ST · stat.ML · stat.TH
keywords nearest-neighbor radii · strong mixing · dependent sampling · intrinsic dimension · almost sure convergence · moment bounds · time series · machine learning

The pith

Nearest-neighbor radii converge almost surely to zero under polynomial strong mixing, with moment bounds controlled by local intrinsic dimension rather than ambient dimension.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines nearest-neighbor radii when data points are drawn from a dependent process rather than independent samples. It proves that polynomial-rate strong mixing is enough to guarantee distribution-free almost-sure convergence of these radii to zero. Under the stronger geometric mixing condition it supplies explicit non-asymptotic bounds on the moments of the radii. These bounds scale with the local intrinsic dimension of the data, which allows the results to cover high-dimensional observations that concentrate near lower-dimensional manifolds. The findings matter because many practical datasets, including time series, exhibit dependence yet still possess the low-dimensional structure that makes nearest-neighbor geometry useful.

Core claim

For observations satisfying strong mixing, the nearest-neighbor radius around a fixed point converges almost surely to zero when the mixing coefficients decay polynomially, and the moments of the radius admit sharp non-asymptotic upper bounds when the mixing is geometric; both sets of results are distribution-free and the moment bounds depend only on the local intrinsic dimension.

What carries the argument

The nearest-neighbor radius, the distance from a query point to its k-th nearest neighbor, analyzed under strong-mixing dependence in the observation sequence.

If this is right

  • Nearest-neighbor estimators remain consistent for dependent data once the mixing rate is polynomial.
  • High-dimensional data lying near a lower-dimensional manifold inherit the same moment bounds on nearest-neighbor radii as in the independent case, provided geometric mixing holds.
  • Time-series benchmarks can safely employ nearest-neighbor methods when dependence decays geometrically.
  • The classical geometric properties of nearest neighbors extend to dependent sampling without requiring algorithmic modifications.
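The first bullet can be illustrated with a toy one-step-ahead k-NN forecaster on a geometrically mixing AR(1) series. This is a sketch under illustrative assumptions (phi = 0.8, noise sd 0.5, k = 25, none of which come from the paper), not the paper's experimental setup:

```python
import numpy as np

rng = np.random.default_rng(1)
PHI, SIGMA = 0.8, 0.5

def ar1(n):
    # geometrically mixing AR(1) series
    x = np.zeros(n)
    for t in range(1, n):
        x[t] = PHI * x[t - 1] + SIGMA * rng.standard_normal()
    return x

def knn_forecast(history, query, k=25):
    # average the successors of the k past states nearest to `query`
    states, successors = history[:-1], history[1:]
    idx = np.argsort(np.abs(states - query))[:k]
    return successors[idx].mean()

x = ar1(5000)
train, test = x[:4000], x[4000:]
preds = np.array([knn_forecast(train, q) for q in test[:-1]])
mse = float(np.mean((preds - test[1:]) ** 2))
print(mse)  # should approach the irreducible noise variance SIGMA**2 = 0.25
```

Despite the dependence, the forecast error approaches the noise floor, while predicting the unconditional mean would incur the much larger stationary variance SIGMA**2 / (1 - PHI**2) ≈ 0.69, which is the sense in which the estimator "remains consistent" here.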

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the mixing rate can be estimated from data, one could check whether a given time series satisfies the conditions needed for the convergence and moment guarantees.
  • The same radius analysis may apply directly to other local nonparametric procedures such as kernel density estimation or graph-based clustering under dependence.
  • Explicit convergence rates that incorporate the mixing coefficients could be derived as a direct follow-up.
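The second extension is concrete in the entropy-estimation direction (cf. the paper's Figure 2): the Kozachenko-Leonenko estimator is built directly from nearest-neighbor radii, so radius control under mixing is exactly what its consistency would require. A 1-D sketch for k = 1, using the approximations psi(n) ≈ log n and psi(1) = -gamma; the Gaussian test case is our choice, not the paper's:

```python
import numpy as np

GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def kl_entropy_1d(x):
    # Kozachenko-Leonenko (k = 1) differential-entropy estimate in 1-D:
    # H_hat = log(n) + gamma + log(2) + mean(log r_i),
    # where r_i is the nearest-neighbor radius of point i
    xs = np.sort(np.asarray(x, dtype=float))
    gaps = np.diff(xs)
    left = np.concatenate(([np.inf], gaps))
    right = np.concatenate((gaps, [np.inf]))
    r = np.minimum(left, right)  # each point's nearest-neighbor distance
    return float(np.log(len(xs)) + GAMMA + np.log(2.0) + np.mean(np.log(r)))

rng = np.random.default_rng(2)
est = kl_entropy_1d(rng.standard_normal(20000))
true_h = float(0.5 * np.log(2 * np.pi * np.e))  # entropy of N(0, 1), ~1.4189
print(est, true_h)
```

Because the estimate is a function of the log radii, moment bounds on nearest-neighbor radii under mixing translate directly into bias and variance control for this estimator on dependent data.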

Load-bearing premise

The observations obey strong mixing conditions with a decay rate that is known or can be bounded from above.

What would settle it

A counterexample sequence whose mixing coefficients decay more slowly than any polynomial rate and whose nearest-neighbor radii fail to converge almost surely to zero.

Figures

Figures reproduced from arXiv: 2605.14343 by Yilong Hou, Yuanyuan Gao, Zhexiao Lin.

Figure 1. Slope summary for Experiment 1: rows correspond to [PITH_FULL_IMAGE:figures/full_fig_p011_1.png]
Figure 2. Experiment 2 entropy estimation. The dotted horizontal line is the true entropy. [PITH_FULL_IMAGE:figures/full_fig_p012_2.png]
Figure 3. Comparison of long-horizon forecasting using test MSE across datasets and forecast [PITH_FULL_IMAGE:figures/full_fig_p014_3.png]
Figure 4. Comparisons on short-horizon forecasting and classification. The left panel com [PITH_FULL_IMAGE:figures/full_fig_p015_4.png]
Figure 5. Comparison of long-horizon forecasting using test MAE across datasets and forecast [PITH_FULL_IMAGE:figures/full_fig_p031_5.png]
read the original abstract

Nearest-neighbor methods are fundamental to classical and modern machine learning, yet their geometric properties are typically analyzed under independent sampling. In this paper, we study the nearest-neighbor radii under dependent sampling. We consider strong mixing dependent observations and ask whether dependence changes the scale of nearest-neighbor neighborhoods. We establish distribution-free almost sure convergence under polynomial mixing and sharp non-asymptotic moment bounds under geometric mixing. The moment bounds depend on the local intrinsic dimension rather than the ambient dimension, making the results applicable to high-dimensional data concentrated near lower-dimensional manifolds. Synthetic experiments and real-world time-series benchmarks support the theory, showing that nearest-neighbor geometry remains informative under dependent sampling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes nearest-neighbor radii for observations drawn from a strongly mixing stationary process. It claims to establish distribution-free almost-sure convergence of these radii under polynomial mixing and to derive sharp non-asymptotic moment bounds under geometric mixing; the moment bounds are stated to depend on local intrinsic dimension rather than ambient dimension. Synthetic experiments and real-world time-series benchmarks are invoked in support.

Significance. If the stated convergence and moment results hold with the required mixing-rate qualifications, the work would usefully extend classical nearest-neighbor geometry to dependent sampling regimes common in time series and spatial data. The intrinsic-dimension dependence is a practically relevant feature for high-dimensional manifold-supported data. The presence of both theoretical bounds and empirical checks adds value, though the strength of the contribution hinges on the precision of the mixing assumptions.

major comments (2)
  1. [Main convergence theorem] Statement of the almost-sure convergence result: the claim is made under generic 'polynomial mixing' without an explicit decay-rate threshold (e.g., mixing coefficients β(n) = O(n^{-α}) with α > 1 to guarantee summability in the dependent Borel-Cantelli lemma). If the paper only assumes polynomial decay without this qualification, the a.s. convergence does not hold for all polynomially mixing processes, rendering the distribution-free claim load-bearing and incomplete.
  2. [Moment bounds section] Non-asymptotic moment bounds under geometric mixing: the sharpness claim and the explicit dependence on intrinsic dimension are central, yet the bounds must be shown to remain valid when the geometric mixing rate is only marginally faster than the polynomial case; otherwise the separation between the two regimes is not fully justified.
minor comments (2)
  1. [Abstract] The abstract states that experiments 'support the theory' but supplies no quantitative metrics, dataset descriptions, or comparison baselines; this weakens the empirical section.
  2. [Preliminaries] Notation for the mixing coefficients and the radius random variables should be introduced once in a dedicated preliminaries section and used consistently thereafter.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments on our manuscript. The points raised regarding the precise mixing-rate qualifications and the separation of regimes are helpful for clarifying the scope of the results. We address each major comment below and indicate the planned revisions.

read point-by-point responses
  1. Referee: [Main convergence theorem] Statement of the almost-sure convergence result: the claim is made under generic 'polynomial mixing' without an explicit decay-rate threshold (e.g., mixing coefficients β(n) = O(n^{-α}) with α > 1 to guarantee summability in the dependent Borel-Cantelli lemma). If the paper only assumes polynomial decay without this qualification, the a.s. convergence does not hold for all polynomially mixing processes, rendering the distribution-free claim load-bearing and incomplete.

    Authors: We agree that the almost-sure convergence requires summability of the mixing coefficients for the dependent Borel-Cantelli lemma to apply. Our proof indeed relies on this, so the generic 'polynomial mixing' phrasing is imprecise. We will revise the main theorem (and the abstract) to explicitly require β(n) = O(n^{-α}) for some α > 1. This qualification preserves the distribution-free character of the result while making the assumption complete and necessary. revision: yes

  2. Referee: [Moment bounds section] Non-asymptotic moment bounds under geometric mixing: the sharpness claim and the explicit dependence on intrinsic dimension are central, yet the bounds must be shown to remain valid when the geometric mixing rate is only marginally faster than the polynomial case; otherwise the separation between the two regimes is not fully justified.

    Authors: Geometric mixing is defined via exponential decay β(n) ≤ Cρ^n with ρ < 1, which is strictly faster than any polynomial rate. The non-asymptotic moment bounds exploit this exponential decay to obtain the local-dimension dependence and sharpness. Rates that are only marginally faster than polynomial fall outside the geometric-mixing regime and would require separate analysis; we do not claim the same bounds hold there. We will add a short clarifying paragraph in the discussion section distinguishing the regimes and noting the role of exponential decay in the proof. revision: partial
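The arithmetic separating the two regimes in this exchange is just summability of the mixing coefficients; a hedged sketch (the constants C and rho are generic, not taken from the paper):

```latex
% Polynomial mixing: the series is summable iff the exponent exceeds one,
% which is what the dependent Borel--Cantelli argument consumes:
\sum_{n=1}^{\infty} \beta(n) \;\le\; C \sum_{n=1}^{\infty} n^{-\alpha}
  \;<\; \infty \quad \Longleftrightarrow \quad \alpha > 1.
% Geometric mixing: summable for every admissible rate, with
% exponentially small tails, which the non-asymptotic moment bounds exploit:
\sum_{n=1}^{\infty} \beta(n) \;\le\; C \sum_{n=1}^{\infty} \rho^{\,n}
  \;=\; \frac{C\rho}{1-\rho} \;<\; \infty \qquad (0 < \rho < 1).
```

A rate "only marginally faster than polynomial" (say n^{-log n}) is still summable but lacks the exponential tail, which is why the authors place it outside the geometric-mixing analysis.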

Circularity Check

0 steps flagged

No circularity: results derived from external mixing assumptions via standard ergodic tools

full rationale

The paper derives almost-sure convergence and moment bounds for nearest-neighbor radii from strong mixing conditions (polynomial or geometric decay) using dependent versions of Borel-Cantelli and moment inequalities. These assumptions are stated as external modeling inputs rather than defined in terms of the target radii or fitted from the same data. No self-citation chain, ansatz smuggling, or renaming of known results occurs; the intrinsic-dimension dependence follows directly from volume arguments under the mixing hypothesis. The derivation chain remains non-circular and externally falsifiable.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claims rest on strong mixing assumptions whose decay rates are taken as given; no free parameters are fitted inside the theorems themselves, and no new entities are postulated.

axioms (1)
  • domain assumption Observations satisfy strong mixing with polynomial or geometric decay of dependence coefficients.
    Invoked to replace independence in the classical nearest-neighbor arguments; appears in the statement of the main theorems.

pith-pipeline@v0.9.0 · 5410 in / 1158 out tokens · 26543 ms · 2026-05-15T01:36:21.409022+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

59 extracted references · 5 canonical work pages
