Exact Closed-Form Formulae for Linear and Circular Continuous Scan Statistics: P_c(N - 1; N, w), P_c(3; N, w), and P(3; N, w)
Pith reviewed 2026-05-07 15:15 UTC · model grok-4.3
The pith
Exact closed-form expressions are derived for the continuous scan statistics P_c(N-1; N, w), P_c(3; N, w), and P(3; N, w) at arbitrary N and window width w.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the cumulative distribution functions P_c(N-1; N, w), P_c(3; N, w), and P(3; N, w) admit exact closed-form expressions, obtained by bypassing recursive methods and matching the underlying geometric probabilities of uniform points on an interval or circle. The survival function 1 - P_c(k; N, w) equals the probability that N random arcs of length 1 - w cover every point on the circle at least N + 1 - k times.
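The coverage statement can be checked sample by sample rather than only in distribution. The sketch below is our own illustration, not code from the paper. It assumes P_c(k; N, w) is the probability that some arc of length w holds at least k points, and it attaches to each point X_j the arc (X_j + w, X_j + 1) of length 1 - w. The helper names `circular_scan_count` and `min_coverage` are hypothetical. Under these assumptions a location is missed by arc j exactly when X_j lies in the window of width w ending at that location, so the minimum coverage should equal N minus the circular scan count.

```python
import numpy as np

def circular_scan_count(x, w):
    """Largest number of points in any arc of length w on the unit circle."""
    x = np.sort(np.asarray(x, dtype=float) % 1.0)
    n = len(x)
    ext = np.concatenate([x, x + 1.0])            # unwrap the circle once
    # A best window can be slid until it starts at a data point.
    hi = np.searchsorted(ext, x + w, side="right")
    return int(np.max(hi - np.arange(n)))

def min_coverage(x, w):
    """Smallest number of arcs (X_j + w, X_j + 1) covering any circle point."""
    x = np.asarray(x, dtype=float) % 1.0
    # Coverage is piecewise constant between arc endpoints, so it is enough
    # to evaluate it at the midpoint of every elementary interval.
    ends = np.sort(np.concatenate([x, (x + w) % 1.0]))
    nxt = np.roll(ends, -1)
    nxt[-1] += 1.0                                 # wrap the last interval
    mids = ((ends + nxt) / 2.0) % 1.0
    offset = (mids[:, None] - x[None, :] - w) % 1.0
    return int(np.min(np.sum(offset < 1.0 - w, axis=1)))

rng = np.random.default_rng(0)
n, w, k = 6, 0.3, 3
for _ in range(1000):
    x = rng.random(n)
    s, c = circular_scan_count(x, w), min_coverage(x, w)
    assert c == n - s                              # pathwise identity
    assert (c >= n + 1 - k) == (s < k)             # the two events coincide
```

The two functions compute their quantities independently (a sliding window versus a direct arc count), so agreement on every sample is a genuine check of the identity rather than a restatement of it.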
What carries the argument
Direct closed-form expressions for the scan statistic probabilities, built from combinatorial counting of spacings and arc-covering configurations on the circle or line.
If this is right
- The expressions supply exact baseline distributions for extreme spacings among uniform points.
- They reduce computational effort by eliminating the need for iterative or recursive evaluation at every N and w.
- The circular versions connect directly to multiple-coverage probabilities by N arcs of fixed length.
- The formulas hold for any positive integer N and any window width w between 0 and 1.
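The coverage link in the list above can be illustrated in the one case where both sides are classical, k = N. This is a sketch under stated assumptions and does not use any of the paper's three new formulas, which are not reproduced on this page. It relies on Stevens' formula for covering a circle with random arcs (reference [14] below) and on the classical value P_c(N; N, w) = N w^(N-1) for 0 < w <= 1/2. The function names are hypothetical.

```python
from math import comb, floor

def stevens_cover_probability(n_arcs, arc_len):
    """Stevens (1939): chance that n_arcs uniform random arcs of length
    arc_len (0 < arc_len < 1) cover the whole unit circle."""
    j_max = floor(1.0 / arc_len)
    return sum(
        (-1) ** j * comb(n_arcs, j) * max(1.0 - j * arc_len, 0.0) ** (n_arcs - 1)
        for j in range(j_max + 1)
    )

def p_c_all_points(n, w):
    """Classical P_c(N; N, w) = N w^(N-1) for 0 < w <= 1/2: all N points
    fall inside some arc of length w."""
    return n * w ** (n - 1)

for n in (3, 5, 8):
    for w in (0.1, 0.25, 0.5):
        lhs = 1.0 - p_c_all_points(n, w)              # survival function at k = N
        rhs = stevens_cover_probability(n, 1.0 - w)   # covered at least N + 1 - k = 1 times
        assert abs(lhs - rhs) < 1e-12
```

With k = N the required multiplicity N + 1 - k is 1, so the survival function reduces to ordinary single coverage by arcs of length 1 - w, which is the quantity Stevens' formula gives.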
Where Pith is reading between the lines
- The closed forms could be used to calibrate simulation studies or to test software implementations of scan-statistic algorithms that also handle other values of k.
- Similar spacing arguments might extend to derive closed forms for additional small values of k such as 4 or 5.
- In applied settings the exact expressions allow precise p-value calculation when testing for clustering in one-dimensional spatial data.
Load-bearing premise
The algebraic manipulations in the derivations produce expressions that exactly equal the geometric probabilities defined by placing N uniform points or arcs.
What would settle it
Generate thousands of independent samples of N uniform points on the unit interval or circle, compute the empirical fraction satisfying the scan condition for chosen N and w, and check whether the value matches the closed-form expression within sampling error.
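A minimal sketch of that check follows, assuming the scan condition is "some window of width w contains at least k points". The paper's closed forms are not reproduced on this page, so `reference_value` is a stand-in that uses the classical circular result P_c(N; N, w) = N w^(N-1) for w <= 1/2; a reader would replace it with the formula under test. The names `scan_count`, `estimate`, and `reference_value` are hypothetical.

```python
import numpy as np

def scan_count(x, w, circular):
    """Largest number of points in any window of width w (0 < w < 1)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    ext = np.concatenate([x, x + 1.0]) if circular else x
    # A best window can be slid until its left end sits on a data point.
    hi = np.searchsorted(ext, x + w, side="right")
    return int(np.max(hi - np.arange(n)))

def estimate(k, n, w, circular, trials, rng):
    """Empirical probability that some window holds at least k points,
    together with its binomial standard error."""
    hits = sum(scan_count(rng.random(n), w, circular) >= k for _ in range(trials))
    p_hat = hits / trials
    return p_hat, (p_hat * (1.0 - p_hat) / trials) ** 0.5

def reference_value(n, w):
    # Stand-in for a closed form: classical P_c(N; N, w) = N w^(N-1), w <= 1/2.
    return n * w ** (n - 1)

rng = np.random.default_rng(2026)
n, w = 5, 0.4
p_hat, se = estimate(k=n, n=n, w=w, circular=True, trials=20000, rng=rng)
z = (p_hat - reference_value(n, w)) / se
print(f"empirical={p_hat:.4f}  reference={reference_value(n, w):.4f}  z={z:+.2f}")
```

A z-score within a few units of zero is consistent with the reference value. A persistent large |z| across several (N, w) pairs would point to an error in either the formula or the simulation.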
Original abstract
The continuous linear $P(k; N, w)$ and circular scan statistics $P_c(k; N, w)$ are fundamental tools in probability and spatial statistics, frequently used to detect clustering in uniform data. Let $X_1, X_2, \dots, X_N$ be independently and uniformly distributed random variables on a unit interval or unit ring. The exact distribution of these scan statistics relies on the minimum window width required to capture exactly $k$ points. Furthermore, the survival function $1 - P_c(k; N, w)$ directly corresponds to the geometric probability that if $N$ arcs of length $1 - w$ are uniformly and randomly placed on a unit circle, every point on the circle is covered at least $N + 1 - k$ times. Historically, evaluating the exact cumulative distribution functions, $P(k; N, w)$ and $P_c(k; N, w)$, relies heavily on complex recursive approximations. In this paper, we bypass these traditional recursive methods to derive direct, generalized closed-form expressions for some linear and circular continuous scan statistics. Specifically, we present the exact analytical solutions for $P_c(N - 1; N, w)$, $P_c(3; N, w)$, and $P(3; N, w)$ for arbitrary values of $N$ and window width $w$. These newly derived closed-form expressions not only provide exact baseline distributions for extreme spacings but also significantly simplify computational complexity compared to existing iterative approaches.
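The abstract ties the distribution to a minimum window width. As a hedged illustration, the sketch below reads this as W_k, the smallest window that contains k of the N points, so that the scan event at width w would be W_k <= w. That reading is our assumption. The function name `min_window_width` is hypothetical.

```python
import numpy as np

def min_window_width(points, k, circular=False):
    """Smallest window width that contains k of the given points."""
    x = np.sort(np.asarray(points, dtype=float))
    n = len(x)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= number of points")
    if circular:
        # Append the first k - 1 points shifted by one full turn so that
        # windows wrapping past 1 are included.
        ext = np.concatenate([x, x[:k - 1] + 1.0])
        return float(np.min(ext[k - 1:] - ext[:n]))
    # Linear case: span of every run of k consecutive order statistics.
    return float(np.min(x[k - 1:] - x[:n - k + 1]))

pts = [0.05, 0.10, 0.50, 0.90, 0.95]
print(min_window_width(pts, 3))                  # about 0.45 on the interval
print(min_window_width(pts, 3, circular=True))   # about 0.15 on the circle
```

On the circle the three points 0.90, 0.95, and 0.05 sit within a short wrapping arc, which is why the circular width is smaller than the linear one for the same data.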
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript derives exact closed-form expressions for the continuous linear scan statistic P(3; N, w) and the circular scan statistics P_c(N-1; N, w) and P_c(3; N, w) for arbitrary N and window width w, obtained via direct combinatorial counting of spacing configurations among N uniform order statistics on the unit interval or circle.
Significance. If the derivations hold, these formulas supply exact, non-recursive, parameter-free expressions that match the underlying geometric probability definitions exactly. This provides verifiable baselines for extreme spacings in scan statistics, simplifies computation relative to recursive methods, and strengthens the toolkit for spatial statistics and probability applications. The direct combinatorial approach without approximation or hidden dependencies is a clear strength.
minor comments (2)
- [Abstract] The abstract states the results but could briefly indicate the combinatorial spacing-counting technique to orient readers before the detailed derivations.
- [Introduction] Notation for the survival function 1 - P_c(k; N, w) and its coverage interpretation is introduced but would benefit from an explicit cross-reference to the relevant equation when first used.
Simulated Author's Rebuttal
We thank the referee for their positive review, recognition of the combinatorial approach, and recommendation to accept the manuscript. We appreciate the assessment that the closed-form expressions provide exact baselines and simplify computation.
Circularity Check
No significant circularity; the derivations are direct combinatorial arguments.
full rationale
The paper derives P_c(N-1; N, w), P_c(3; N, w), and P(3; N, w) via direct combinatorial counting of spacing configurations on the interval and circle. These expressions match the geometric probability definitions exactly through finite sums over uniform order statistics, without recursion, approximation, fitted parameters, or load-bearing self-citations. The approach is self-contained against the stated uniform-point definitions and requires no external uniqueness theorems or prior results from the same authors to close the derivation chain.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: X_1, ..., X_N are independently and uniformly distributed on the unit interval or unit circle.
Reference graph
Works this paper leans on
- [1] William Burnside and Andrew Russell Forsyth, Theory of probability, Cambridge University Press, 1928.
- [2] Joseph Glaz and Joseph Naus, Multiple coverage of the line, The Annals of Probability (1979), 900–906.
- [3] Joseph Glaz, Joseph Naus, and Sylvan Wallenstein, Scanning n uniform distributed points: Bounds, Scan Statistics, Springer, 2001, pp. 141–159.
- [4]
- [5] Fred W Huffer and Chien-Tai Lin, Approximating the distribution of the scan statistic using moments of the number of clumps, Journal of the American Statistical Association 92 (1997), no. 440, 1466–1475.
- [6] Fred W Huffer and Chien-Tai Lin, Computing the exact distribution of the extremes of sums of consecutive spacings, Computational Statistics & Data Analysis 26 (1997), no. 2, 117–132.
- [7] Fred W Huffer and Chien-Tai Lin, Computing the joint distribution of general linear combinations of spacings or exponential variates, Statistica Sinica (2001), 1141–1157.
- [8] Raymond J Huntington and Joseph I Naus, A simpler expression for k th nearest neighbor coincidence probabilities, The Annals of Probability (1975), 894–896.
- [9] Samuel Karlin and James McGregor, Coincidence probabilities (1959).
- [10] Chien-Tai Lin and Chung-Tao Lu, A two-stage algorithm for computation of the exact distribution of sums of spacings, Communications in Statistics-Simulation and Computation 46 (2017), no. 10, 8205–8217.
- [11] Joseph I Naus, The distribution of the size of the maximum cluster of points on a line, Journal of the American Statistical Association 60 (1965), no. 310, 532–538.
- [12] Joseph Irwin Naus, Clustering of random points in line and plane, Harvard University Press, 1963.
- [13] Emanuel Parzen, Modern probability theory and its applications, Wiley, 1960.
- [14] WL Stevens, Solution to a geometrical problem in probability, Annals of Eugenics 9 (1939), no. 4, 315–320.
- [15] Ph. van Elteren and H. J. M. Gerrits, Een wachtprobleem voorkomende bij drempelwaardemetingen aan het oog, Statistica Neerlandica 15 (1961), no. 4, 385–401.
Peking University. Email address: hwyuan25@stu.pku.edu.cn