arxiv: 2604.25360 · v1 · submitted 2026-04-28 · 🧮 math.PR · math.CO · math.ST · stat.TH

Exact Closed-Form Formulae for Linear and Circular Continuous Scan Statistics: P_c(N - 1; N, w), P_c(3; N, w), and P(3; N, w)

Pith reviewed 2026-05-07 15:15 UTC · model grok-4.3

classification 🧮 math.PR · math.CO · math.ST · stat.TH
keywords continuous scan statistics · closed-form expressions · circular scan statistics · linear scan statistics · uniform random points · geometric probability · clustering detection · extreme spacings

The pith

Exact closed-form expressions are derived for the continuous scan statistics P_c(N-1; N, w), P_c(3; N, w), and P(3; N, w) at arbitrary N and window width w.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes direct analytical formulas for three specific cases of linear and circular continuous scan statistics instead of relying on recursive approximations. P(k; N, w) and P_c(k; N, w) give the probability that some window of width w contains at least k of N uniform random points on the unit interval or unit circle, respectively. A reader would care because the formulas supply exact probabilities for detecting clusters without iterative computation or approximation error, for arbitrary N and w. The derivations cover the near-extreme circular case k = N - 1 and the case k = 3 in both the linear and circular settings.
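To make the event behind these probabilities concrete, the following is a minimal sketch, assuming the standard reading that the scan event occurs when the tightest window holding k points has width at most w. The function names `min_k_span_linear`, `min_k_span_circular`, and `scan_event` are hypothetical and are not taken from the paper.

```python
import numpy as np


def min_k_span_linear(points, k):
    """Width of the tightest interval on [0, 1] that holds k of the points."""
    x = np.sort(np.asarray(points, dtype=float))
    n = len(x)
    # Compare each order statistic with the one k - 1 positions later.
    return float(np.min(x[k - 1:] - x[: n - k + 1]))


def min_k_span_circular(points, k):
    """Length of the shortest arc on the unit circle that holds k of the points."""
    x = np.sort(np.asarray(points, dtype=float) % 1.0)
    n = len(x)
    # Unroll the circle by appending a second lap so wrap-around arcs are seen.
    ext = np.concatenate([x, x + 1.0])
    return float(np.min(ext[k - 1 : k - 1 + n] - ext[:n]))


def scan_event(points, k, w, circular=False):
    """True when some window of width w contains at least k points."""
    span = min_k_span_circular(points, k) if circular else min_k_span_linear(points, k)
    return bool(span <= w)
```

For the points 0.1, 0.15, 0.5, 0.95 with k = 3, the linear span is 0.4 while the circular span is 0.2, because the arc from 0.95 through 0.1 to 0.15 wraps past the origin. With w = 0.25 the scan event therefore occurs on the circle but not on the interval.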

Core claim

The central claim is that the cumulative distribution functions P_c(N-1; N, w), P_c(3; N, w), and P(3; N, w) admit exact closed-form expressions, obtained by computing the underlying geometric probabilities for uniform points on an interval or circle directly rather than recursively. The paper also notes that the survival function 1 - P_c(k; N, w) equals the probability that N random arcs of length 1 - w cover every point on the circle at least N + 1 - k times.
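The coverage identity can be checked sample by sample. The sketch below assumes each arc of length 1 - w starts at one of the N points and runs in the positive direction. A location t is then missed by arc i exactly when the point X_i lies within distance w ahead of t, so the coverage depth at t is N minus the number of points in a window of width w. The helper names are hypothetical and this is an illustration, not the paper's derivation.

```python
import numpy as np


def min_k_span_circular(points, k):
    """Length of the shortest arc on the unit circle that holds k of the points."""
    x = np.sort(np.asarray(points, dtype=float) % 1.0)
    n = len(x)
    ext = np.concatenate([x, x + 1.0])
    return float(np.min(ext[k - 1 : k - 1 + n] - ext[:n]))


def min_coverage_depth(starts, arc_len):
    """Smallest number of arcs [s, s + arc_len) covering any location on the circle."""
    starts = np.asarray(starts, dtype=float) % 1.0
    ends = (starts + arc_len) % 1.0
    cuts = np.sort(np.concatenate([starts, ends]))
    # Depth is constant between consecutive cut points, so probe each midpoint.
    nxt = np.roll(cuts, -1)
    nxt[-1] += 1.0
    probes = ((cuts + nxt) / 2.0) % 1.0
    offsets = (probes[:, None] - starts[None, :]) % 1.0
    return int((offsets < arc_len).sum(axis=1).min())


def duality_agrees(n, k, w, trials=2000, seed=0):
    """Check that 'no k-cluster within w' matches 'depth >= n + 1 - k' on every sample."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        x = rng.random(n)
        no_cluster = min_k_span_circular(x, k) > w
        well_covered = min_coverage_depth(x, 1.0 - w) >= n + 1 - k
        if no_cluster != well_covered:
            return False
    return True
```

Calling `duality_agrees(6, 3, 0.2)` compares the two events on 2000 random configurations. Agreement on every sample is what the identity predicts, up to floating-point ties that have probability zero.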

What carries the argument

Direct closed-form expressions for the scan statistic probabilities, built from combinatorial counting of spacings and arc-covering configurations on the circle or line.

If this is right

  • The expressions supply exact baseline distributions for extreme spacings among uniform points.
  • They reduce computational effort by eliminating the need for iterative or recursive evaluation at every N and w.
  • The circular versions connect directly to multiple-coverage probabilities by N arcs of fixed length.
  • The formulas hold for arbitrary N (large enough for the statistic to be defined, for example N ≥ 3 when k = 3) and any window width w between 0 and 1.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The closed forms could be used to calibrate simulation studies or to test software implementations of scan-statistic algorithms for other values of k.
  • Similar spacing arguments might extend to derive closed forms for additional small values of k such as 4 or 5.
  • In applied settings the exact expressions allow precise p-value calculation when testing for clustering in one-dimensional spatial data.

Load-bearing premise

The algebraic manipulations in the derivations produce expressions that exactly equal the geometric probabilities defined by placing N uniform points or arcs.

What would settle it

Generate thousands of independent samples of N uniform points on the unit interval or circle, compute the empirical fraction satisfying the scan condition for chosen N and w, and check whether the value matches the closed-form expression within sampling error.
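A minimal sketch of such a check follows. Because the paper's closed forms are not reproduced on this page, the harness is validated against the classical k = 2 result instead, which is an assumption brought in from outside the paper: no two points fall within w of each other with probability (1 - (N-1)w)^N on the interval and (1 - Nw)^(N-1) on the circle, each truncated at zero. The names `scan_prob_mc` and `p2_exact` are hypothetical. A candidate formula for k = 3 or k = N - 1 would be compared against `scan_prob_mc` in the same way.

```python
import numpy as np


def scan_prob_mc(k, n, w, circular=False, trials=200_000, seed=0):
    """Monte Carlo estimate of P(k; n, w) or P_c(k; n, w), with its standard error."""
    rng = np.random.default_rng(seed)
    x = np.sort(rng.random((trials, n)), axis=1)
    if circular:
        # Append a second lap so arcs that wrap past the origin are included.
        x = np.concatenate([x, x + 1.0], axis=1)
        spans = x[:, k - 1 : k - 1 + n] - x[:, :n]
    else:
        spans = x[:, k - 1 :] - x[:, : n - k + 1]
    hits = spans.min(axis=1) <= w
    p = hits.mean()
    se = np.sqrt(p * (1.0 - p) / trials)
    return float(p), float(se)


def p2_exact(n, w, circular=False):
    """Classical k = 2 closed form (not from the paper), used only as a baseline."""
    gaps = n if circular else n - 1
    power = n - 1 if circular else n
    return 1.0 - max(0.0, 1.0 - gaps * w) ** power


if __name__ == "__main__":
    for circ in (False, True):
        est, se = scan_prob_mc(2, 5, 0.05, circular=circ)
        print(circ, est, se, p2_exact(5, 0.05, circular=circ))
```

An estimate that differs from the closed-form value by more than a few standard errors would point to an error in either the formula or the simulation. The fixed seed keeps runs reproducible.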

Original abstract

The continuous linear $P(k; N, w)$ and circular scan statistics $P_c(k; N, w)$ are fundamental tools in probability and spatial statistics, frequently used to detect clustering in uniform data. Let $X_1, X_2, \dots, X_N$ be independently and uniformly distributed random variables on a unit interval or unit ring. The exact distribution of these scan statistics relies on the minimum window width required to capture exactly $k$ points. Furthermore, the survival function $1 - P_c(k; N, w)$ directly corresponds to the geometric probability that if $N$ arcs of length $1 - w$ are uniformly and randomly placed on a unit circle, every point on the circle is covered at least $N + 1 - k$ times. Historically, evaluating the exact cumulative distribution functions, $P(k; N, w)$ and $P_c(k; N, w)$, relies heavily on complex recursive approximations. In this paper, we bypass these traditional recursive methods to derive direct, generalized closed-form expressions for some linear and circular continuous scan statistics. Specifically, we present the exact analytical solutions for $P_c(N - 1; N, w)$, $P_c(3; N, w)$, and $P(3; N, w)$ for arbitrary values of $N$ and window width $w$. These newly derived closed-form expressions not only provide exact baseline distributions for extreme spacings but also significantly simplify computational complexity compared to existing iterative approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript derives exact closed-form expressions for the continuous linear scan statistic P(3; N, w) and the circular scan statistics P_c(N-1; N, w) and P_c(3; N, w) for arbitrary N and window width w, obtained via direct combinatorial counting of spacing configurations among N uniform order statistics on the unit interval or circle.

Significance. If the derivations hold, these formulas supply exact, non-recursive, parameter-free expressions that match the underlying geometric probability definitions exactly. This provides verifiable baselines for extreme spacings in scan statistics, simplifies computation relative to recursive methods, and strengthens the toolkit for spatial statistics and probability applications. The direct combinatorial approach without approximation or hidden dependencies is a clear strength.

minor comments (2)
  1. [Abstract] The abstract states the results but could briefly indicate the combinatorial spacing-counting technique to orient readers before the detailed derivations.
  2. [Introduction] Notation for the survival function 1 - P_c(k; N, w) and its coverage interpretation is introduced but would benefit from an explicit cross-reference to the relevant equation when first used.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive review, recognition of the combinatorial approach, and recommendation to accept the manuscript. We appreciate the assessment that the closed-form expressions provide exact baselines and simplify computation.

Circularity Check

0 steps flagged

No significant circularity; the derivations are direct and combinatorial.

The paper derives P_c(N-1; N, w), P_c(3; N, w), and P(3; N, w) via direct combinatorial counting of spacing configurations on the interval and circle. These expressions match the geometric probability definitions exactly through finite sums over uniform order statistics, without recursion, approximation, fitted parameters, or load-bearing self-citations. The approach is self-contained against the stated uniform-point definitions and requires no external uniqueness theorems or prior results from the same authors to close the derivation chain.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on standard assumptions of i.i.d. uniform points and the equivalence to coverage probabilities on the circle, with no free parameters or invented entities introduced.

axioms (1)
  • Domain assumption: X_1, ..., X_N are independently and uniformly distributed on the unit interval or unit circle.
    Stated directly in the abstract as the setup for the scan statistics.

pith-pipeline@v0.9.0 · 5605 in / 1138 out tokens · 45687 ms · 2026-05-07T15:15:30.207759+00:00 · methodology
