pith. machine review for the scientific record. sign in

arxiv: 2605.13771 · v1 · submitted 2026-05-13 · 💻 cs.CC · math.PR

Recognition: 2 theorem links

· Lean Theorem

Upper Bounds for Symmetric Approximate Bounded Indistinguishability

Christopher Williamson

Pith reviewed 2026-05-14 17:41 UTC · model grok-4.3

classification 💻 cs.CC math.PR
keywords symmetric distributionsbounded indistinguishabilitystatistical distancehypergeometric smoothingHahn polynomialsmarginalsapproximate indistinguishability
0
0 comments X

The pith

For symmetric distributions, exact indistinguishability on c1 n marginals forces the c2 n marginals to be exponentially close when c2 exceeds c1.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes new upper bounds showing that symmetric distributions over n-bit strings cannot maintain a constant separation in their larger marginals once the smaller ones are fixed to be identical. Using hypergeometric smoothing to average over random subsets and Hahn polynomial expansions to control the resulting distance, the bounds prove that (c1 n, 0)-wise indistinguishability implies exponential closeness for any larger c2 n marginals. This rules out the existence of constant-factor gaps that prior work left open. The same smoothing approach also yields superpolynomial closeness even when the smaller marginals are allowed a tiny positive distance up to exp(-ω(sqrt(n log n))), and it improves existing results for sublinear k and for t approaching n.

Core claim

A pair of symmetric distributions over {0,1}^n is (k,δ)-wise indistinguishable if every size-k marginal differs by statistical distance at most δ. The central result is that when δ equals zero and k equals c1 n for constants 0 < c1 < c2 < 1, the c2 n-wise marginals must have statistical distance at most 2^{-Ω(n)}. The same technique shows that even if the c1 n-wise marginals are permitted distance δ ≤ exp(-ω(sqrt(n log n))), the c2 n-wise marginals remain superpolynomially close, with analogous improvements holding when k is sublinear or t/n tends to one.

What carries the argument

Hypergeometric smoothing, which replaces each distribution with its average over hypergeometric random subsets of size t, followed by expansion in the orthogonal Hahn polynomial basis to bound the total variation distance between the smoothed versions.

If this is right

  • The c2 n-marginals must be exponentially close whenever the c1 n-marginals are identical.
  • Even allowing δ = exp(-ω(sqrt(n log n))) distance in the c1 n-marginals still forces superpolynomial closeness in the c2 n-marginals.
  • The bounds improve prior results uniformly across sublinear k and across the regime where t/n approaches one.
  • No pair of symmetric distributions can achieve a constant statistical distance gap between the two scales.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The rigidity imposed by symmetry may extend to average-case relaxations where distributions are close to symmetric.
  • The smoothing-plus-orthogonal-polynomial method could bound marginal distances in other high-dimensional product spaces beyond the hypercube.
  • These scale-separation results suggest concrete limits on the design of symmetric pseudorandom objects that must fool tests at multiple observation sizes.
  • Similar smoothing ideas might yield testable predictions for how quickly higher-order moments are constrained once lower-order moments are fixed.

Load-bearing premise

The distributions must be symmetric under arbitrary coordinate permutations so that the hypergeometric smoothing step preserves the necessary invariance.

What would settle it

Exhibit a pair of symmetric distributions over {0,1}^n such that the c1 n-marginals are identical yet the c2 n-marginals differ by a fixed positive constant for some fixed c1 < c2.

read the original abstract

A pair of probability distributions over $\{0,1\}^n$ is said to be $(k,\delta)$-wise indistinguishable if all of the size $k$ marginals are within statistical distance at most $\delta$. Previous works introduced this concept and study when and how well one can distinguish between such a pair of symmetric distributions by observing $t$ bits. We use a simple hypergeometric smoothing approach and Hahn polynomials to obtain new upper bounds that apply across a wider range of parameters and improve previously available bounds in several regimes. In particular, prior works left open the basic question of whether there exist constants $0<c_1<c_2<1$ and a pair of $(c_1n,0)$-wise indistinguishable distributions such that the $c_2n$-wise marginals have statistical distance $\Omega(1)$. One application of our new bounds is to rule this out for all $c_1,c_2$ and to show that the $c_2n$-wise marginals must in fact be exponentially close. Another application in this setting is to show that the $c_2n$-wise marginals must be super-polynomially close even if the $c_1n$-wise marginals are allowed to have statistical distance $\delta$ for any $\delta\leq\exp\left({-\omega(\sqrt{n\log{n}})}\right)$. Our bounds also yield new results in other regimes, for example when $k$ is sublinear or when $t/n$ tends to 1.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces upper bounds on the statistical distance between t-wise marginals of symmetric (k, δ)-wise indistinguishable pairs of distributions over {0,1}^n. It employs hypergeometric smoothing combined with approximation properties of Hahn polynomials to derive these bounds, which apply across a range of parameters including sublinear k and t/n approaching 1. A key result resolves an open question by showing that for any constants 0 < c1 < c2 < 1, if two symmetric distributions are (c1 n, 0)-wise indistinguishable, then their c2 n-wise marginals must be exponentially close in statistical distance; a related result shows super-polynomial closeness even when the c1 n-wise marginals have small but positive δ up to exp(-ω(√(n log n))).

Significance. If the derivations hold, the results provide a clean analytic strengthening of prior work on approximate bounded indistinguishability for symmetric distributions, ruling out constant-gap separations between linear-wise marginals and yielding tighter exponential and super-polynomial decay rates. The approach leverages standard orthogonal polynomial machinery without introducing free parameters or fitted constants, and the symmetry hypothesis is used explicitly to enable the smoothing step. These bounds improve on existing results in multiple regimes and have potential implications for pseudorandomness and property testing.

major comments (2)
  1. [§3.1] §3.1, the hypergeometric smoothing step leading to Eq. (8): the error term introduced by the smoothing appears to be controlled by a binomial coefficient ratio whose dependence on the ratio k/n is not made fully explicit; this leaves open whether the claimed exponential decay rate for the c2 n case remains uniform when c1 and c2 are close.
  2. [Theorem 4.3] Theorem 4.3: the super-polynomial closeness bound when δ ≤ exp(-ω(√(n log n))) is stated without an explicit dependence of the exponent on the precise form of δ; verifying that the Hahn-polynomial tail still yields super-polynomial decay requires checking the interaction between the degree-c1 n truncation and the δ-perturbation term.
minor comments (2)
  1. [§2.2] The notation for the Hahn polynomials in §2.2 could be clarified by explicitly recalling the three-term recurrence or the explicit hypergeometric representation used in the proofs.
  2. [Theorem 1.1] In the statement of the main theorems, the dependence of the hidden constants on c1 and c2 should be stated more precisely (e.g., whether they are allowed to depend on the gap c2-c1).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading, positive assessment of the results, and recommendation for minor revision. We address each major comment below and will update the manuscript accordingly to improve clarity.

read point-by-point responses
  1. Referee: [§3.1] §3.1, the hypergeometric smoothing step leading to Eq. (8): the error term introduced by the smoothing appears to be controlled by a binomial coefficient ratio whose dependence on the ratio k/n is not made fully explicit; this leaves open whether the claimed exponential decay rate for the c2 n case remains uniform when c1 and c2 are close.

    Authors: We thank the referee for highlighting this point. The error term from hypergeometric smoothing is indeed controlled by a ratio of binomial coefficients, which we can bound explicitly as a function of k/n (specifically, the ratio is at most (O(1))^{k} times a factor depending on the hypergeometric parameters, reducing to a constant when k = c1 n for fixed c1). For the c2 n case, the resulting statistical distance bound is exponential in n with a rate that depends on the gap c2 - c1 (arising from the Hahn polynomial oscillation properties and the smoothing window); it takes the form exp(-Ω(n · f(c2 - c1))) where f(·) > 0 for any positive gap. This is exponential for any fixed c1 < c2, though the hidden constant vanishes as c1 approaches c2 (as must occur). We will revise §3.1 to state this dependence explicitly in the theorem statement and proof sketch. revision: yes

  2. Referee: [Theorem 4.3] Theorem 4.3: the super-polynomial closeness bound when δ ≤ exp(-ω(√(n log n))) is stated without an explicit dependence of the exponent on the precise form of δ; verifying that the Hahn-polynomial tail still yields super-polynomial decay requires checking the interaction between the degree-c1 n truncation and the δ-perturbation term.

    Authors: We agree that an explicit dependence would be helpful. The super-polynomial decay is obtained by combining the Hahn polynomial tail bound (which for truncation degree d = c1 n yields an error decaying as exp(-ω(√(d log(n/d)))) when the low-degree coefficients are perturbed by at most δ) with the assumption δ ≤ exp(-ω(√(n log n))). The interaction between truncation and perturbation is controlled because the δ-perturbation only affects coefficients up to degree c1 n, while the tail bound applies beyond that degree; when δ is smaller than the tail, it is absorbed without degrading the super-polynomial rate. We will revise the statement of Theorem 4.3 and the surrounding discussion to include an explicit form of the resulting closeness (e.g., n^{-ω(1)}) in terms of the ω function appearing in δ. revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation is self-contained from standard tools

full rationale

The paper applies hypergeometric smoothing and Hahn polynomial approximation to symmetric distributions to derive upper bounds on statistical distance for higher-order marginals. These are standard orthogonal polynomial techniques for the hypergeometric measure, with symmetry stated explicitly as the enabling hypothesis. No equations reduce by construction to fitted inputs, no predictions are statistically forced from subsets of the same data, and no load-bearing steps rely on self-citations whose validity is internal to the paper. The central result (exponential closeness of c2 n-wise marginals for (c1 n, 0)-wise indistinguishable symmetric pairs) follows from degree-k polynomial bounds when k exceeds the indistinguishability parameter, without renaming known results or smuggling ansatzes via prior work by the same authors. The derivation chain is independent of the target claim.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard facts from orthogonal polynomial theory and basic probability; no new free parameters, ad-hoc axioms, or invented entities are introduced.

axioms (1)
  • standard math Hahn polynomials form an orthogonal basis for functions on the discrete hypergeometric distribution
    Invoked to expand and bound the difference between marginals after smoothing.

pith-pipeline@v0.9.0 · 5566 in / 1341 out tokens · 34946 ms · 2026-05-14T17:41:39.177437+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

Reference graph

Works this paper leans on

9 extracted references · 4 canonical work pages

  1. [1]

    39th International Symposium on Theoretical Aspects of Computer Science (STACS 2022) , pages =

    Williamson, Christopher , title =. 39th International Symposium on Theoretical Aspects of Computer Science (STACS 2022) , pages =. 2022 , isbn =. doi:10.4230/LIPIcs.STACS.2022.59 , url =

  2. [2]

    Advances in Cryptology -- CRYPTO 2016: 36th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 14--18, 2016, Proceedings, Part III , pages =

    Bogdanov, Andrej and Ishai, Yuval and Viola, Emanuele and Williamson, Christopher , title =. Advances in Cryptology -- CRYPTO 2016: 36th Annual International Cryptology Conference, Santa Barbara, CA, USA, August 14--18, 2016, Proceedings, Part III , pages =. 2016 , doi =

  3. [3]

    Advances in Cryptology -- EUROCRYPT '94 , pages =

    Naor, Moni and Shamir, Adi , title =. Advances in Cryptology -- EUROCRYPT '94 , pages =. 1994 , doi =

  4. [4]

    Combinatorics, Probability and Computing , volume =

    Krause, Matthias and Simon, Hans Ulrich , title =. Combinatorics, Probability and Computing , volume =. 2003 , doi =

  5. [5]

    44th International Colloquium on Automata, Languages, and Programming (ICALP 2017) , pages =

    Bogdanov, Andrej and Williamson, Christopher , title =. 44th International Colloquium on Automata, Languages, and Programming (ICALP 2017) , pages =. 2017 , isbn =. doi:10.4230/LIPIcs.ICALP.2017.53 , url =

  6. [6]

    ACM Transactions on Computation Theory , volume =

    Huang, Xuangui and Viola, Emanuele , title =. ACM Transactions on Computation Theory , volume =. 2022 , month = feb, doi =

  7. [7]

    and Thaler, Justin and Williamson, Christopher , title =

    Bogdanov, Andrej and Mande, Nikhil S. and Thaler, Justin and Williamson, Christopher , title =. Approximation, Randomization, and Combinatorial Optimization. Algorithms and Techniques (APPROX/RANDOM 2019) , pages =. 2019 , isbn =. doi:10.4230/LIPIcs.APPROX-RANDOM.2019.71 , url =

  8. [8]

    and Swarttouw, Ren

    Koekoek, Roelof and Lesky, Peter A. and Swarttouw, Ren. 2010 , pages =. doi:10.1007/978-3-642-05014-5 , url =

  9. [9]

    1973 , pages =

    Delsarte, Philippe , title =. 1973 , pages =