A homogenization principle for total variation
Pith reviewed 2026-05-13 16:52 UTC · model grok-4.3
The pith
The total variation between two product distributions is bounded below by a universal constant times the total variation between the n-fold products of their averaged marginals.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
For arbitrary probability measures P_1,...,P_n and Q_1,...,Q_n on a measurable space, the total variation between the product measures ⊗ P_i and ⊗ Q_i is at least c times the total variation between the n-fold products of the averages bar P and bar Q, where c > 0 is a universal constant. The proof has three steps. First, each pair (P_i, Q_i) is embedded into a positive measure η_i on R, and a functional T is defined so that the total variation between the products equals T(η_1 * ⋯ * η_n). Second, a convolution inequality shows that T(η_1 * ⋯ * η_n) is at least c times T of the n-fold convolution of the average bar η. Third, a lifting argument shows that this last quantity is at least the total variation between the averaged products.
What carries the argument
A one-dimensional embedding of probability measure pairs into positive measures on R, together with a functional T over measures on R that realizes the total variation of product measures exactly via convolution of the embedded measures.
Load-bearing premise
The embedding of the probability measures into positive measures on the real line allows the total variation of the products to be exactly represented by the functional T applied to their convolutions.
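The premise can be probed numerically on a finite space. The sketch below is not the paper's construction: it assumes strictly positive probability vectors and an embedding that places an atom of mass sqrt(p q) at x = ½ log(p/q), chosen only because it is consistent with the normalization ∫e^{±x}η(dx)=1 and the formula T(η)=½∫|e^x−e^{-x}|η(dx) quoted in the theorem-link notes on this page. The helper names (`embed`, `convolve`, `T`) are hypothetical.

```python
import math
from itertools import product


def tv(p, q):
    """Total variation distance between two pmfs on the same finite set."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def product_pmf(pmfs):
    """Pmf of the product measure, listed over the full product space."""
    index_sets = [range(len(p)) for p in pmfs]
    return [math.prod(p[k] for p, k in zip(pmfs, idx))
            for idx in product(*index_sets)]


def embed(p, q):
    """ASSUMED embedding (not taken from the paper): an atom of mass
    sqrt(p*q) at x = 0.5*log(p/q) for each point; requires p, q > 0."""
    eta = {}
    for a, b in zip(p, q):
        x = 0.5 * math.log(a / b)
        eta[x] = eta.get(x, 0.0) + math.sqrt(a * b)
    return eta


def convolve(eta1, eta2):
    """Convolution of two finitely supported measures stored as {atom: mass}."""
    out = {}
    for x1, m1 in eta1.items():
        for x2, m2 in eta2.items():
            out[x1 + x2] = out.get(x1 + x2, 0.0) + m1 * m2
    return out


def T(eta):
    """T(eta) = 1/2 * integral of |e^x - e^{-x}| against eta."""
    return 0.5 * sum(abs(math.exp(x) - math.exp(-x)) * m
                     for x, m in eta.items())


# Three heterogeneous pairs of strictly positive pmfs on a 3-point space.
Ps = [[0.5, 0.3, 0.2], [0.7, 0.2, 0.1], [0.25, 0.25, 0.5]]
Qs = [[0.2, 0.5, 0.3], [0.6, 0.1, 0.3], [0.4, 0.4, 0.2]]

tv_products = tv(product_pmf(Ps), product_pmf(Qs))

eta_conv = embed(Ps[0], Qs[0])
for p, q in zip(Ps[1:], Qs[1:]):
    eta_conv = convolve(eta_conv, embed(p, q))

# The two values should agree up to floating-point rounding.
print(tv_products, T(eta_conv))
```

Under this assumed embedding the agreement is exact in principle, because each atom of the convolution contributes |∏p_i − ∏q_i| for one point of the product space. Measures with mutually singular parts are outside the scope of this sketch.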
What would settle it
A family of measures P_i and Q_i for which the ratio TV(⊗ P_i, ⊗ Q_i) / TV(bar P^{⊗n}, bar Q^{⊗n}) tends to zero as n increases would falsify the existence of a universal c.
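A small numerical probe of this ratio can be sketched as follows. It is illustrative only: it restricts to Bernoulli marginals, enumerates {0,1}^n by brute force, and samples parameters at random, so it can neither prove nor refute a universal constant. The function names and the random search are assumptions of this sketch, not part of the paper.

```python
import math
import random
from itertools import product


def bernoulli_product_tv(ps, qs):
    """TV between the products of Bernoulli(ps[i]) and of Bernoulli(qs[i]),
    computed by enumerating all points of {0,1}^n."""
    total = 0.0
    for bits in product((0, 1), repeat=len(ps)):
        mass_p = math.prod(p if b else 1.0 - p for p, b in zip(ps, bits))
        mass_q = math.prod(q if b else 1.0 - q for q, b in zip(qs, bits))
        total += abs(mass_p - mass_q)
    return 0.5 * total


def homogenization_ratio(ps, qs):
    """Ratio of heterogeneous product TV to homogenized product TV."""
    n = len(ps)
    p_bar, q_bar = sum(ps) / n, sum(qs) / n
    heterogeneous = bernoulli_product_tv(ps, qs)
    homogenized = bernoulli_product_tv([p_bar] * n, [q_bar] * n)
    if homogenized == 0.0:
        # Averaged products coincide, so the inequality holds trivially.
        return math.inf
    return heterogeneous / homogenized


def probe(n, trials, seed=0):
    """Smallest ratio seen over randomly sampled Bernoulli parameters."""
    rng = random.Random(seed)
    worst = math.inf
    for _ in range(trials):
        ps = [rng.random() for _ in range(n)]
        qs = [rng.random() for _ in range(n)]
        worst = min(worst, homogenization_ratio(ps, qs))
    return worst


for n in (2, 4, 6):
    print(n, probe(n, trials=500))
```

A falsifying family would show up here as a smallest ratio that keeps shrinking toward zero as n grows; a ratio that stays bounded away from zero in such a search is compatible with the claim but is not evidence for it beyond the sampled cases.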
Original abstract
We prove an inequality comparing the variational distance between pairs of product probability measures to its homogenized counterpart. If $P_1,\ldots,P_n,Q_1,\ldots,Q_n$ are arbitrary probability measures on a measurable space and $\bar P:=\frac1n\sum_{i=1}^n P_i$, $\bar Q:=\frac1n\sum_{i=1}^n Q_i$, we show that $$TV\!\left(\bigotimes_{i=1}^n P_i, \bigotimes_{i=1}^n Q_i\right) \;\ge\; c\,TV(\bar P^{\otimes n},\bar Q^{\otimes n}),$$ where $c>0$ is a universal constant. The proof is based on a one-dimensional representation of total variation between products. We embed pairs of probability distributions $P_i,Q_i$ into positive measures $\eta_i$ on $\mathbb{R}$. We then define a functional $T$ over measures on $\mathbb{R}$ that realizes TV over products via convolution: $TV\!\left(\bigotimes_{i=1}^n P_i, \bigotimes_{i=1}^n Q_i\right)=T(\eta_1*\cdots *\eta_n)$. Our main analytic discovery is that for the relevant class of positive measures $\eta_i$, the convolution inequality $T(\eta_1*\cdots*\eta_n) \ge c\,T\!\left(\bar\eta^{*n}\right)$ holds, where $\bar\eta=\frac1n\sum_{i=1}^n \eta_i$. Finally, a higher-dimensional lifting argument shows that $T\!\left(\bar\eta^{*n}\right)\ge TV(\bar P^{\otimes n},\bar Q^{\otimes n})$. To our knowledge, both the exact representation and the convolution inequality are new.
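The three steps named in the abstract chain together as follows. This display only restates the abstract's own relations in one place; the step labels are descriptive and not taken from the paper.

```latex
\begin{align*}
TV\!\left(\bigotimes_{i=1}^n P_i,\ \bigotimes_{i=1}^n Q_i\right)
  &= T(\eta_1 * \cdots * \eta_n)
  && \text{(exact representation)} \\
  &\ge c\, T\!\left(\bar\eta^{*n}\right)
  && \text{(convolution inequality, } \bar\eta = \tfrac1n \textstyle\sum_{i=1}^n \eta_i \text{)} \\
  &\ge c\, TV\!\left(\bar P^{\otimes n},\ \bar Q^{\otimes n}\right)
  && \text{(lifting)}
\end{align*}
```

The universal constant c enters only in the middle step; the first step is an equality and the last is an inequality with constant 1.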
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proves a homogenization inequality for total variation: for arbitrary probability measures P_1,...,P_n and Q_1,...,Q_n on a measurable space, with averages bar P and bar Q, one has TV(⊗_{i=1}^n P_i, ⊗_{i=1}^n Q_i) ≥ c TV(bar P^{⊗n}, bar Q^{⊗n}) for a universal constant c>0. The argument proceeds by embedding each pair (P_i,Q_i) into positive measures η_i on R, introducing a functional T on positive measures on R such that TV of the products equals T of the convolution η_1 * ⋯ * η_n, establishing the convolution inequality T(η_1*⋯*η_n) ≥ c T(bar η^{*n}), and finally lifting T(bar η^{*n}) ≥ TV(bar P^{⊗n}, bar Q^{⊗n}). Both the representation and the convolution inequality are presented as new.
Significance. If the central inequality holds, the result supplies a dimension-free lower bound relating heterogeneous product total variation to its homogenized counterpart. This could find use in concentration, empirical-process theory, and statistical testing where one wishes to reduce to i.i.d. cases. The one-dimensional embedding and the associated convolution inequality for T constitute new analytic machinery that may be of independent interest beyond the present application.
major comments (3)
- [Section 2 (representation via embedding)] The representation equality TV(⊗ P_i, ⊗ Q_i) = T(η_1 * ⋯ * η_n) is load-bearing; it must hold exactly for every collection of probability measures, including those with atoms or mutually singular components. The construction of the embedding map and the functional T (Section 2) needs an explicit verification that the equality is preserved under convolution for all such pairs, not merely for a dense subclass.
- [Section 3 (convolution inequality)] The convolution inequality T(η_1 * ⋯ * η_n) ≥ c T(bar η^{*n}) is asserted for the image of the embedding map. It is unclear whether the class of admissible η_i is closed under averaging and convolution or whether the inequality requires additional regularity (e.g., absolute continuity or moment bounds) that the embedding does not automatically guarantee (Section 3, main analytic step).
- [Section 4 (lifting)] The final lifting step T(bar η^{*n}) ≥ TV(bar P^{⊗n}, bar Q^{⊗n}) must recover the total-variation distance of the averaged measures without loss of the universal constant c. The argument should be checked for cases in which the averaged measures bar P and bar Q have different supports from the original collection (Section 4, lifting argument).
minor comments (2)
- [Introduction] The abstract states that both the representation and the convolution inequality are new; a short comparison paragraph with existing one-dimensional representations of total variation (e.g., via cumulative distribution functions) would help readers assess novelty.
- [Section 3] Notation for the averaged measure bar η is introduced after the convolution inequality is stated; moving the definition earlier would improve readability.
Simulated Author's Rebuttal
We thank the referee for the careful and constructive report. The comments identify points where additional explicit verification would strengthen the manuscript. We address each major comment below and will revise accordingly.
Point-by-point responses
- Referee: [Section 2 (representation via embedding)] The representation equality TV(⊗ P_i, ⊗ Q_i) = T(η_1 * ⋯ * η_n) is load-bearing; it must hold exactly for every collection of probability measures, including those with atoms or mutually singular components. The construction of the embedding map and the functional T (Section 2) needs an explicit verification that the equality is preserved under convolution for all such pairs, not merely for a dense subclass.
Authors: We agree that the representation must be verified directly for general measures. The embedding maps each pair (P_i, Q_i) to a positive measure η_i on R by integrating against a fixed separating function; T is defined so that it recovers the total-variation functional on the product space. Because total variation is a supremum over bounded measurable functions and convolution corresponds exactly to the product measure, the equality holds by direct substitution for arbitrary measures, including atoms and mutually singular parts. To make this fully transparent we will insert a short lemma in Section 2 that carries out the verification explicitly on atomic measures and on the singular-continuous decomposition, confirming that no approximation step is used. revision: yes
- Referee: [Section 3 (convolution inequality)] The convolution inequality T(η_1 * ⋯ * η_n) ≥ c T(bar η^{*n}) is asserted for the image of the embedding map. It is unclear whether the class of admissible η_i is closed under averaging and convolution or whether the inequality requires additional regularity (e.g., absolute continuity or moment bounds) that the embedding does not automatically guarantee (Section 3, main analytic step).
Authors: The image class is closed under averaging and convolution: each η_i satisfies the normalization ∫e^{±x}η_i(dx)=1, which averaging preserves by linearity and convolution preserves because exponential moments multiply, and the average bar η is again the image of the averaged pair (bar P, bar Q). The proof of the convolution inequality in Section 3 relies only on positivity of the measures and the specific variational definition of T; it does not invoke absolute continuity or moment conditions. The argument proceeds by reducing the inequality to a one-dimensional convolution estimate that holds for all positive finite measures. We will revise the opening paragraph of Section 3 to state the precise class explicitly and add a short remark confirming that the analytic step applies verbatim to the embedded measures without extra regularity assumptions. revision: partial
- Referee: [Section 4 (lifting)] The final lifting step T(bar η^{*n}) ≥ TV(bar P^{⊗n}, bar Q^{⊗n}) must recover the total-variation distance of the averaged measures without loss of the universal constant c. The argument should be checked for cases in which the averaged measures bar P and bar Q have different supports from the original collection (Section 4, lifting argument).
Authors: The lifting applies the identical embedding to the averaged measures bar P and bar Q, so T(bar η^{*n}) is defined exactly as the total variation of the n-fold product of the embedded averages. Because the embedding is measure-preserving for total variation and the constant c originates from the convolution step (which is independent of support), no loss occurs. Supports of bar P and bar Q are contained in the union of the original supports, but total variation is insensitive to this inclusion. We will add a brief paragraph at the end of Section 4 that records this support relation and verifies that the inequality remains valid with the same c. revision: yes
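The closure question raised in the Section 3 exchange above can be checked by a short computation. The sketch below assumes the admissible class is characterized by the normalization ∫e^{±x}η(dx)=1 quoted in the theorem-link notes on this page; whether the paper imposes further conditions is not stated here.

```latex
\begin{align*}
\int e^{\pm x}\,(\eta_1 * \eta_2)(dx)
  &= \iint e^{\pm (x+y)}\,\eta_1(dx)\,\eta_2(dy)
   = \left(\int e^{\pm x}\,\eta_1(dx)\right)\!\left(\int e^{\pm y}\,\eta_2(dy)\right) = 1, \\
\int e^{\pm x}\,\bar\eta(dx)
  &= \frac1n \sum_{i=1}^n \int e^{\pm x}\,\eta_i(dx) = 1, \\
\eta(\mathbb{R})
  &\le \int \cosh x \;\eta(dx)
   = \frac12\left(\int e^{x}\,\eta(dx) + \int e^{-x}\,\eta(dx)\right) = 1.
\end{align*}
```

The first two lines show that the normalization survives convolution and averaging. The third line shows that, under the same normalization, the total mass of η is at most 1, with equality only when η is the unit point mass at 0.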
Circularity Check
No significant circularity; novel embedding and convolution inequality are independent of inputs
full rationale
The derivation constructs an embedding of arbitrary pairs (P_i, Q_i) into positive measures η_i on R, defines a functional T such that TV(⊗ P_i, ⊗ Q_i) = T(η_1 * ⋯ * η_n) holds by the chosen representation, proves the new convolution inequality T(η_1 * ⋯ * η_n) ≥ c T(bar η^{*n}) for the induced class, and applies a lifting step T(bar η^{*n}) ≥ TV(bar P^{⊗n}, bar Q^{⊗n}). None of these steps reduces the target inequality to a fitted parameter, a self-citation chain, or a definitional tautology: the representation equality and the convolution bound are established as new analytic results, not by constructing outputs to equal inputs. The argument is therefore self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- [standard math] Total variation distance and convolution are well-defined for probability measures and positive measures on R
invented entities (2)
- Embedding map from pairs (P_i, Q_i) to positive measures η_i on R (no independent evidence)
- Functional T on positive measures on R (no independent evidence)
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean: washburn_uniqueness_aczel; dAlembert_to_ODE_general (echoes)
  echoes: this paper passage has the same mathematical shape or conceptual pattern as the Recognition theorem, but is not a direct formal dependency.
  Paper passage: T(η)=½∫|e^x−e^{-x}|η(dx) with ∫e^{±x}η(dx)=1; TV(⊗ P_i, ⊗ Q_i)=T(η_1 * ⋯ * η_n)
- IndisputableMonolith/Foundation/AlphaCoordinateFixation.lean: costAlphaLog_high_calibrated_iff; J_uniquely_calibrated_via_higher_derivative (echoes)
  Paper passage: admissible measures closed under convolution; mass-defect α and multilinear Ψ representation
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.