pith. machine review for the scientific record. sign in

arxiv: 2605.03839 · v1 · submitted 2026-05-05 · 💻 cs.DS · cs.LG· math.PR

Recognition: unknown

On Computing Total Variation Distance Between Mixtures of Product Distributions

Authors on Pith no claims yet

Pith reviewed 2026-05-07 14:02 UTC · model grok-4.3

classification 💻 cs.DS cs.LGmath.PR
keywords total variation distancemixture of productsproduct distributionsapproximation algorithmsBoolean subcubes#P-hardnesscomputational complexity
0
0 comments X

The pith

Mixtures of k product distributions over [q]^n have their total variation distance approximable to multiplicative (1 ± ε) by a randomized algorithm in poly((nq)^{k1+k2}, 1/ε) time.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops computational methods to approximate or exactly compute the total variation distance between two mixtures of product distributions over a discrete n-dimensional domain. A randomized algorithm achieves multiplicative approximation in time polynomial in (nq) raised to the total number of components across both mixtures. This is useful because total variation distance quantifies how distinguishable two probabilistic models are, and product mixtures are standard for capturing dependencies in high-dimensional discrete data. For the special case of mixtures of Boolean subcubes, the authors provide a deterministic exact algorithm running in time polynomial in n but exponential in the number of components, together with a proof that exact computation is #P-hard once the total number of components grows linearly with the dimension.

Core claim

Given two mixtures P and Q consisting of k1 and k2 product distributions over [q]^n respectively, there exists a randomized algorithm that approximates d_TV(P, Q) to within multiplicative factor (1 ± ε) in time poly((nq)^{k1+k2}, 1/ε). For mixtures of Boolean subcubes over {0,1}^n, the total variation distance can be computed exactly by a deterministic algorithm in time poly(n, 2^{O(k1+k2)}), yet the problem is #P-hard whenever k1 + k2 = Θ(n).

What carries the argument

The randomized approximation procedure together with the deterministic exact procedure for subcube mixtures, both relying on explicit access to the individual product components of the input mixtures.

If this is right

  • When the total number of components is small relative to the dimension, the distance between two such mixtures becomes efficiently computable even for large n and q.
  • Exact numerical comparison is feasible for any fixed number of Boolean subcubes without incurring approximation error.
  • No polynomial-time exact algorithm for the subcube case can exist when the number of components scales linearly with n, unless #P collapses to P.
  • The approximation result immediately yields efficient procedures for deciding whether two mixture models are close or far in total variation.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same algorithmic template could be tested on other statistical distances such as Hellinger or chi-squared between the same class of mixtures.
  • In applications that fit mixture models to data, the procedure offers a way to compare two fitted models without generating large synthetic samples.
  • The exponential dependence on k in the subcube exact algorithm suggests a natural parameterization by the number of components that may admit fixed-parameter tractable extensions to other discrete domains.

Load-bearing premise

The mixtures are supplied as explicit lists of their k1 and k2 component product distributions so that each component can be enumerated or accessed directly.

What would settle it

A concrete mixture instance with small fixed k1 and k2 for which the randomized algorithm's output deviates from the true d_TV(P, Q) by more than a (1 + ε) factor, or an instance of subcube mixtures with k1 + k2 = Θ(n) that admits an exact polynomial-time solution.

read the original abstract

We study the problem of approximating the total variation distance between two mixtures of product distributions over an $n$-dimensional discrete domain. Given two mixtures $\mathbb{P}$ and $\mathbb{Q}$ with $k_1$ and $k_2$ product distributions over $[q]^n$, respectively, we give a randomized algorithm that approximates $d_{\mathrm{TV}}\left({\mathbb{P}},{\mathbb{Q}}\right)$ within a multiplicative error of $(1\pm \varepsilon)$ in time $\mathrm{poly}((nq)^{k_1+k_2},1/\varepsilon)$. We also study the special case of mixtures of Boolean subcubes over $\{0,1\}^n$. For this class, we give a deterministic algorithm that exactly computes the total variation distance in time $\mathrm{poly}(n,2^{O(k_1+k_2)})$, and show that exact computation is $\#\mathsf{P}$-hard when $k_1+k_2=\Theta(n)$.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript studies the computational problem of finding the total variation distance between two mixtures of product distributions over [q]^n, where the mixtures have k1 and k2 components respectively. It claims a randomized algorithm that returns a (1±ε)-multiplicative approximation to d_TV(P,Q) in time poly((nq)^{k1+k2},1/ε). For the special case of mixtures of Boolean subcubes over {0,1}^n it claims a deterministic exact algorithm running in poly(n,2^{O(k1+k2)}) time together with a #P-hardness proof for exact computation when k1+k2=Θ(n).

Significance. If the algorithmic claims and hardness reduction hold, the work supplies the first non-trivial positive results for TV distance between structured mixtures when the number of components is small, a regime that arises in many learning applications. The deterministic exact algorithm for subcubes and the matching hardness result together give a clean complexity picture for that subclass. The results are grounded in standard enumeration and dynamic-programming techniques over low-dimensional supports and therefore appear technically plausible.

minor comments (3)
  1. The input representation of the mixtures is never stated explicitly. The claimed runtimes presuppose that each of the k1+k2 product distributions is given by an explicit description (e.g., its probability vector or parameter list) whose size is polynomial in nq; this assumption should be stated in the preliminaries or introduction and the input size should be related to the running-time bound.
  2. The abstract and introduction would benefit from a short comparison paragraph placing the new runtimes against the naïve O((nq)^{k1+k2}) enumeration baseline and against any prior work on TV distance for single product distributions.
  3. Notation for the mixtures P and Q is introduced in the abstract but never formally defined in the body; a short “Notation and Preliminaries” subsection would remove ambiguity between the mixture measure and its component distributions.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive assessment of our work, recognition of its significance as the first non-trivial results in this regime, and recommendation for minor revision. We appreciate the note that the claims appear technically plausible based on standard techniques.

Circularity Check

0 steps flagged

No significant circularity; algorithmic claims are self-contained

full rationale

The paper states explicit runtime bounds for a randomized multiplicative approximation algorithm and a deterministic exact algorithm for Boolean subcubes, along with a #P-hardness result. These are standard complexity-theoretic claims supported by algorithmic constructions and reductions that do not reduce by definition or self-citation to the target quantities themselves. No self-definitional steps, fitted inputs renamed as predictions, or load-bearing self-citations appear in the abstract or described results; the derivation chain relies on enumeration over mixture components and dynamic programming, which are independent of the final distance value.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claims rest on standard definitions and properties of total variation distance, mixture distributions, and product distributions from probability theory, with no free parameters or invented entities introduced.

axioms (1)
  • standard math Standard properties of total variation distance and mixture distributions hold as defined in probability theory.
    The algorithm and hardness claims rely on these foundational definitions.

pith-pipeline@v0.9.0 · 5477 in / 1402 out tokens · 98042 ms · 2026-05-07T14:02:06.432444+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

164 extracted references · 7 canonical work pages

  1. [1]

    A Distributional-Lifting Theorem for

    Guy Blanc and Jane Lange and Carmen Strassle and Li. A Distributional-Lifting Theorem for. COLT , series =

  2. [2]

    Lifting Uniform Learners via Distributional Decomposition , booktitle =

    Guy Blanc and Jane Lange and Ali Malik and Li. Lifting Uniform Learners via Distributional Decomposition , booktitle =

  3. [3]

    ICML , series =

    Chinmaya Kausik and Kevin Tan and Ambuj Tewari , title =. ICML , series =

  4. [4]

    Nearly tight sample complexity bounds for learning

    Hassan Ashtiani and Shai Ben. Nearly tight sample complexity bounds for learning. NeurIPS , pages =

  5. [5]

    COLT , series =

    Jerry Li and Ludwig Schmidt , title =. COLT , series =

  6. [6]

    COLT , series =

    Prateek Jain and Sewoong Oh , title =. COLT , series =

  7. [7]

    Gordon and Erik Jahn and Bijan Mazaheri and Yuval Rabani and Leonard J

    Spencer L. Gordon and Erik Jahn and Bijan Mazaheri and Yuval Rabani and Leonard J. Schulman , title =. COLT , series =

  8. [8]

    Mazaheri and Yuval Rabani and Leonard J

    Spencer Gordon and Bijan H. Mazaheri and Yuval Rabani and Leonard J. Schulman , title =. COLT , series =

  9. [9]

    Servedio , title =

    Jon Feldman and Ryan O'Donnell and Rocco A. Servedio , title =

  10. [10]

    STOC , pages =

    Sitan Chen and Ankur Moitra , title =. STOC , pages =

  11. [11]

    Meel and Dimitrios Myrisiotis and Aduri Pavan and N

    Arnab Bhattacharyya and Sutanu Gayen and Kuldeep S. Meel and Dimitrios Myrisiotis and Aduri Pavan and N. V. Vinodchandran , title =. ICLR , publisher =

  12. [12]

    ICML , series =

    Jeongyeol Kwon and Yonathan Efroni and Constantine Caramanis and Shie Mannor , title =. ICML , series =

  13. [13]

    Constantinos Daskalakis and Nishanth Dikkala and Gautam Kamath , title =

  14. [14]

    Amirali Abdullah and Ravi Kumar and Andrew McGregor and Sergei Vassilvitskii and Suresh Venkatasubramanian , title =

  15. [15]

    Takafumi Kanamori and Taiji Suzuki and Masashi Sugiyama , title =

  16. [16]

    Computing Divergences between Discrete Decomposable Models , booktitle =

    Loong Kuan Lee and Nico Piatkowski and Fran. Computing Divergences between Discrete Decomposable Models , booktitle =

  17. [17]

    Sreejith Sreekumar and Ziv Goldfeld , title =. J. Mach. Learn. Res. , volume =

  18. [18]

    Rubenstein and Olivier Bousquet and Josip Djolonga and Carlos Riquelme and Ilya O

    Paul K. Rubenstein and Olivier Bousquet and Josip Djolonga and Carlos Riquelme and Ilya O. Tolstikhin , title =. NeurIPS , pages =

  19. [19]

    On the Estimation of

    Barnab. On the Estimation of

  20. [20]

    Liu, Jingcheng and Sinclair, Alistair and Srivastava, Piyush , journal =. The

  21. [21]

    arXiv preprint arXiv:2404.09674 , year=

    A circus of circuits: Connections between decision diagrams, circuits, and automata , author=. arXiv preprint arXiv:2404.09674 , year=

  22. [22]

    arXiv preprint arXiv:2506.23456 , year=

    Sampling and Identity-Testing Without Approximate Tensorization of Entropy , author=. arXiv preprint arXiv:2506.23456 , year=

  23. [23]

    Antonio Blanca and Zongchen Chen and Daniel Stefankovic and Eric Vigoda , title =

  24. [24]

    Topics and Techniques in Distribution Testing:

    Cl. Topics and Techniques in Distribution Testing:. Found. Trends Commun. Inf. Theory , volume =

  25. [25]

    arXiv preprint arXiv:2502.10527 , year=

    Algorithms and Hardness for Estimating Statistical Similarity , author=. arXiv preprint arXiv:2502.10527 , year=

  26. [26]

    Electron

    Kontorovich, Aryeh , TITLE =. Electron. Commun. Probab. , FJOURNAL =. 2025 , PAGES =

  27. [27]

    Arnab Bhattacharyya and Weiming Feng and Piyush Srivastava , title =

  28. [28]

    SODA , pages =

    Xiaoyu Chen and Xinyuan Zhang , title =. SODA , pages =

  29. [29]

    STOC , pages =

    Xiaoyu Chen and Zongchen Chen and Yitong Yin and Xinyuan Zhang , title =. STOC , pages =

  30. [30]

    STOC , pages =

    Guy Bresler , title =. STOC , pages =

  31. [31]

    Ankur Moitra , title =. J

  32. [32]

    COLT , series =

    Weiming Feng and Hongyang Liu and Minji Yang , title =. COLT , series =

  33. [33]

    ALT , series =

    Aryeh Kontorovich and Ariel Avital , title =. ALT , series =

  34. [34]

    arXiv preprint arXiv:2409.10368 , year=

    On the tensorization of the variational distance , author=. arXiv preprint arXiv:2409.10368 , year=

  35. [35]

    FOCS , pages =

    Xiaoyu Chen and Weiming Feng and Yitong Yin and Xinyuan Zhang , title =. FOCS , pages =

  36. [36]

    A Survey on Distribution Testing: Your Data is Big

    Canonne, Cl. A Survey on Distribution Testing: Your Data is Big. But is it Blue? , year =. doi:10.4086/toc.gs.2020.009 , publisher =

  37. [37]

    Testing Probability Distributions Underlying Aggregated Data , booktitle =

    Cl. Testing Probability Distributions Underlying Aggregated Data , booktitle =

  38. [38]

    ICML , series =

    Jamil Arbas and Hassan Ashtiani and Christopher Liaw , title =. ICML , series =

  39. [39]

    arXiv preprint arXiv:1810.08693 , year=

    The total variation distance between high-dimensional Gaussians with the same mean , author=. arXiv preprint arXiv:1810.08693 , year=

  40. [40]

    ICALP , series =

    Stefan Kiefer , title =. ICALP , series =

  41. [41]

    LICS , pages =

    Taolue Chen and Stefan Kiefer , title =. LICS , pages =

  42. [42]

    Corinna Cortes and Mehryar Mohri and Ashish Rastogi , title =. Int. J. Found. Comput. Sci. , volume =

  43. [43]

    Henzinger and Jean

    Laurent Doyen and Thomas A. Henzinger and Jean. Equivalence of Labeled Markov Chains , journal =

  44. [44]

    Murawski and Jo

    Stefan Kiefer and Andrzej S. Murawski and Jo. Language Equivalence for Probabilistic Automata , booktitle =

  45. [45]

    2003 , publisher=

    Counting, sampling and integrating: algorithms and complexity , author=. 2003 , publisher=

  46. [46]

    SODA , pages =

    Weiming Feng and Liqiang Liu and Tianren Liu , title =. SODA , pages =

  47. [47]

    Meel and Dimitrios Myrisiotis and A

    Arnab Bhattacharyya and Sutanu Gayen and Kuldeep S. Meel and Dimitrios Myrisiotis and A. Pavan and N. V. Vinodchandran , title =. IJCAI , pages =

  48. [48]

    2017 , publisher=

    Markov chains and mixing times , author=. 2017 , publisher=

  49. [49]

    A complete problem for statistical zero knowledge , author=. J. 2003 , publisher=

  50. [50]

    Andreas Galanis and Daniel Stefankovic and Eric Vigoda , title =. Comb. Probab. Comput. , volume =

  51. [51]

    FOCS , pages =

    Allan Sly and Nike Sun , title =. FOCS , pages =

  52. [52]

    arXiv , volume =

    Xiaoyu Chen and Weiming Feng and Heng Guo and Xinyuan Zhang and Zongrui Zou , title =. arXiv , volume =

  53. [53]

    TheoretiCS , volume =

    Weiming Feng and Heng Guo and Mark Jerrum and Jiaheng Wang , title =. TheoretiCS , volume =

  54. [54]

    Meel and Dimitrios Myrisiotis and A

    Arnab Bhattacharyya and Sutanu Gayen and Kuldeep S. Meel and Dimitrios Myrisiotis and A. Pavan and N. V. Vinodchandran , title =. ICML , publisher =

  55. [55]

    Meel and N

    Arnab Bhattacharyya and Sutanu Gayen and Kuldeep S. Meel and N. V. Vinodchandran , title =. NeurIPS , year =

  56. [56]

    Dyer and Catherine S

    Martin E. Dyer and Catherine S. Greenhill , title =. J. Algorithms , volume =

  57. [57]

    CoRR , volume =

    Xiaoyu Chen and Zongchen Chen and Yitong Yin and Xinyuan Zhang , title =. CoRR , volume =

  58. [58]

    Weiming Feng and Heng Guo and Jiaheng Wang , title =. Inf. Comput. , volume =

  59. [59]

    Hayes , title =

    Thomas P. Hayes , title =. FOCS , pages =

  60. [60]

    Amartya Shankha Biswas and Ronitt Rubinfeld and Anak Yodpinyanee , title =

  61. [61]

    Log-concave polynomials

    Nima Anari and Kuikui Liu and Shayan. Log-concave polynomials. STOC , pages =

  62. [62]

    A power law of order

    Long, Yun and Nachmias, Asaf and Ning, Weiyang and Peres, Yuval , fjournal =. A power law of order. Mem. Amer. Math. Soc. , number =

  63. [63]

    Entropy decay in the

    Antonio Blanca and Pietro Caputo and Daniel Parisi and Alistair Sinclair and Eric Vigoda , booktitle =. Entropy decay in the

  64. [64]

    Modified log-

    Cryan, Mary and Guo, Heng and Mousa, Giorgos , fjournal =. Modified log-. Ann. Probab. , number =

  65. [65]

    On Mixing of

    Antonio Blanca and Pietro Caputo and Zongchen Chen and Daniel Parisi and Daniel. On Mixing of

  66. [66]

    Fortuin, C. M. and Kasteleyn, P. W. , fjournal =. On the random-cluster model. Physica , pages =

  67. [67]

    arXiv , title =

    Nima Anari and Vishesh Jain and Frederic Koehler and Huy Tuan Pham and Thuy. arXiv , title =

  68. [68]

    The Complexity of Ferromagnetic

    Leslie Ann Goldberg and Mark Jerrum , journal =. The Complexity of Ferromagnetic

  69. [69]

    and Sokal, Alan D

    Edwards, Robert G. and Sokal, Alan D. , fjournal =. Generalization of the. Phys. Rev. D (3) , number =

  70. [70]

    Wilson , booktitle =

    Dana Randall and David B. Wilson , booktitle =. Sampling Spin Configurations of an

  71. [71]

    Rapid Mixing

    Sejun Park and Yunhun Jang and Andreas Galanis and Jinwoo Shin and Daniel. Rapid Mixing. AISTATS , pages =

  72. [72]

    Uniquely colourable graphs and the hardness of colouring graphs of large girth , volume =. Combin. Probab. Comput. , number =

  73. [73]

    Rapid Mixing from Spectral Independence beyond the

    Weiming Feng and Heng Guo and Yitong Yin and Chihao Zhang , booktitle =. Rapid Mixing from Spectral Independence beyond the

  74. [74]

    Rapid Mixing for Colorings via Spectral Independence , year =

    Zongchen Chen and Andreas Galanis and Daniel. Rapid Mixing for Colorings via Spectral Independence , year =

  75. [75]

    arXiv , author =:2107.03932 , journal =

    Perfect sampling for (atomic). arXiv , author =:2107.03932 , journal =

  76. [76]

    Coloring simple hypergraphs , volume =

    Frieze, Alan and Mubayi, Dhruv , journal =. Coloring simple hypergraphs , volume =

  77. [77]

    Frieze and P

    Alan M. Frieze and P. Randomly coloring simple hypergraphs , journal =

  78. [78]

    Nonuniversal critical dynamics in

    Swendsen, Robert and Wang, Jian-Sheng , issue =. Nonuniversal critical dynamics in. Phys. Rev. Lett. , pages =

  79. [79]

    STOC , pages=

    Sampling constraint satisfaction solutions in the local lemma regime , author=. STOC , pages=

  80. [80]

    On the sampling

    Vishesh Jain and Huy Tuan Pham and Thuy Duong Vuong , journal =. On the sampling

Showing first 80 references.