pith. machine review for the scientific record.

arxiv: 2605.11471 · v1 · submitted 2026-05-12 · 💻 cs.LG

Recognition: 2 theorem links · Lean Theorem

On the Approximation Complexity of Matrix Product Operator Born Machines

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 01:52 UTC · model grok-4.3

classification 💻 cs.LG
keywords matrix product operators · Born machines · KL divergence · approximation complexity · tensor networks · variational inference · spectral gap · locality
0 comments

The pith

Matrix product operator Born machines cannot efficiently approximate arbitrary continuous distributions because KL minimization is NP-hard, yet they succeed with polynomial resources for structured targets under suitable locality and spectral-gap conditions on the loss-induced Hamiltonian.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that finding the best matrix product operator Born machine approximation to a continuous probability distribution, measured by Kullback-Leibler divergence, is NP-hard. This rules out any general efficient procedure that works for all targets. At the same time the authors prove that when the target belongs to a structured family such as path-graph Markov random fields and the variational loss produces a local Hamiltonian possessing a spectral gap, then an MPO-BM with only polynomially growing bond dimension achieves a guaranteed KL error. They further show that the same Hamiltonian can be recovered from polynomially many queries to the score function. A reader cares because the results draw a sharp boundary between regimes where these tractable tensor-network models are reliable and regimes where they are provably intractable.
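For orientation, here is a minimal discrete sketch of the Born-machine construction underlying the paper: the model assigns p(x) ∝ |ψ(x)|², with ψ(x) obtained by contracting a tensor train, and the normalizer computed in polynomial time by transfer matrices. The paper's setting is continuous and MPO-based; this toy uses the simpler discrete MPS case, and every name in it (random_mps, born_amplitude, bond_dim) is illustrative rather than the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mps(n_sites, phys_dim, bond_dim):
    """Random tensor-train cores; boundary bond dimensions are fixed to 1."""
    dims = [1] + [bond_dim] * (n_sites - 1) + [1]
    return [rng.standard_normal((dims[i], phys_dim, dims[i + 1]))
            for i in range(n_sites)]

def born_amplitude(cores, x):
    """psi(x): contract the slices selected by configuration x, left to right."""
    v = np.ones((1, 1))
    for core, xi in zip(cores, x):
        v = v @ core[:, xi, :]
    return v[0, 0]

def normalization(cores):
    """Z = sum_x |psi(x)|^2 via transfer matrices, avoiding the 2^n sum."""
    E = np.ones((1, 1))
    for core in cores:
        # contract a core with its copy over the physical index
        E = np.einsum('ab,aic,bid->cd', E, core, core)
    return E[0, 0]

cores = random_mps(n_sites=6, phys_dim=2, bond_dim=4)
x = (0, 1, 1, 0, 1, 0)
p_x = born_amplitude(cores, x) ** 2 / normalization(cores)
```

The point of the hardness result is that fitting such a model to an arbitrary target in KL is intractable even though evaluation and normalization are cheap.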

Core claim

KL approximation is NP-hard for MPO-BMs in the continuous setting, ruling out universal efficient approximation in the worst case. Under locality and spectral-gap conditions on the loss-induced Hamiltonian, structured targets admit MPO-BM approximations with polynomial bond dimension and provable KL guarantees. Under the same locality structure, polynomially many score queries suffice to estimate the induced Hamiltonian and obtain such guarantees.
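Read literally, the objective whose worst-case hardness is asserted is presumably the following (notation ours, not quoted from the paper):

\[
\min_{\theta}\ \mathrm{KL}\big(p^{*}\,\|\,p_{\theta}\big)
= \min_{\theta}\int p^{*}(x)\,\log\frac{p^{*}(x)}{p_{\theta}(x)}\,dx,
\qquad
p_{\theta}(x)=\frac{|\psi_{\theta}(x)|^{2}}{Z_{\theta}},
\]

with ψ_θ carried by an MPO of bond dimension D. The negative result says no polynomial-time procedure drives this objective near its optimum for every continuous p*; the positive results bound it when p* is structured.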

What carries the argument

The loss-induced Hamiltonian together with its locality and spectral-gap properties, which determine whether an MPO-BM of polynomial bond dimension can achieve a provable KL bound for a given target.
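The two conditions are standard spectral notions; in their usual one-dimensional form (our rendering, since the paper's exact definitions are not quoted here) they read:

\[
H=\sum_{i=1}^{n-1} h_{i,i+1}
\quad\text{(locality: each term acts on a bounded neighborhood)},
\qquad
\gamma=\lambda_{1}(H)-\lambda_{0}(H)>0
\quad\text{(spectral gap)}.
\]

A gap bounded away from zero is what lets the argument trade approximation accuracy for polynomial bond dimension.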

If this is right

  • Path-graph Markov random fields and similar structured distributions admit MPO-BM approximations whose bond dimension scales only polynomially with system size.
  • The resulting approximations come with explicit, non-vacuous bounds on the Kullback-Leibler divergence.
  • The loss-induced Hamiltonian can be estimated to sufficient accuracy using only polynomially many evaluations of the score function (a toy version of this recipe is sketched after this list).
  • Score-based variational inference therefore inherits efficient learnability whenever the target satisfies the stated locality structure.
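The score-query bullet has a simple generic instantiation, shown here for intuition rather than as the paper's algorithm: if the induced Hamiltonian is a linear combination of known local features, H_θ(x) = Σ_k θ_k φ_k(x), then each score query s(x) = −∇H_θ(x) is linear in θ, so a handful of queries determines θ by least squares. The feature choice, the score_oracle, and the path-graph couplings below are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8  # number of sites on the path graph

def grad_features(x):
    """Gradients of hypothetical pairwise features phi_k(x) = x_k * x_{k+1}.
    Returns G of shape (n-1, n) with row k equal to grad phi_k(x)."""
    G = np.zeros((n - 1, n))
    for k in range(n - 1):
        G[k, k] = x[k + 1]
        G[k, k + 1] = x[k]
    return G

theta_true = rng.standard_normal(n - 1)  # unknown pairwise couplings

def score_oracle(x):
    """Score of p(x) proportional to exp(-H(x)) with H = sum_k theta_k phi_k:
    s(x) = -grad H(x), which is linear in theta."""
    return -theta_true @ grad_features(x)

# Each query yields n linear equations in the n-1 unknowns; stack a few.
xs = rng.standard_normal((5, n))
A = np.vstack([grad_features(x).T for x in xs])     # (5n, n-1)
b = np.concatenate([-score_oracle(x) for x in xs])  # (5n,)
theta_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(theta_hat, theta_true, atol=1e-8)
```

Locality is what keeps the feature count, and hence the query count, polynomial; without it the number of unknowns explodes.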

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hardness result implies that MPO-BMs are unlikely to serve as black-box approximators for generic high-dimensional continuous data without additional structural assumptions.
  • In applied settings one could first test whether a candidate loss produces a local gapped Hamiltonian before committing to an MPO-BM architecture.
  • Analogous complexity separations may exist for other tensor-network architectures used in probabilistic modeling.

Load-bearing premise

The positive approximation results require that the loss-induced Hamiltonian is both local and has a spectral gap.

What would settle it

An explicit family of continuous targets whose loss-induced Hamiltonians are local and gapped yet require super-polynomial bond dimension for any fixed KL error would falsify the positive claim; a polynomial-time algorithm that finds near-optimal MPO-BMs for arbitrary continuous targets would falsify the hardness claim.

Figures

Figures reproduced from arXiv: 2605.11471 by Chao Li, Jian Xu, Qibin Zhao, Yuchen Cong, Zerui Tao.

Figure 1. Approximation performance under global and local estimation of the Hamiltonian.
Figure 2. Relative spectral gaps of the induced Hamiltonian.
Figure 3. Fixed-accuracy query scaling.
Figure 4. Two-dimensional visualizations of the synthetic target distributions used in the experiments.
Figure 3 (detail). The top and bottom panels respectively show the forward KL divergence as a function …
Figure 5. More details for Figure 3. Comparison of query complexity between global and local …
Figure 6. Additional query-scaling results for the X-shape target distribution. The dashed horizontal …
Figure 7. Additional query-scaling results for the Ring target distribution. The dashed horizontal …
Figure 8. Additional query-scaling results for the GMM-3 target distribution. The dashed horizontal …
Figure 9. Additional query-scaling results for the Funnel target distribution. The dashed horizontal …
Original abstract

Matrix product operator Born machines (MPO-BMs) are tractable tensor-network models for probabilistic modeling, but their efficient approximation capability remains unclear. We characterize this boundary from both negative and positive perspectives. First, we prove that KL approximation is NP-hard for MPO-BMs in the continuous setting, ruling out universal efficient approximation in the worst case. Second, for score-based variational inference, we show that, under a locality and spectral-gap conditions on the loss-induced Hamiltonian, structured targets (e.g., path-graph Markov random fields) admit MPO-BM approximations with polynomial bond dimension and provable KL guarantees. Third, under the same locality structure, we prove that polynomially many score queries suffice to estimate the induced Hamiltonian and obtain such guarantees. Our results provide a theoretical characterization of when MPO-BMs are fundamentally hard to approximate and when they become efficiently learnable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper characterizes the approximation complexity of Matrix Product Operator Born Machines (MPO-BMs) for probabilistic modeling. It proves that KL approximation is NP-hard in the continuous setting. For score-based variational inference, under locality and spectral-gap conditions on the loss-induced Hamiltonian, structured targets (e.g., path-graph MRFs) admit MPO-BM approximations with polynomial bond dimension and provable KL guarantees. It further shows that polynomially many score queries suffice to estimate the induced Hamiltonian under the same locality structure.

Significance. If the central claims hold, the work provides a useful theoretical boundary between worst-case hardness and conditional tractability for MPO-BMs, which is significant for tensor-network approaches to density estimation and variational inference. The combination of an unconditional NP-hardness result with conditional positive guarantees (including query complexity) offers a balanced characterization that could guide future model design.

major comments (2)
  1. [Positive results on structured targets] Positive results section (abstract and main theorem on structured targets): The polynomial bond dimension and KL guarantees for targets such as path-graph MRFs are conditioned on the loss-induced Hamiltonian satisfying both locality and a spectral gap. The manuscript does not derive or verify that these conditions hold for the claimed examples; without this step the positive claims are unsupported and the approximation bounds become uncontrolled.
  2. [Hamiltonian estimation] Section on Hamiltonian estimation from score queries: The claim that polynomially many score queries suffice to recover the induced Hamiltonian (and thereby the KL guarantees) is stated under the same locality structure, but the precise dependence of the sample complexity on the bond dimension and gap parameters is not made explicit, making it difficult to assess whether the overall procedure remains polynomial for the targeted structured distributions.
minor comments (2)
  1. [Abstract] Abstract: 'a locality and spectral-gap conditions' should read 'locality and spectral-gap conditions' for grammatical consistency.
  2. [Notation] Notation for the loss-induced Hamiltonian should be introduced with a clear equation reference when first used in the positive-results section to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading of our manuscript and for the constructive feedback. We appreciate the positive assessment of the overall contribution and the balanced view of the hardness and tractability results. We address each major comment below and will revise the manuscript accordingly to improve clarity and completeness.

Point-by-point responses
  1. Referee: [Positive results on structured targets] Positive results section (abstract and main theorem on structured targets): The polynomial bond dimension and KL guarantees for targets such as path-graph MRFs are conditioned on the loss-induced Hamiltonian satisfying both locality and a spectral gap. The manuscript does not derive or verify that these conditions hold for the claimed examples; without this step the positive claims are unsupported and the approximation bounds become uncontrolled.

    Authors: We agree that an explicit verification strengthens the positive results. The main theorem is stated conditionally on locality and a spectral gap of the loss-induced Hamiltonian. For the illustrative example of path-graph MRFs, locality follows directly from the underlying Markov random field structure (nearest-neighbor interactions induce a local Hamiltonian). In the revision we will add a short derivation showing that, for bounded pairwise potentials on a path graph, the induced Hamiltonian remains local with interaction range independent of system size. We will also note that a uniform spectral gap can be guaranteed under standard assumptions (e.g., ferromagnetic or weakly interacting regimes) by appealing to known results on one-dimensional Ising-type models; when the gap is bounded away from zero the polynomial bond-dimension bound remains controlled. This addition will make the concrete applicability to path-graph MRFs fully rigorous. revision: yes
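As a concrete instance of the locality claim in this response (symbols chosen by us, not taken from the paper): a path-graph MRF with pairwise potentials has density

\[
p^{*}(x)\;\propto\;\exp\!\Big(-\sum_{i=1}^{n-1}\phi_{i}(x_{i},x_{i+1})\Big),
\]

so the induced energy H(x) = Σ_i φ_i(x_i, x_{i+1}) is 2-local on the path by construction; only the spectral gap requires a separate argument, which is the part the promised revision would supply.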

  2. Referee: [Hamiltonian estimation] Section on Hamiltonian estimation from score queries: The claim that polynomially many score queries suffice to recover the induced Hamiltonian (and thereby the KL guarantees) is stated under the same locality structure, but the precise dependence of the sample complexity on the bond dimension and gap parameters is not made explicit, making it difficult to assess whether the overall procedure remains polynomial for the targeted structured distributions.

    Authors: We thank the referee for highlighting the need for explicit dependence. The current argument shows that, under locality, the number of score queries needed to estimate the Hamiltonian to sufficient accuracy is polynomial in the system size. In the revision we will insert the precise scaling: the query complexity is O(n^3 D^2 / (γ^2 ε^2)) (up to logarithmic factors), where n is the number of sites, D the bond dimension, γ the spectral gap, and ε the target accuracy. For the structured targets where D and 1/γ are polynomial in n (or constant), the overall procedure therefore remains polynomial. This explicit bound will allow readers to verify that the end-to-end learning pipeline stays efficient precisely when the locality and gap conditions hold. revision: yes
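Taking the quoted bound at face value, the end-to-end polynomiality check is a one-line substitution; the exponents below follow from the rebuttal's stated scaling, which is a promised revision rather than a verified result:

\[
N_{\text{queries}}=\tilde{O}\!\left(\frac{n^{3}D^{2}}{\gamma^{2}\varepsilon^{2}}\right)
\quad\Longrightarrow\quad
N_{\text{queries}}=\tilde{O}\!\left(\frac{n^{9}}{\varepsilon^{2}}\right)
\ \ \text{when}\ \ D=O(n^{2}),\ \gamma=\Omega(1/n),
\]

which stays polynomial in n exactly as the response claims.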

Circularity Check

0 steps flagged

No significant circularity; results are conditional proofs under explicit assumptions

Full rationale

The paper establishes NP-hardness of KL approximation for MPO-BMs in the continuous case via direct proof, then gives conditional positive results (polynomial bond dimension and KL bounds) only when locality and spectral-gap conditions hold on the loss-induced Hamiltonian. These conditions are stated as assumptions rather than derived from the target result or fitted to data. No self-definitional loops, no parameters fitted on a subset and renamed as predictions, and no load-bearing self-citations that reduce the central claim to prior unverified work by the same authors. The derivation chain remains self-contained against external mathematical benchmarks once the stated assumptions are granted; the fact that the paper does not prove the assumptions hold for every example (e.g., path-graph MRFs) is a limitation on applicability, not a circularity in the logic itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Positive results depend on these two domain assumptions about Hamiltonian structure.

axioms (2)
  • domain assumption: Locality condition on the loss-induced Hamiltonian
    Enables polynomial bond-dimension approximations.
  • domain assumption: Spectral-gap condition on the loss-induced Hamiltonian
    Enables polynomial bond-dimension approximations.

pith-pipeline@v0.9.0 · 10358 in / 1000 out tokens · 79091 ms · 2026-05-13T01:52:58.434275+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
