pith. machine review for the scientific record.

arxiv: 2605.11471 · v1 · submitted 2026-05-12 · 💻 cs.LG

Recognition: 2 theorem links · Lean Theorem

On the Approximation Complexity of Matrix Product Operator Born Machines

Authors on Pith: no claims yet

Pith reviewed 2026-05-13 01:52 UTC · model grok-4.3

classification 💻 cs.LG
keywords matrix product operators · Born machines · KL divergence · approximation complexity · tensor networks · variational inference · spectral gap · locality
0 comments

The pith

Matrix product operator Born machines cannot efficiently approximate arbitrary continuous distributions because KL minimization is NP-hard, yet they succeed with polynomial resources for structured targets under suitable locality and spectral-gap conditions on the loss-induced Hamiltonian.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that finding the best matrix product operator Born machine approximation to a continuous probability distribution, measured by Kullback-Leibler divergence, is NP-hard. This rules out any general efficient procedure that works for all targets. At the same time the authors prove that when the target belongs to a structured family such as path-graph Markov random fields and the variational loss produces a local Hamiltonian possessing a spectral gap, then an MPO-BM with only polynomially growing bond dimension achieves a guaranteed KL error. They further show that the same Hamiltonian can be recovered from polynomially many queries to the score function. A reader cares because the results draw a sharp boundary between regimes where these tractable tensor-network models are reliable and regimes where they are provably intractable.
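For orientation, here is a minimal discrete sketch of the Born-machine construction underlying the paper: the model assigns p(x) ∝ |ψ(x)|², with ψ(x) obtained by contracting a tensor train, and the normalizer computed in polynomial time by transfer matrices. The paper's setting is continuous and MPO-based; this toy uses the simpler discrete MPS case, and every name in it (random_mps, born_amplitude, bond_dim) is illustrative rather than the authors' code.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_mps(n_sites, phys_dim, bond_dim):
    """Random tensor-train cores; boundary bond dimensions are fixed to 1."""
    dims = [1] + [bond_dim] * (n_sites - 1) + [1]
    return [rng.standard_normal((dims[i], phys_dim, dims[i + 1]))
            for i in range(n_sites)]

def born_amplitude(cores, x):
    """psi(x): contract the slices selected by configuration x, left to right."""
    v = np.ones((1, 1))
    for core, xi in zip(cores, x):
        v = v @ core[:, xi, :]
    return v[0, 0]

def normalization(cores):
    """Z = sum_x |psi(x)|^2 via transfer matrices, avoiding the 2^n sum."""
    E = np.ones((1, 1))
    for core in cores:
        # contract a core with its copy over the physical index
        E = np.einsum('ab,aic,bid->cd', E, core, core)
    return E[0, 0]

cores = random_mps(n_sites=6, phys_dim=2, bond_dim=4)
x = (0, 1, 1, 0, 1, 0)
p_x = born_amplitude(cores, x) ** 2 / normalization(cores)
```

The point of the hardness result is that fitting such a model to an arbitrary target in KL is intractable even though evaluation and normalization are cheap.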

Core claim

KL approximation is NP-hard for MPO-BMs in the continuous setting, ruling out universal efficient approximation in the worst case. Under locality and spectral-gap conditions on the loss-induced Hamiltonian, structured targets admit MPO-BM approximations with polynomial bond dimension and provable KL guarantees. Under the same locality structure, polynomially many score queries suffice to estimate the induced Hamiltonian and obtain such guarantees.
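Read literally, the objective whose worst-case hardness is asserted is presumably the following (notation ours, not quoted from the paper):

\[
\min_{\theta}\ \mathrm{KL}\big(p^{*}\,\|\,p_{\theta}\big)
= \min_{\theta}\int p^{*}(x)\,\log\frac{p^{*}(x)}{p_{\theta}(x)}\,dx,
\qquad
p_{\theta}(x)=\frac{|\psi_{\theta}(x)|^{2}}{Z_{\theta}},
\]

with ψ_θ carried by an MPO of bond dimension D. The negative result says no polynomial-time procedure drives this objective near its optimum for every continuous p*; the positive results bound it when p* is structured.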

What carries the argument

The loss-induced Hamiltonian together with its locality and spectral-gap properties, which determine whether an MPO-BM of polynomial bond dimension can achieve a provable KL bound for a given target.
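The two conditions are standard spectral notions; in their usual one-dimensional form (our rendering, since the paper's exact definitions are not quoted here) they read:

\[
H=\sum_{i=1}^{n-1} h_{i,i+1}
\quad\text{(locality: each term acts on a bounded neighborhood)},
\qquad
\gamma=\lambda_{1}(H)-\lambda_{0}(H)>0
\quad\text{(spectral gap)}.
\]

A gap bounded away from zero is what lets the argument trade approximation accuracy for polynomial bond dimension.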

If this is right

  • Path-graph Markov random fields and similar structured distributions admit MPO-BM approximations whose bond dimension scales only polynomially with system size.
  • The resulting approximations come with explicit, non-vacuous bounds on the Kullback-Leibler divergence.
  • The loss-induced Hamiltonian can be estimated to sufficient accuracy using only polynomially many evaluations of the score function (a toy version of this recipe is sketched after this list).
  • Score-based variational inference therefore inherits efficient learnability whenever the target satisfies the stated locality structure.
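The score-query bullet has a simple generic instantiation, shown here for intuition rather than as the paper's algorithm: if the induced Hamiltonian is a linear combination of known local features, H_θ(x) = Σ_k θ_k φ_k(x), then each score query s(x) = −∇H_θ(x) is linear in θ, so a handful of queries determines θ by least squares. The feature choice, the score_oracle, and the path-graph couplings below are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 8  # number of sites on the path graph

def grad_features(x):
    """Gradients of hypothetical pairwise features phi_k(x) = x_k * x_{k+1}.
    Returns G of shape (n-1, n) with row k equal to grad phi_k(x)."""
    G = np.zeros((n - 1, n))
    for k in range(n - 1):
        G[k, k] = x[k + 1]
        G[k, k + 1] = x[k]
    return G

theta_true = rng.standard_normal(n - 1)  # unknown pairwise couplings

def score_oracle(x):
    """Score of p(x) proportional to exp(-H(x)) with H = sum_k theta_k phi_k:
    s(x) = -grad H(x), which is linear in theta."""
    return -theta_true @ grad_features(x)

# Each query yields n linear equations in the n-1 unknowns; stack a few.
xs = rng.standard_normal((5, n))
A = np.vstack([grad_features(x).T for x in xs])     # (5n, n-1)
b = np.concatenate([-score_oracle(x) for x in xs])  # (5n,)
theta_hat, *_ = np.linalg.lstsq(A, b, rcond=None)
assert np.allclose(theta_hat, theta_true, atol=1e-8)
```

Locality is what keeps the feature count, and hence the query count, polynomial; without it the number of unknowns explodes.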

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hardness result implies that MPO-BMs are unlikely to serve as black-box approximators for generic high-dimensional continuous data without additional structural assumptions.
  • In applied settings one could first test whether a candidate loss produces a local gapped Hamiltonian before committing to an MPO-BM architecture.
  • Analogous complexity separations may exist for other tensor-network architectures used in probabilistic modeling.

Load-bearing premise

The positive approximation results require that the loss-induced Hamiltonian is both local and has a spectral gap.

What would settle it

An explicit family of continuous targets whose loss-induced Hamiltonians are local and gapped yet require super-polynomial bond dimension for any fixed KL error would falsify the positive claim; a polynomial-time algorithm that finds near-optimal MPO-BMs for arbitrary continuous targets would falsify the hardness claim.

Figures

Figures reproduced from arXiv: 2605.11471 by Chao Li, Jian Xu, Qibin Zhao, Yuchen Cong, Zerui Tao.

Figure 1. Approximation performance under global and local estimation of the Hamiltonian.
Figure 2. Relative spectral gaps of the induced Hamiltonian.
Figure 3. Fixed-accuracy query scaling.
Figure 4. Two-dimensional visualizations of the synthetic target distributions used in the experiments.
Figure 3 (detail). The top and bottom panels respectively show the forward KL divergence as a function …
Figure 5. More details for Figure 3. Comparison of query complexity between global and local …
Figure 6. Additional query-scaling results for the X-shape target distribution. The dashed horizontal …
Figure 7. Additional query-scaling results for the Ring target distribution. The dashed horizontal …
Figure 8. Additional query-scaling results for the GMM-3 target distribution. The dashed horizontal …
Figure 9. Additional query-scaling results for the Funnel target distribution. The dashed horizontal …
Original abstract

Matrix product operator Born machines (MPO-BMs) are tractable tensor-network models for probabilistic modeling, but their efficient approximation capability remains unclear. We characterize this boundary from both negative and positive perspectives. First, we prove that KL approximation is NP-hard for MPO-BMs in the continuous setting, ruling out universal efficient approximation in the worst case. Second, for score-based variational inference, we show that, under a locality and spectral-gap conditions on the loss-induced Hamiltonian, structured targets (e.g., path-graph Markov random fields) admit MPO-BM approximations with polynomial bond dimension and provable KL guarantees. Third, under the same locality structure, we prove that polynomially many score queries suffice to estimate the induced Hamiltonian and obtain such guarantees. Our results provide a theoretical characterization of when MPO-BMs are fundamentally hard to approximate and when they become efficiently learnable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper characterizes the approximation complexity of Matrix Product Operator Born Machines (MPO-BMs) for probabilistic modeling. It proves that KL approximation is NP-hard in the continuous setting. For score-based variational inference, under locality and spectral-gap conditions on the loss-induced Hamiltonian, structured targets (e.g., path-graph MRFs) admit MPO-BM approximations with polynomial bond dimension and provable KL guarantees. It further shows that polynomially many score queries suffice to estimate the induced Hamiltonian under the same locality structure.

Significance. If the central claims hold, the work provides a useful theoretical boundary between worst-case hardness and conditional tractability for MPO-BMs, which is significant for tensor-network approaches to density estimation and variational inference. The combination of an unconditional NP-hardness result with conditional positive guarantees (including query complexity) offers a balanced characterization that could guide future model design.

major comments (2)
  1. [Positive results on structured targets] Positive results section (abstract and main theorem on structured targets): The polynomial bond dimension and KL guarantees for targets such as path-graph MRFs are conditioned on the loss-induced Hamiltonian satisfying both locality and a spectral gap. The manuscript does not derive or verify that these conditions hold for the claimed examples; without this step the positive claims are unsupported and the approximation bounds become uncontrolled.
  2. [Hamiltonian estimation] Section on Hamiltonian estimation from score queries: The claim that polynomially many score queries suffice to recover the induced Hamiltonian (and thereby the KL guarantees) is stated under the same locality structure, but the precise dependence of the sample complexity on the bond dimension and gap parameters is not made explicit, making it difficult to assess whether the overall procedure remains polynomial for the targeted structured distributions.
minor comments (2)
  1. [Abstract] Abstract: 'a locality and spectral-gap conditions' should read 'locality and spectral-gap conditions' for grammatical consistency.
  2. [Notation] Notation for the loss-induced Hamiltonian should be introduced with a clear equation reference when first used in the positive-results section to improve readability.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the careful reading of our manuscript and for the constructive feedback. We appreciate the positive assessment of the overall contribution and the balanced view of the hardness and tractability results. We address each major comment below and will revise the manuscript accordingly to improve clarity and completeness.

Point-by-point responses
  1. Referee: [Positive results on structured targets] Positive results section (abstract and main theorem on structured targets): The polynomial bond dimension and KL guarantees for targets such as path-graph MRFs are conditioned on the loss-induced Hamiltonian satisfying both locality and a spectral gap. The manuscript does not derive or verify that these conditions hold for the claimed examples; without this step the positive claims are unsupported and the approximation bounds become uncontrolled.

    Authors: We agree that an explicit verification strengthens the positive results. The main theorem is stated conditionally on locality and a spectral gap of the loss-induced Hamiltonian. For the illustrative example of path-graph MRFs, locality follows directly from the underlying Markov random field structure (nearest-neighbor interactions induce a local Hamiltonian). In the revision we will add a short derivation showing that, for bounded pairwise potentials on a path graph, the induced Hamiltonian remains local with interaction range independent of system size. We will also note that a uniform spectral gap can be guaranteed under standard assumptions (e.g., ferromagnetic or weakly interacting regimes) by appealing to known results on one-dimensional Ising-type models; when the gap is bounded away from zero the polynomial bond-dimension bound remains controlled. This addition will make the concrete applicability to path-graph MRFs fully rigorous. revision: yes
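As a concrete instance of the locality claim in this response (symbols chosen by us, not taken from the paper): a path-graph MRF with pairwise potentials has density

\[
p^{*}(x)\;\propto\;\exp\!\Big(-\sum_{i=1}^{n-1}\phi_{i}(x_{i},x_{i+1})\Big),
\]

so the induced energy H(x) = Σ_i φ_i(x_i, x_{i+1}) is 2-local on the path by construction; only the spectral gap requires a separate argument, which is the part the promised revision would supply.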

  2. Referee: [Hamiltonian estimation] Section on Hamiltonian estimation from score queries: The claim that polynomially many score queries suffice to recover the induced Hamiltonian (and thereby the KL guarantees) is stated under the same locality structure, but the precise dependence of the sample complexity on the bond dimension and gap parameters is not made explicit, making it difficult to assess whether the overall procedure remains polynomial for the targeted structured distributions.

    Authors: We thank the referee for highlighting the need for explicit dependence. The current argument shows that, under locality, the number of score queries needed to estimate the Hamiltonian to sufficient accuracy is polynomial in the system size. In the revision we will insert the precise scaling: the query complexity is O(n^3 D^2 / (γ^2 ε^2)) (up to logarithmic factors), where n is the number of sites, D the bond dimension, γ the spectral gap, and ε the target accuracy. For the structured targets where D and 1/γ are polynomial in n (or constant), the overall procedure therefore remains polynomial. This explicit bound will allow readers to verify that the end-to-end learning pipeline stays efficient precisely when the locality and gap conditions hold. revision: yes
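Taking the quoted bound at face value, the end-to-end polynomiality check is a one-line substitution; the exponents below follow from the rebuttal's stated scaling, which is a promised revision rather than a verified result:

\[
N_{\text{queries}}=\tilde{O}\!\left(\frac{n^{3}D^{2}}{\gamma^{2}\varepsilon^{2}}\right)
\quad\Longrightarrow\quad
N_{\text{queries}}=\tilde{O}\!\left(\frac{n^{9}}{\varepsilon^{2}}\right)
\ \ \text{when}\ \ D=O(n^{2}),\ \gamma=\Omega(1/n),
\]

which stays polynomial in n exactly as the response claims.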

Circularity Check

0 steps flagged

No significant circularity; results are conditional proofs under explicit assumptions

Full rationale

The paper establishes NP-hardness of KL approximation for MPO-BMs in the continuous case via direct proof, then gives conditional positive results (polynomial bond dimension and KL bounds) only when locality and spectral-gap conditions hold on the loss-induced Hamiltonian. These conditions are stated as assumptions rather than derived from the target result or fitted to data. No self-definitional loops, no parameters fitted on a subset and renamed as predictions, and no load-bearing self-citations that reduce the central claim to prior unverified work by the same authors. The derivation chain remains self-contained against external mathematical benchmarks once the stated assumptions are granted; the fact that the paper does not prove the assumptions hold for every example (e.g., path-graph MRFs) is a limitation on applicability, not a circularity in the logic itself.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

Positive results depend on these two domain assumptions about Hamiltonian structure.

axioms (2)
  • domain assumption: Locality condition on the loss-induced Hamiltonian
    Enables polynomial bond-dimension approximations.
  • domain assumption: Spectral-gap condition on the loss-induced Hamiltonian
    Enables polynomial bond-dimension approximations.

pith-pipeline@v0.9.0 · 10358 in / 1000 out tokens · 79091 ms · 2026-05-13T01:52:58.434275+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
