
arxiv: 2605.28690 · v3 · pith:FQWHZH7Mnew · submitted 2026-05-27 · 🪐 quant-ph · cs.LG

Latent-Conditioned Parameterized Quantum Circuits as Universal Approximators for Distributions over Quantum States

Pith reviewed 2026-06-29 12:12 UTC · model grok-4.3

classification 🪐 quant-ph cs.LG
keywords parameterized quantum circuits · universal approximation · quantum generative modeling · Wasserstein distance · density operators · hybrid models · barren plateaus · latent conditioning

The pith

Latent-conditioned parameterized quantum circuits are universal approximators for distributions over quantum states in 1-Wasserstein distance.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a classical neural network can map random latent variables to the parameters of a quantum circuit so that the resulting ensemble approximates any target distribution of quantum states to arbitrary accuracy in 1-Wasserstein distance. This hybrid setup extends classical universal approximation results to the quantum case. It addresses the need to generate ensembles of states for quantum applications without preparing them one by one. The paper also reports that latent conditioning mitigates barren plateaus during circuit optimization.

Core claim

LPQCs consist of classical neural networks that take samples from a latent prior and output parameters for a parameterized quantum circuit. We prove that these models can approximate any probability measure on the space of density operators to arbitrary accuracy with respect to the 1-Wasserstein distance. The proof leverages the universal approximation property of the classical network to realize the required parameter mappings.

What carries the argument

The latent-conditioned parameterized quantum circuit (LPQC), where a classical network conditions the parameters of a quantum circuit on a latent variable sampled from a prior distribution.
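
To make the construction concrete, here is a minimal single-qubit sketch of the pipeline from latent sample to classical network to circuit angles to state. It is not the authors' implementation: the network shape, the Rz(φ)Ry(θ) ansatz, the standard-normal prior, and every name (`mlp`, `pqc_state`, `density`, `sample_lpqc`) are assumptions chosen for illustration. The circuit here produces pure states only; mixed states would need, for example, an ancilla that is traced out.

```python
import cmath
import math
import random

def mlp(z, w1, b1, w2, b2):
    """One-hidden-layer tanh network: latent vector z -> circuit angles."""
    hidden = [math.tanh(sum(w * x for w, x in zip(row, z)) + b)
              for row, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]

def pqc_state(theta, phi):
    """Amplitudes of Rz(phi) Ry(theta) |0> (hypothetical one-qubit ansatz)."""
    a0 = math.cos(theta / 2) * cmath.exp(-1j * phi / 2)
    a1 = math.sin(theta / 2) * cmath.exp(1j * phi / 2)
    return a0, a1

def density(a0, a1):
    """Density matrix |psi><psi| as a 2x2 nested list."""
    amps = (a0, a1)
    return [[amps[i] * amps[j].conjugate() for j in range(2)] for i in range(2)]

def sample_lpqc(rng, params, latent_dim=2):
    """Draw z from a standard-normal prior, then apply the network and the circuit."""
    z = [rng.gauss(0.0, 1.0) for _ in range(latent_dim)]
    theta, phi = mlp(z, *params)
    return density(*pqc_state(theta, phi))

rng = random.Random(0)
hidden_dim = 4
params = (
    [[rng.uniform(-1, 1) for _ in range(2)] for _ in range(hidden_dim)],  # w1
    [0.0] * hidden_dim,                                                    # b1
    [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(2)],  # w2
    [0.0, 0.0],                                                            # b2
)
ensemble = [sample_lpqc(rng, params) for _ in range(5)]
```

Each call to `sample_lpqc` returns one 2×2 density matrix. Training would adjust `params` so that the ensemble of such samples matches a target ensemble.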

If this is right

  • LPQCs enable generative modeling of quantum state ensembles for simulation and chemistry tasks.
  • Latent conditioning is shown empirically to alleviate barren plateaus during training, backed by partial rigorous guarantees.
  • The classical network's output dimension scales linearly with qubit number instead of exponentially.
  • On a synthetic multi-cluster ensemble and a QM9-derived molecular ensemble, LPQC matches a classical neural-network baseline and outperforms recent quantum generative baselines.
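
The scaling point can be illustrated with simple counting. A general n-qubit density matrix has 4^n − 1 real degrees of freedom, whereas a layered circuit at fixed depth has a number of angles proportional to n. The sketch below assumes a hypothetical ansatz with three rotation angles per qubit per layer and a depth of four; neither number comes from the paper.

```python
def density_matrix_dof(n_qubits):
    """Real degrees of freedom of a general n-qubit density matrix."""
    return 4 ** n_qubits - 1

def pqc_param_count(n_qubits, layers, rotations_per_qubit=3):
    """Angle count for a hypothetical layered ansatz (assumed, not the paper's)."""
    return rotations_per_qubit * n_qubits * layers

# (qubits, circuit angles at depth 4, density-matrix degrees of freedom)
rows = [(n, pqc_param_count(n, layers=4), density_matrix_dof(n)) for n in (2, 4, 8)]
for n, angles, dof in rows:
    print(f"n={n}: {angles} circuit angles vs {dof} real density-matrix parameters")
```

The linear count holds only at fixed depth; if the number of layers must grow with n to reach a given accuracy, the total grows faster.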

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The hybrid design may allow quantum models to handle high-dimensional distributions by using classical networks for the bulk of the expressivity.
  • Similar conditioning mechanisms could be applied to other quantum circuit families to achieve universality.
  • Testing on larger systems would reveal whether the linear scaling holds in practice for complex distributions.

Load-bearing premise

A classical neural network can approximate the mapping from latent variables to quantum circuit parameters well enough to achieve the target state distribution.

What would settle it

Finding a distribution over quantum density operators that no LPQC can approximate to within a small 1-Wasserstein distance, no matter how the classical network is chosen.
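
For small ensembles the quantity in question can be computed exactly. The sketch below assumes single-qubit states given as Bloch vectors, the trace distance as ground metric (which for one qubit is half the Euclidean distance between Bloch vectors), and two equal-size ensembles with uniform weights, in which case the optimal transport plan is a one-to-one matching. The function names and the example ensembles are illustrative, and the ground metric used in the paper may differ.

```python
import itertools
import math

def trace_distance(r, s):
    """Trace distance between single-qubit states given as Bloch vectors."""
    return 0.5 * math.dist(r, s)

def empirical_w1(xs, ys):
    """Exact 1-Wasserstein distance between two equal-size uniform ensembles.

    Brute force over all matchings, so only suitable for a handful of states.
    """
    n = len(xs)
    assert n == len(ys)
    return min(
        sum(trace_distance(xs[i], ys[p[i]]) for i in range(n)) / n
        for p in itertools.permutations(range(n))
    )

target = [(0, 0, 1), (0, 0, -1), (1, 0, 0)]    # |0>, |1>, |+>
generated = [(1, 0, 0), (0, 0, 1), (0, 0, 0)]  # |+>, |0>, maximally mixed
print(empirical_w1(target, generated))
```

Here two of the three generated states coincide with target states and the third sits at trace distance 1/2 from |1>, so the distance is (0 + 0 + 1/2)/3. For larger ensembles an assignment solver would replace the brute-force search.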

Figures

Figures reproduced from arXiv: 2605.28690 by Hirotaka Oshima, Koki Chinzei, Quoc Hoan Tran, Yasuhiro Endo.

Figure 1. Schematic of the LPQC framework for approximating the target distribution …
Figure 2. The performance of LPQC in learning the distribution of multi-clustered density matrices …
Figure 3. Visualization of target and generated ensembles using PCA and t-SNE for trained LPQC (same conditions in Fig. …
Figure 4. The test loss in training LPQC to learn the distribution …
Figure 5. The average squared gradient norm, normalized by the dimension of …
Figure 7. Test Wasserstein loss during training on the multi …
Figure 8 compares the performance of LPQC (with the latent Gaussian prior of M = 4 and E = 1) against the IMPE framework (see Appendix C for details) in terms of test loss as varying PQC layers L and training epochs. Solid and dashed lines denote averages over 10 trials for LPQC and 20 trials for IMPE (reproduced from Ref. [29]), respectively. For fairness, we maintain identical settings for training data size and…
Figure 9. Four representative novel molecules generated by LPQC trained on the QM9-derived subset. All molecules contain …
Figure 10. The performance of LPQC models in learning the distribution of multi-clustered density matrices …
Figure 11. The performance of LPQC in learning the distribution of a QM9-derived distribution (7 heavy atoms and 2 rings in …
Original abstract

Many applications in quantum simulation, quantum chemistry, and quantum machine learning require not a single quantum state but an ensemble of states characterizing the heterogeneity of a target system. Preparing such ensembles state-by-state is prohibitive in both variational and fault-tolerant settings, thereby motivating a generative modeling approach. We introduce latent-conditioned parameterized quantum circuits (LPQCs), a hybrid quantum-classical framework in which classical neural networks map a latent variable sampled from a prior distribution to the parameters of a parameterized quantum circuit. We prove that LPQCs are universal approximators for probability measures over density operators in the 1-Wasserstein distance, extending classical universal approximation theorems to the quantum-distribution setting. We additionally introduce a multimodal latent prior and a mixture-of-experts circuit architecture, and show empirically that the latent-conditioned parameterization alleviates the barren plateau problem during optimization, a behavior for which we provide rigorous partial guarantees. Numerical experiments validate the framework on a synthetic multi-cluster ensemble of mixed quantum states and on a QM9-derived ensemble of 3-D molecular structures. In these tasks, LPQC outperforms recent quantum generative baselines and matches the generation quality of a classical neural-network baseline, while requiring an output dimension that grows only linearly with the number of qubits rather than exponentially. By leveraging classical expressivity in the latent space, LPQCs offer a tractable route to quantum generative modeling.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 3 minor

Summary. The manuscript introduces latent-conditioned parameterized quantum circuits (LPQCs), in which a classical neural network maps latent variables drawn from a prior to the parameters of a parameterized quantum circuit (PQC). It proves that LPQCs are universal approximators for probability measures over density operators in the 1-Wasserstein distance by extending classical universal approximation theorems. It also introduces a multimodal latent prior together with a mixture-of-experts circuit architecture, and supplies partial rigorous guarantees that the latent conditioning alleviates barren plateaus. Numerical results on a synthetic multi-cluster ensemble of mixed states and on QM9-derived 3-D molecular structures show LPQCs outperforming recent quantum generative baselines while matching a classical neural-network baseline, with an output dimension linear in qubit number.

Significance. If the central universality claim holds, the work supplies a theoretically grounded route to generative modeling of ensembles of quantum states that is relevant to quantum simulation, chemistry, and machine learning. The linear scaling of the classical output dimension and the reported barren-plateau mitigation constitute concrete practical advantages over direct density-matrix or state-vector representations. The empirical match to classical baselines on the tested tasks further indicates that the hybrid construction can be competitive when the theoretical density condition is satisfied.

major comments (3)
  1. [§3] §3 (Universality theorem): The proof that the pushforward measure induced by the NN-composed map z ↦ θ(z) ↦ ρ(θ(z)) can approximate any target measure μ in 1-Wasserstein distance requires that the image of the PQC map θ ↦ ρ(θ) be dense in the space of density operators. No explicit density argument is supplied for arbitrary mixed states (e.g., via ancilla tracing or a universal gate set that generates all mixed states under partial trace), which is load-bearing for the extension of the classical UAT.
  2. [§4.2] §4.2 (Mixture-of-experts architecture): The claim that the MoE variant increases expressivity sufficiently to reach the required density is presented without a quantitative bound on the approximation error introduced by the gating network or on how the expert routing affects the continuity of the overall map; this directly impacts whether the universality result carries over to the implemented model.
  3. [§5] §5 (Barren-plateau guarantees): The partial rigorous guarantees for barren-plateau alleviation are stated to hold under the latent-conditioned parameterization, yet the precise assumptions on circuit depth, latent dimension, and the form of the prior under which the variance lower bound is derived are not spelled out, making it impossible to verify applicability to the multimodal prior and MoE circuits used in the experiments.
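
The ancilla-tracing route mentioned in the first major comment can be seen in the smallest possible case. The sketch below assumes one system qubit and one ancilla prepared in cos(α)|00> + sin(α)|11>, then traces out the ancilla; the function names are illustrative, and this is only a toy instance of the mechanism, not the density argument the referee asks for.

```python
import math

def reduced_state(psi):
    """Trace out the ancilla from a system+ancilla pure state.

    psi[s][a] is the amplitude of |s>_system |a>_ancilla.
    """
    dim = len(psi)
    return [[sum(psi[i][a] * psi[j][a].conjugate() for a in range(len(psi[0])))
             for j in range(dim)] for i in range(dim)]

def purity(rho):
    """Tr(rho^2): 1 for pure states, 1/d for the maximally mixed state."""
    d = len(rho)
    return sum(rho[i][j] * rho[j][i] for i in range(d) for j in range(d)).real

def entangled_pair(alpha):
    """cos(alpha)|00> + sin(alpha)|11>, e.g. Ry(2*alpha) on the system then CNOT."""
    return [[complex(math.cos(alpha)), 0j], [0j, complex(math.sin(alpha))]]

for alpha in (0.0, math.pi / 8, math.pi / 4):
    print(alpha, purity(reduced_state(entangled_pair(alpha))))
```

Sweeping α from 0 to π/4 moves the reduced state continuously from pure (purity 1) to maximally mixed (purity 1/2), which is the kind of coverage a density argument for mixed states would have to establish in general.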
minor comments (3)
  1. [§2] Notation: the symbol for the 1-Wasserstein distance is introduced without an explicit definition of the underlying metric on the space of density operators; adding the definition in §2 would improve readability.
  2. [Figure 3] Figure 3: the caption does not indicate whether the plotted fidelities are averaged over multiple random seeds or single runs; error bars or a statement of variability would clarify the comparison with baselines.
  3. [Introduction] References: several recent works on quantum generative models (e.g., on quantum Boltzmann machines and quantum GANs) are cited only in passing; a short dedicated paragraph in the introduction would better situate the contribution.
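
For reference, the definition requested in the first minor comment would typically take the following form. This is the standard Kantorovich formulation; the choice of the trace distance as ground metric is an assumption here, since the metric actually used in the paper is not stated in this report.

```latex
W_1(\mu,\nu) \;=\; \inf_{\gamma \in \Gamma(\mu,\nu)}
  \int d(\rho,\sigma)\,\mathrm{d}\gamma(\rho,\sigma),
\qquad
d(\rho,\sigma) \;=\; \tfrac{1}{2}\,\lVert \rho - \sigma \rVert_1 ,
```

Here $\mu$ and $\nu$ are probability measures on the space of density operators and $\Gamma(\mu,\nu)$ is the set of couplings, that is, joint measures with marginals $\mu$ and $\nu$.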

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the careful reading and constructive comments. Below we respond point-by-point to the major comments and indicate the revisions we will make.

  1. Referee: [§3] §3 (Universality theorem): The proof that the pushforward measure induced by the NN-composed map z ↦ θ(z) ↦ ρ(θ(z)) can approximate any target measure μ in 1-Wasserstein distance requires that the image of the PQC map θ ↦ ρ(θ) be dense in the space of density operators. No explicit density argument is supplied for arbitrary mixed states (e.g., via ancilla tracing or a universal gate set that generates all mixed states under partial trace), which is load-bearing for the extension of the classical UAT.

    Authors: We agree that an explicit statement of the density of the PQC image is necessary for the argument to be self-contained. The manuscript extends the classical UAT under the standing assumption (common in the PQC literature) that a universal gate set with ancilla qubits and partial trace yields a dense image in the space of density operators. In the revision we will insert a short clarifying paragraph in §3 that recalls this standard density result and states the precise conditions on the gate set under which it holds. revision: yes

  2. Referee: [§4.2] §4.2 (Mixture-of-experts architecture): The claim that the MoE variant increases expressivity sufficiently to reach the required density is presented without a quantitative bound on the approximation error introduced by the gating network or on how the expert routing affects the continuity of the overall map; this directly impacts whether the universality result carries over to the implemented model.

    Authors: The universality theorem is stated for the general LPQC construction; the MoE architecture is introduced in §4.2 as a practical implementation that empirically improves performance. Because the gating network is itself a continuous (Lipschitz) classical map, the overall composition remains continuous, so the push-forward argument is unaffected in principle. We do not supply quantitative error bounds for the MoE variant in the present manuscript. In the revision we will add a brief remark in §4.2 noting the continuity preservation while acknowledging that a full quantitative analysis of the MoE approximation error lies outside the scope of the current work. revision: partial

  3. Referee: [§5] §5 (Barren-plateau guarantees): The partial rigorous guarantees for barren-plateau alleviation are stated to hold under the latent-conditioned parameterization, yet the precise assumptions on circuit depth, latent dimension, and the form of the prior under which the variance lower bound is derived are not spelled out, making it impossible to verify applicability to the multimodal prior and MoE circuits used in the experiments.

    Authors: We will revise §5 to list the precise assumptions under which the variance lower bound is derived (logarithmic circuit depth, latent dimension linear in the number of PQC parameters, and a sub-Gaussian prior). We will also add a short discussion of the additional conditions needed to extend the bound to the multimodal prior and to the MoE routing, thereby making the applicability to the experimental settings explicit. revision: yes
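
The variance quantity at issue in the third exchange can be illustrated with a textbook toy model that is unrelated to the paper's circuits. Assume n qubits, one Ry(θ_i) rotation on each, and the global observable Z⊗…⊗Z, so the cost is C = ∏ cos θ_i. With angles drawn uniformly, the variance of ∂C/∂θ_1 is exactly 2^(−n), and the sketch below estimates it by Monte Carlo using the parameter-shift rule. All names are illustrative.

```python
import math
import random

def grad_first_angle(thetas):
    """d/d(theta_1) of C = prod_i cos(theta_i), via the parameter-shift rule."""
    def cost(ts):
        return math.prod(math.cos(t) for t in ts)
    plus = [thetas[0] + math.pi / 2] + thetas[1:]
    minus = [thetas[0] - math.pi / 2] + thetas[1:]
    return 0.5 * (cost(plus) - cost(minus))

def gradient_variance(n_qubits, samples, rng):
    """Monte Carlo variance of the gradient under uniform random angles."""
    grads = []
    for _ in range(samples):
        thetas = [rng.uniform(0, 2 * math.pi) for _ in range(n_qubits)]
        grads.append(grad_first_angle(thetas))
    mean = sum(grads) / samples
    return sum((g - mean) ** 2 for g in grads) / samples

rng = random.Random(1)
for n in (1, 2, 4, 6):
    # estimated variance next to the exact value 2**-n
    print(n, gradient_variance(n, 20000, rng), 2.0 ** -n)
```

A variance lower bound of the kind the authors describe would state conditions under which this exponential decay is avoided; the toy model only shows what is being bounded.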

Circularity Check

0 steps flagged

No circularity: universality proof extends classical UAT via composition without self-referential reduction


The derivation chain rests on the classical universal approximation theorem for neural networks (to map latents z to parameters θ(z)) composed with the continuity of the map θ ↦ ρ(θ) from a PQC, yielding density in the 1-Wasserstein metric on measures over density operators. This is a standard mathematical extension, not a reduction of the target result to a fitted parameter, self-definition, or self-citation chain. No equations or claims equate the claimed universality to its own inputs by construction; the mixture-of-experts and multimodal prior are architectural choices, not load-bearing for the proof. The result is therefore self-contained against external benchmarks (classical UAT and standard PQC continuity).
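
One way to make the composition step explicit is the following coupling bound. It is a sketch under the added assumption that $\theta \mapsto \rho(\theta)$ is $K$-Lipschitz with respect to the ground metric $d$, and it is not claimed to be the paper's proof. Here $P$ is the latent prior, $f$ is an idealized parameter map, and $g$ is its neural-network approximation.

```latex
W_1\bigl((\rho\circ f)_{\#}P,\ (\rho\circ g)_{\#}P\bigr)
  \;\le\; \mathbb{E}_{z\sim P}\, d\bigl(\rho(f(z)),\,\rho(g(z))\bigr)
  \;\le\; K\,\mathbb{E}_{z\sim P}\,\lVert f(z)-g(z)\rVert .
```

The first inequality uses the coupling that feeds the same $z$ to both maps; the second is the Lipschitz assumption. Driving the right-hand side to zero is then a purely classical approximation problem.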

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The universality claim rests primarily on the classical universal approximation theorem for neural networks and the introduction of the LPQC construction itself; no free parameters are required for the existence proof, though training involves optimization of circuit and network weights.

free parameters (1)
  • Neural network and PQC parameters
    Optimized during training but not required for the existence statement of universality.
axioms (1)
  • standard math · Classical universal approximation theorem for feed-forward neural networks
    Invoked to extend the result to the quantum distribution setting.
invented entities (2)
  • Latent-conditioned parameterized quantum circuit (LPQC) · no independent evidence
    purpose: Hybrid model that maps latent variables to quantum circuit parameters for distribution approximation
    Core new object introduced by the paper.
  • Multimodal latent prior · no independent evidence
    purpose: To capture multiple modes in the target distribution
    Architectural addition introduced alongside the main claim.
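
As an illustration of what a multimodal latent prior could look like, the sketch below samples from a uniform mixture of isotropic Gaussians. The number of modes, their means, the standard deviation, and the function name are all hypothetical; the paper's actual prior is not specified in this summary.

```python
import random

def sample_mixture_prior(rng, means, std=0.3):
    """Pick a mode uniformly, then add isotropic Gaussian noise around its mean."""
    mode = rng.randrange(len(means))
    z = [rng.gauss(m, std) for m in means[mode]]
    return mode, z

rng = random.Random(0)
means = [(-2.0, -2.0), (-2.0, 2.0), (2.0, -2.0), (2.0, 2.0)]  # 4 hypothetical modes
draws = [sample_mixture_prior(rng, means) for _ in range(1000)]
counts = [sum(1 for mode, _ in draws if mode == k) for k in range(len(means))]
print(counts)
```

Well-separated modes in latent space give the classical network an easier job when the target ensemble is itself clustered, since each cluster can be served by its own region of the prior.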

pith-pipeline@v0.9.1-grok · 5786 in / 1540 out tokens · 47897 ms · 2026-06-29T12:12:59.535335+00:00 · methodology


Reference graph

Works this paper leans on

42 extracted references · 11 canonical work pages · 3 internal anchors

  1. [1] The ensemble induces a probability measure $Q$ on $\mathcal{D}(\mathcal{H}_S)$, namely $Q = \sum_j p_j \delta_{\rho_j}$ in the atomic case, or more generally any probability measure on $\mathcal{D}(\mathcal{H}_S)$ when the ensemble is continuous. Note that $Q$ is not the same object as the averaged density matrix $\rho = \sum_j p_j \rho_j \in \mathcal{D}(\mathcal{H}_S)$: averaging discards the spectral and multimodal structure. Two ensembles with very differe...

  2. [2] B. M. Terhal and D. P. DiVincenzo, Phys. Rev. A 61, 022301 (2000)

  3. [3] D. Poulin and P. Wocjan, Phys. Rev. Lett. 103, 220502 (2009)

  4. [4] R. Ramakrishnan, P. O. Dral, M. Rupp, and O. A. von Lilienfeld, Scientific Data 1, 10.1038/sdata.2014.22 (2014)

  5. [5] L. Rathi, E. Tretschk, C. Theobalt, R. Dabral, and V. Golyanik, 3D-QAE: Fully quantum auto-encoding of 3D point clouds (2023)

  6. [6] H. Wu, X. Ye, and J. Yan, in The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024)

  7. [7] N. Liu and P. Rebentrost, Phys. Rev. A 97, 042315 (2018)

  8. [8] H. Zou, M. Rahm, A. F. Kockum, and S. Olsson, Npj Quantum Inf. 12, 10.1038/s41534-025-01159-x (2025)

  9. [9] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, Nat. Commun. 5, 4213 (2014)

  10. [10] D. S. Abrams and S. Lloyd, Phys. Rev. Lett. 83, 5162 (1999)

  11. [11] A. Gilyén, Y. Su, G. H. Low, and N. Wiebe, in Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing (STOC) (2019) pp. 193–204

  12. [12] M. Cerezo, A. Arrasmith, R. Babbush, S. C. Benjamin, S. Endo, K. Fujii, J. R. McClean, K. Mitarai, X. Yuan, L. Cincio, and P. J. Coles, Nature Reviews Physics 3, 625–644 (2021)

  13. [13] E. Farhi, J. Goldstone, and S. Gutmann, A quantum approximate optimization algorithm (2014), arXiv:1411.4028 [quant-ph]

  14. [14] T. Goto, Q. H. Tran, and K. Nakajima, Phys. Rev. Lett. 127, 090506 (2021)

  15. [15] A. Barthe, M. Grossi, S. Vallecorsa, J. Tura, and V. Dunjko, Npj Quantum Inf. 11, 10.1038/s41534-025-01064-3 (2025)

  16. [16] J. R. McClean, S. Boixo, V. N. Smelyanskiy, R. Babbush, and H. Neven, Nat. Commun. 9, 4812 (2018)

  17. [17] K. Hornik, Neural Networks 4, 251 (1991)

  18. [18] D. Fremlin, Measure Theory: Topological measure spaces. Volume 4 (Torres Fremlin, 2011)

  19. [19] R. A. Jacobs, M. I. Jordan, S. J. Nowlan, and G. E. Hinton, Neural Computation 3, 79 (1991)

  20. [20] M. I. Jordan and R. A. Jacobs, Neural Computation 6, 181 (1994)

  21. [21] N. Shazeer, A. Mirhoseini, K. Maziarz, A. Davis, Q. Le, G. Hinton, and J. Dean, in International Conference on Learning Representations (ICLR) (2017)

  22. [22] W. Fedus, B. Zoph, and N. Shazeer, Journal of Machine Learning Research 23, 1 (2022)

  23. [23] J. A. Miszczak, Z. Puchała, P. Horodecki, A. Uhlmann, and K. Życzkowski, Quantum Info. Comput. 9, 103–130 (2009)

  24. [24] S.-X. Zhang, J. Allcock, Z.-Q. Wan, S. Liu, J. Sun, H. Yu, X.-H. Yang, J. Qiu, Z. Ye, Y.-Q. Chen, C.-K. Lee, Y.-C. Zheng, S.-K. Jian, H. Yao, C.-Y. Hsieh, and S. Zhang, Quantum 7, 912 (2023)

  25. [25] J. Bradbury, R. Frostig, P. Hawkins, M. J. Johnson, C. Leary, D. Maclaurin, G. Necula, A. Paszke, J. VanderPlas, S. Wanderman-Milne, and Q. Zhang, JAX: composable transformations of Python+NumPy programs (2018)

  26. [26] J. Romero and A. Aspuru-Guzik, Adv. Quantum Technol. 4, 2000003 (2021)

  27. [27] S. Y. Chang, S. Thanasilp, B. Le Saux, S. Vallecorsa, and M. Grossi, Latent style-based quantum GAN for high-quality image generation (2024), arXiv:2406.02668 [quant-ph]

  28. [28] L. Friedrich and J. Maziero, Phys. Rev. A 106, 042433 (2022)

  29. [29] Z. Yi and R. Bhadani, Geometric optimization on Lie groups: A Lie-theoretic explanation of barren plateau mitigation for variational quantum algorithms (2025), arXiv:2512.02078 [quant-ph]

  30. [30] Q. H. Tran, K. Chinzei, Y. Endo, and H. Oshima, Universality of many-body projected ensemble for learning quantum data distribution (2026), arXiv:2601.18637 [quant-ph]

  31. [31] RDKit: Open-source cheminformatics, https://www.rdkit.org, version [2025.03.6]; DOI: 10.5281/zenodo.591637

  32. [32] B. Zhang, P. Xu, X. Chen, and Q. Zhuang, Phys. Rev. Lett. 132, 100602 (2024)

  33. [33] G. Cybenko, Math. Control. Signal 2, 303 (1989)

  34. [34] Z. Yu, Q. Chen, Y. Jiao, Y. Li, X. Lu, X. Wang, and J. Z. Yang, in Proceedings of the 38th International Conference on Neural Information Processing Systems, NIPS ’24 (Curran Associates Inc., Red Hook, NY, USA, 2024)

  35. [35] J. Choi, A. L. Shaw, I. S. Madjarov, X. Xie, R. Finkelstein, J. P. Covey, J. S. Cotler, D. K. Mark, H.-Y. Huang, A. Kale, H. Pichler, F. G. S. L. Brandão, S. Choi, and M. Endres, Nature 613, 468 (2023), 2103.03535

  36. [36] J. S. Cotler, D. K. Mark, H.-Y. Huang, F. Hernández, J. Choi, A. L. Shaw, M. Endres, and S. Choi, PRX Quantum 4, 010311 (2023), 2103.03536

  37. [37] Q. H. Tran, K. Chinzei, Y. Endo, and H. Oshima, Source code of the paper “Latent-conditioned parameterized quantum circuits as universal approximators for distributions over quantum states” (2026)

  38. [38] M. Ragone, B. N. Bakalov, F. Sauvage, A. F. Kemper, C. Ortiz Marrero, M. Larocca, and M. Cerezo, Nat. Commun. 15, 7172 (2024)

  39. [39] C. Li, H. Farkhoor, R. Liu, and J. Yosinski, in International Conference on Learning Representations (ICLR) (2018) arXiv:1804.08838

  40. [40] E. Grant, L. Wossnig, M. Ostaszewski, and M. Benedetti, Quantum 3, 214 (2019)

  41. [41] S. H. Sack, R. A. Medina, A. A. Michailidis, R. Kueng, and M. Serbyn, PRX Quantum 3, 020365 (2022)

    Appendix A: Encoding Molecules to Quantum States. The QM9 dataset features small organic molecules, each incorporating up to 9 heavy atoms (C, N, O, F) supplemented by hydrogens, for a maximum of 29 atoms overall per molecule. For any given molecule (indexed ...

  42. [42] to explain the IMPE method. The goal is to approximate a target distribution $Q_t$ over pure $n$-qubit states by constructing a parameterized ensemble $Q_\zeta$ through $T$ iterative cycles of unitary transformations and measurements. First, sample a training dataset $S = \{|\psi_0\rangle, \ldots, |\psi_{N_s-1}\rangle\}$ of $N_s$ states from $Q_t$. Start with an initial ensemble $\tilde{S}_0 = \{|\tilde{\psi}^{(0)}_j\rangle\}$...