pith. machine review for the scientific record.

arXiv: 2604.00149 · v2 · submitted 2026-03-31 · ⚛️ physics.comp-ph

Recognition: unknown

Towards Verifiable and Self-Correcting AI Physicists for Quantum Many-Body Simulations

Chen Mo, Di Luo, Guijing Duan, Jize Han, Junkun Huang, Ken Deng, Ling Qian, Runqing Zhang, Xiangfei Wang, Zhiguo Huang

Pith reviewed 2026-05-13 22:29 UTC · model grok-4.3

classification ⚛️ physics.comp-ph
keywords quantum many-body simulation · AI verification · self-correction · multi-agent framework · LLM benchmark · physics automation

The pith

A multi-agent AI system with built-in verifiers turns unreliable LLM outputs into correct quantum many-body simulations on a new benchmark of 100 real research tasks.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents QMP-Bench, a collection of 100 tasks drawn directly from published papers in quantum many-body physics, to test AI systems on realistic research problems. It then describes PhysVEC, a multi-agent setup that adds separate programming and scientific verifiers to check code correctness and adherence to physical laws at every step. These verifiers produce explicit evidence of errors and trigger corrections before the final output. Tests show PhysVEC beats standard large language models across the benchmark tasks and improves further when given more inference time. The work aims to make AI-generated physics results verifiable rather than merely plausible.
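
The verify-correct cycle described above is simple to state in code. Below is a minimal Python sketch of such a loop; every name in it (Verdict, run_with_verification, the generate and verifier callables) is a hypothetical stand-in for illustration, not PhysVEC's actual interface.

```python
from dataclasses import dataclass

@dataclass
class Verdict:
    passed: bool
    evidence: str  # interpretable evidence of what failed and why

def run_with_verification(task, generate, verifiers, max_rounds=3):
    """Generate a candidate solution, then loop: each verifier either
    passes it or returns evidence that is fed back for correction."""
    candidate = generate(task, feedback=None)
    for _ in range(max_rounds):
        verdicts = [v.check(task, candidate) for v in verifiers]
        failures = [v for v in verdicts if not v.passed]
        if not failures:
            return candidate  # all verifiers accept: done
        # Concatenate the evidence so the revision has concrete targets.
        feedback = "\n".join(f.evidence for f in failures)
        candidate = generate(task, feedback=feedback)
    return candidate  # best effort after max_rounds correction attempts
```

The detail the paper leans on is that a failed check returns evidence, not a bare pass/fail bit, so the correction step has something concrete to act on.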

Core claim

PhysVEC integrates programming and scientific verifiers to guarantee coding correctness and principle-based physical validity, yields interpretable evidence and error correction at each step, significantly outperforms existing LLM baselines across QMP-Bench scenarios, and exhibits favorable inference-time scaling that transforms unreliable AI generations into accurate physical reproductions.

What carries the argument

The PhysVEC multi-agent framework that couples programming verifiers for code checks with scientific verifiers for physical-principle checks.

Load-bearing premise

The verifiers can catch and fix both coding mistakes and physical-law violations on every task without missing real errors or introducing new systematic mistakes of their own.
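
Whether that premise holds is measurable in principle. Here is a hypothetical harness for estimating a verifier's miss rate by injecting known violations into otherwise-correct solutions, in the spirit of mutation testing; none of these names come from the paper.

```python
def miss_rate(verifier, correct_solutions, inject_violation):
    """Fraction of deliberately broken solutions the verifier still accepts.

    correct_solutions: iterable of (task, solution) pairs known to be valid.
    inject_violation:  function that plants a known physical or coding error.
    """
    missed = 0
    total = 0
    for task, solution in correct_solutions:
        broken = inject_violation(solution)
        if verifier.check(task, broken).passed:  # should have been rejected
            missed += 1
        total += 1
    return missed / total
```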

What would settle it

A new set of quantum many-body tasks where the verifiers accept a simulation that violates a conservation law or produces results inconsistent with known analytic limits.
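
To make that concrete: energy conservation under unitary time evolution is one principle a scientific verifier must not wave through. A minimal NumPy/SciPy sketch of such a check on a small spin chain, assuming nothing about the paper's actual verifier interface:

```python
import numpy as np
from scipy.linalg import expm

def heisenberg_chain(n):
    """Dense Hamiltonian of an n-site spin-1/2 Heisenberg chain, open ends."""
    sx = np.array([[0, 1], [1, 0]], dtype=complex) / 2
    sy = np.array([[0, -1j], [1j, 0]]) / 2
    sz = np.array([[1, 0], [0, -1]], dtype=complex) / 2
    def site_op(s, i):  # embed the single-site operator s at site i
        mats = [np.eye(2, dtype=complex)] * n
        mats[i] = s
        out = mats[0]
        for m in mats[1:]:
            out = np.kron(out, m)
        return out
    return sum(site_op(s, i) @ site_op(s, i + 1)
               for i in range(n - 1) for s in (sx, sy, sz))

def energy_is_conserved(H, psi0, t, tol=1e-8):
    """Verifier-style check: <H> before vs. after exact evolution exp(-iHt)."""
    psi_t = expm(-1j * t * H) @ psi0
    e0 = np.real(psi0.conj() @ H @ psi0)
    et = np.real(psi_t.conj() @ H @ psi_t)
    return abs(et - e0) < tol

H = heisenberg_chain(4)
rng = np.random.default_rng(0)
psi0 = rng.normal(size=H.shape[0]) + 1j * rng.normal(size=H.shape[0])
psi0 /= np.linalg.norm(psi0)
print(energy_is_conserved(H, psi0, t=1.0))  # True for exact unitary evolution
```

Exact unitary evolution conserves ⟨H⟩ to machine precision, so any drift beyond tolerance signals a coding or method error the verifier must surface.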

read the original abstract

While large language models (LLMs) promise to revolutionize automated scientific discovery, their application in rigorous real-world physical research is stalled by two critical barriers: a lack of realistic evaluation benchmarks and systemic LLM hallucinations. Here, we address both problems. We introduce QMP-Bench, a pioneering end-to-end research-level benchmark in quantum many-body simulation consisting of $100$ tasks extracted from $21$ high-impact prestigious journals, presenting a challenge even for current frontier LLMs. To establish a paradigm for reliable and transparent AI physicists, we present PhysVEC, a multi-agent framework that enforces self-verifiable and error correction in AI research. PhysVEC seamlessly integrates programming and scientific verifiers to guarantee coding correctness and principle-based physical validity, yielding interpretable evidence and error correction at each step. PhysVEC significantly outperforms existing LLM baselines on various scenarios in QMP-Bench and presents a favorable inference-time scaling, successfully transforming unreliable AI generations into accurate physical reproductions, paving a robust and trustworthy path towards future automated scientific discovery.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, and this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces QMP-Bench, an end-to-end benchmark of 100 quantum many-body simulation tasks extracted from 21 high-impact journals, and proposes PhysVEC, a multi-agent framework that integrates programming verifiers and scientific verifiers to enforce coding correctness and principle-based physical validity. It claims that PhysVEC significantly outperforms existing LLM baselines across scenarios in QMP-Bench, exhibits favorable inference-time scaling, and transforms unreliable generations into accurate physical reproductions.

Significance. If the performance claims and verifier robustness hold, the work supplies a concrete engineering contribution toward reliable AI-assisted discovery in physics by addressing hallucinations through explicit, interpretable verification steps. The benchmark itself is a useful addition for the field, as it moves beyond synthetic tasks to journal-derived problems; the multi-agent design with dual verifiers offers a reproducible template that could be extended to other domains.

major comments (2)
  1. [Abstract and evaluation section] The central performance claim (outperformance on QMP-Bench with favorable scaling) is load-bearing yet unsupported by any quantitative metrics, error bars, per-task breakdown, or ablation results in the abstract or evaluation description; without these, the assertion that PhysVEC 'significantly outperforms' baselines cannot be assessed.
  2. [Scientific verifiers description] The self-correction guarantee rests on the scientific verifiers catching violations of physical principles (e.g., broken symmetries, incorrect conservation laws, or invalid approximations in many-body Hamiltonians) across all 100 tasks; the manuscript provides no concrete specification of the principle list, how LLM-driven verifiers avoid false negatives on subtle inconsistencies that appear only in long-time dynamics or specific parameter regimes, or validation against ground-truth solutions where available.
minor comments (2)
  1. [Framework architecture] Clarify the exact interaction protocol among agents and the decision thresholds used by the verifiers to trigger correction loops.
  2. [Evaluation] Add a table or figure summarizing baseline models, exact QMP-Bench task categories, and success criteria to improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive comments and positive assessment of the potential significance of QMP-Bench and PhysVEC. We address the major comments point by point below. We will incorporate revisions to strengthen the presentation of quantitative results and the specification of the scientific verifiers.

read point-by-point responses
  1. Referee: [Abstract and evaluation section] The central performance claim (outperformance on QMP-Bench with favorable scaling) is load-bearing yet unsupported by any quantitative metrics, error bars, per-task breakdown, or ablation results in the abstract or evaluation description; without these, the assertion that PhysVEC 'significantly outperforms' baselines cannot be assessed.

    Authors: We agree that the abstract would benefit from explicit quantitative support. The full evaluation section already contains the requested metrics, error bars, per-task breakdowns, and ablation studies showing PhysVEC's outperformance and scaling behavior. We will revise the abstract to include key numerical results (e.g., success rates and scaling trends) and add explicit cross-references in the evaluation description to these detailed results. revision: yes

  2. Referee: [Scientific verifiers description] The self-correction guarantee rests on the scientific verifiers catching violations of physical principles (e.g., broken symmetries, incorrect conservation laws, or invalid approximations in many-body Hamiltonians) across all 100 tasks; the manuscript provides no concrete specification of the principle list, how LLM-driven verifiers avoid false negatives on subtle inconsistencies that appear only in long-time dynamics or specific parameter regimes, or validation against ground-truth solutions where available.

    Authors: We acknowledge the need for greater transparency here. We will expand the scientific verifiers section to provide an explicit enumerated list of enforced physical principles, describe the prompting and checking procedures used by the LLM-driven verifiers to detect violations (including checks for long-time dynamics and parameter-specific regimes), and report validation results against available ground-truth solutions for the subset of tasks where they exist. revision: yes
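
For illustration, one plausible entry on such an enumerated principle list is U(1) symmetry: a Hamiltonian claimed to conserve magnetization must commute with total S^z. A minimal sketch of that check (the function names are ours, not the authors'):

```python
import numpy as np

def total_sz(n):
    """Total S^z operator on n spin-1/2 sites."""
    sz = np.diag([0.5, -0.5])
    out = np.zeros((2**n, 2**n))
    for i in range(n):
        mats = [np.eye(2)] * n
        mats[i] = sz
        term = mats[0]
        for m in mats[1:]:
            term = np.kron(term, m)
        out = out + term
    return out

def conserves_total_sz(H, tol=1e-10):
    """Principle check: the commutator [H, S^z_tot] must vanish."""
    n = int(round(np.log2(H.shape[0])))
    Sz = total_sz(n)
    return np.linalg.norm(H @ Sz - Sz @ H) < tol
```

Reusing heisenberg_chain from the earlier sketch, conserves_total_sz(heisenberg_chain(4)) returns True, while adding a transverse-field term sum_i S^x_i flips it to False; that is the discriminating behavior the promised validation should document.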

Circularity Check

0 steps flagged

No circularity: empirical benchmark and framework evaluation

full rationale

The paper introduces QMP-Bench (100 tasks from 21 external journals) and PhysVEC (multi-agent verifiers for code and physics principles) as an engineering system. Claims of outperformance are direct empirical comparisons to LLM baselines on this independently defined benchmark, with no equations, fitted parameters renamed as predictions, self-definitional loops, or load-bearing self-citations that reduce the central result to its inputs by construction. The derivation chain consists of framework description plus external evaluation and is self-contained.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

The central claim rests on the domain assumption that verifiers can be constructed to enforce physical validity; no free parameters or invented physical entities are introduced.

axioms (1)
  • domain assumption Programming and scientific verifiers can be integrated to guarantee both coding correctness and principle-based physical validity
    This assumption underpins the entire PhysVEC error-correction loop.
invented entities (1)
  • PhysVEC multi-agent framework · no independent evidence
    purpose: To enforce self-verifiable and error correction in AI research for quantum simulations
    Newly proposed system whose reliability is demonstrated only within the paper's own benchmark.

pith-pipeline@v0.9.0 · 5503 in / 1239 out tokens · 48577 ms · 2026-05-13T22:29:12.896425+00:00 · methodology

discussion (0)


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. The Agentification of Scientific Research: A Physicist's Perspective

    cs.AI · 2026-04 · unverdicted · novelty 3.0

    AI will evolve from a research tool into a collaborator, fundamentally reshaping scientific collaboration, discovery, publishing, and evaluation while requiring continuous learning and idea diversity for original cont...

Reference graph

Works this paper leans on

42 extracted references · 42 canonical work pages · cited by 1 Pith paper · 3 internal anchors
