Improvise, Adapt, Overcome: An On-The-Fly Multifidelity Algorithm for Efficient Machine Learning

Peter Zaspel; Vivin Vinod

arxiv: 2606.02662 · v1 · pith:OBSNBZ3Onew · submitted 2026-06-01 · 💻 cs.LG · cs.AI· physics.chem-ph

Improvise, Adapt, Overcome: An On-The-Fly Multifidelity Algorithm for Efficient Machine Learning

Vivin Vinod , Peter Zaspel This is my paper

Pith reviewed 2026-06-28 15:20 UTC · model grok-4.3

classification 💻 cs.LG cs.AIphysics.chem-ph

keywords multifidelity machine learningadaptive algorithmsquantum chemistrymachine learningcoupled cluster energiesexcitation energiescost reductiontraining data efficiency

0 comments

The pith

An adaptive on-the-fly multifidelity algorithm decides training data composition dynamically across fidelity levels to cut quantum chemistry costs.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a machine learning framework for quantum chemistry that adapts in real time to choose how many training samples to generate at each accuracy level. Fixed-ratio multifidelity methods often produce redundant data because they do not check whether accuracy has already plateaued at cheaper levels. The new method queries additional samples only when needed at the current fidelity and advances to a higher, more expensive fidelity only after saturation occurs. If the approach holds, models for properties such as coupled-cluster energies and excitation energies reach target accuracy with far less total computation. This directly addresses the bottleneck of expensive reference calculations that limits the size and scope of machine-learned potentials and property predictors.

Core claim

The central claim is that an adaptive multifidelity machine learning procedure, by dynamically querying and adding training samples at each fidelity level, saturates model accuracy at lower fidelities before moving to higher-fidelity reference calculations, thereby reducing data-generation costs by up to a factor of 30 relative to single-fidelity training and by up to a factor of 5 relative to standard fixed-ratio multifidelity schemes across benchmarks on coupled-cluster energies and excitation energies.

What carries the argument

The on-the-fly adaptive algorithm that autonomously queries training samples at successive fidelity levels and decides when accuracy has saturated before advancing.

If this is right

High-accuracy models for coupled-cluster and excitation energies become feasible at substantially lower total computational expense.
Redundant multifidelity data generation is avoided by construction through saturation checks at each level.
The same adaptive logic applies to any chemical property for which calculations of graded accuracy exist.
A cost-aware pathway opens for scaling machine learning to larger systems where data generation was previously prohibitive.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The method could be combined with active-learning selection criteria inside each fidelity to further reduce sample counts.
Similar dynamic fidelity scheduling may transfer to other simulation domains that possess cheap and expensive solvers, such as fluid mechanics or electronic-structure methods beyond chemistry.
Long-term integration with automated workflow engines would allow fully autonomous model construction without manual ratio tuning.

Load-bearing premise

That accuracy at each lower-fidelity level can be driven to its practical maximum by adding samples without overlooking information that only the higher-fidelity calculations can supply.

What would settle it

A benchmark on a new molecular property in which the adaptive method either requires at least as many high-fidelity points as a fixed-ratio multifidelity baseline to reach the same error or produces a higher total cost while matching single-fidelity accuracy.

read the original abstract

Machine learning has accelerated quantum chemistry but is hindered by the prohibitive cost of generating high fidelity training data. Multifidelity machine learning (MFML) mitigates this overhead by systematically combining abundant low fidelity data with sparse high fidelity data. In spite of its success, standard MFML schemes rely on pre-defined scaling factors to determine sparse data ratio across fidelities, often generating redundant multifidelity data resulting in a loss of efficiency. Here, we introduce an adaptive on-the-fly multifidelity framework for machine learning that autonomously determines training dataset composition. By dynamically querying training samples at each fidelity, the algorithm saturates model accuracy at lower fidelities before moving up to more expensive reference calculations. We benchmark the novel adaptive-MFML across diverse chemical properties including the computational chemistry gold standard coupled cluster energies, and the more chemically challenging excitation energies. In our numerical experiments we show that our adaptive algorithm reduces data generation costs by up to a factor of 30 compared to single fidelity methods and improves upon standard MFML by up to a factor of 5. The mitigation of data redundancy establishes a high-accuracy low-cost pathway for sustainable cost-aware machine learning in quantum chemistry.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The adaptive on-the-fly MFML idea targets redundancy in standard schemes but the abstract leaves the stopping rule and validation too opaque to judge the gains.

read the letter

The main takeaway is an adaptive multifidelity framework that decides on the fly how many samples to add at each fidelity level instead of using fixed scaling ratios. It claims to cut data costs by up to 30 times versus single-fidelity training and 5 times versus conventional MFML on coupled-cluster energies and excitation energies.

What is actually new is the dynamic querying procedure that saturates accuracy at cheaper fidelities before requesting expensive higher-fidelity calculations. This directly addresses the redundancy that pre-defined ratios often create. The choice to test both a standard property and a harder one like excitation energies is sensible.

The paper does a clean job laying out the inefficiency in existing MFML and showing the potential payoff in the abstract. If the adaptive logic works as described, the cost savings would matter for groups that want larger or more diverse training sets in quantum chemistry.

The soft spots are the missing details. The abstract reports the numerical wins but gives no description of the saturation test, the validation metric, dataset sizes, or how error is measured. That makes it impossible to check whether the algorithm stops too early when fidelity correlations are weak, which is a real risk for excitation energies. The stress-test concern about hidden accuracy loss therefore stands until the methods are shown.

This is for readers working on cost-aware ML for chemistry who already know standard MFML. A serious referee should see the full implementation and results to decide if the adaptive part delivers without trading off accuracy. I would send it to review.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces an adaptive on-the-fly multifidelity machine learning algorithm that autonomously determines training dataset composition across fidelity levels by dynamically querying samples and saturating model accuracy at lower fidelities before moving to higher-fidelity calculations. It benchmarks the approach on coupled-cluster energies and excitation energies, claiming data-generation cost reductions of up to a factor of 30 versus single-fidelity methods and up to a factor of 5 versus standard MFML.

Significance. If the adaptive saturation procedure functions reliably, the method could substantially lower computational barriers to high-accuracy ML models in quantum chemistry while reducing redundant high-fidelity calculations. The work merits credit for its emphasis on on-the-fly adaptation to mitigate data redundancy and for including benchmarks on both standard (coupled-cluster) and more challenging (excitation energies) properties.

major comments (2)

[Abstract] Abstract: the central efficiency claims depend on the saturation test, yet no description of the stopping rule, cross-validation scheme, validation metric, or error threshold is supplied; without these details it is impossible to evaluate whether lower-fidelity saturation reliably captures all information needed at the target fidelity, especially for excitation energies where inter-fidelity correlations are often weaker.
[Numerical experiments] Numerical experiments section: the reported factors of 30 and 5 are presented without dataset sizes, exclusion criteria, number of independent runs, or error bars, preventing assessment of whether the observed gains are statistically robust or reproducible.

minor comments (1)

[Abstract] Abstract: the phrase 'computational chemistry gold standard' for coupled-cluster energies could be made more precise by specifying the exact level (e.g., CCSD(T)).

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive report. We address each major comment below and will revise the manuscript to improve clarity and completeness.

read point-by-point responses

Referee: [Abstract] Abstract: the central efficiency claims depend on the saturation test, yet no description of the stopping rule, cross-validation scheme, validation metric, or error threshold is supplied; without these details it is impossible to evaluate whether lower-fidelity saturation reliably captures all information needed at the target fidelity, especially for excitation energies where inter-fidelity correlations are often weaker.

Authors: We agree the abstract is too terse on this point. The stopping rule (saturation of cross-validation error below a fixed threshold), the 5-fold cross-validation scheme, the MAE validation metric, and the 0.01 eV error threshold are fully specified in Section 3 (Methods). We will add a single sentence to the abstract summarizing these elements. On the specific concern for excitation energies, the numerical results in Section 4 demonstrate that the adaptive procedure still yields the reported cost reductions even when inter-fidelity correlations are weaker, because the algorithm only escalates fidelity once lower-fidelity models have demonstrably saturated. revision: yes
Referee: [Numerical experiments] Numerical experiments section: the reported factors of 30 and 5 are presented without dataset sizes, exclusion criteria, number of independent runs, or error bars, preventing assessment of whether the observed gains are statistically robust or reproducible.

Authors: We accept this criticism. The revised Numerical experiments section will explicitly state the training-set sizes at each fidelity, the exclusion criteria (outlier removal based on energy deviation >3σ), the number of independent runs (10), and error bars (standard deviation across runs). These additions will allow direct evaluation of statistical robustness. revision: yes

Circularity Check

0 steps flagged

No circularity; adaptive MFML is a procedural algorithm with empirical benchmarks

full rationale

The paper presents an on-the-fly adaptive multifidelity algorithm that dynamically queries samples to saturate accuracy at lower fidelities before escalating. No equations, fitted parameters, or self-citations are described that would make the reported cost reductions (factors of 30 vs single-fidelity, 5 vs standard MFML) reduce to inputs by construction. The claims rest on numerical experiments across chemical properties rather than any self-definitional or fitted-input structure. This is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

No free parameters, axioms, or invented entities are identifiable from the abstract alone; the contribution is described purely as an algorithmic change to data acquisition strategy.

pith-pipeline@v0.9.1-grok · 5743 in / 1081 out tokens · 32663 ms · 2026-06-28T15:20:57.802838+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

34 extracted references · 27 canonical work pages

[1]

Biometrika87(1), 1–13 (2000) https: //doi.org/10.1093/biomet/87.1.1 15

Kennedy, M., O’Hagan, A.: Predicting the output from a complex computer code when fast approximations are available. Biometrika87(1), 1–13 (2000) https: //doi.org/10.1093/biomet/87.1.1 15

work page doi:10.1093/biomet/87.1.1 2000
[2]

Gratiet, L.L., Garnier, J.: Recursive co-kriging model for design of computer experiments with multiple levels of fidelity. Int. J. Uncertainty Quantif.4(5) (2014) https://doi.org/10.1615/Int.J.UncertaintyQuantification.2014006914

work page doi:10.1615/int.j.uncertaintyquantification.2014006914 2014
[3]

Fern´ andez-Godino, M.G.: Review of multi-fidelity models. Adv. Comput. Sci. Eng.1(4), 351–400 (2023) https://doi.org/10.3934/acse.2023015

work page doi:10.3934/acse.2023015 2023
[4]

Dral, P.O.: Quantum chemistry in the age of machine learning. J. Phys. Chem. Lett.11(6), 2336–2347 (2020) https://doi.org/10.1021/acs.jpclett.9b03664

work page doi:10.1021/acs.jpclett.9b03664 2020
[5]

Westermayr, J., Gastegger, M., Sch¨ utt, K.T., Maurer, R.J.: Perspective on inte- grating machine learning into computational chemistry and materials science. J. of Chem. Phys.154(23), 230903 (2021) https://doi.org/10.1063/5.0047760

work page doi:10.1063/5.0047760 2021
[6]

Crawford, T.D., Schaefer III, H.F.: An Introduction to Coupled Cluster Theory for Computational Chemists, pp. 33–136. John Wiley & Sons, Ltd, (2000). Chap

2000
[7]

https://doi.org/10.1002/9780470125915.ch2

work page doi:10.1002/9780470125915.ch2
[8]

Ramakrishnan, R., Dral, P.O., Rupp, M., Lilienfeld, O.A.: Big data meets quan- tum chemistry approximations: The ∆-machine learning approach. J. Chem. The- ory Comput.11(5), 2087–2096 (2015) https://doi.org/10.1021/acs.jctc.5b00099

work page doi:10.1021/acs.jctc.5b00099 2087
[9]

Pilania, G., Gubernatis, J.E., Lookman, T.: Multi-fidelity machine learning mod- els for accurate bandgap predictions of solids. Comput. Mater. Sci.129, 156–163 (2017) https://doi.org/10.1016/j.commatsci.2016.12.004

work page doi:10.1016/j.commatsci.2016.12.004 2017
[10]

Zaspel, P., Huang, B., Harbrecht, H., Von Lilienfeld, O.A.: Boosting quan- tum machine learning models with a multilevel combination technique: Pople Diagrams revisited. J. Chem. Theory Comput.15(3), 1546–1559 (2019) https: //doi.org/10.1021/acs.jctc.8b00832

work page doi:10.1021/acs.jctc.8b00832 2019
[11]

Vinod, V., Maity, S., Zaspel, P., Kleinekath¨ ofer, U.: Multifidelity machine learning for molecular excitation energies. J. Chem. Theory Comput.19(21), 7658–7670 (2023) https://doi.org/10.1021/acs.jctc.3c00882

work page doi:10.1021/acs.jctc.3c00882 2023
[12]

Ruth, M., Gerbig, D., Schreiner, P.R.: Machine learning for bridging the gap between density functional theory and coupled cluster energies. J. Chem. Theory and Comp.19(15), 4912–4920 (2023) https://doi.org/10.1021/acs.jctc.3c00274

work page doi:10.1021/acs.jctc.3c00274 2023
[13]

Schreiner, P., Kleinekath¨ ofer, U., Zaspel, P.: Pre- dicting molecular energies of small organic molecules with multi-fidelity methods

Vinod, V., Lyu, D., Ruth, M., R. Schreiner, P., Kleinekath¨ ofer, U., Zaspel, P.: Pre- dicting molecular energies of small organic molecules with multi-fidelity methods. J. Comp. Chem.46(6), 70056 (2025) https://doi.org/10.1002/jcc.70056

work page doi:10.1002/jcc.70056 2025
[14]

https://arxiv.org/abs/2604.00069

Sandonas, L.M., Balcells, D., Bochkarev, A., Cole, J.M., Deringer, V.L., Dobrautz, W., Ehrenhofer, A., Frank, T., Friederich, P., Friedrich, R., George, J., Ghiringhelli, L., Caldas, A.H., Juraskova, V., Kneiding, H., Lysogorskiy, Y., Margraf, J.T., T¨ urk, H., Lilienfeld, A., Todorovi´ c, M., Tkatchenko, A., Rossi, M., 16 Cuniberti, G.: Perspective: Towa...

arXiv 2026
[15]

Dral, P.O., Owens, A., Dral, A., Cs´ anyi, G.: Hierarchical machine learning of potential energy surfaces. J. Chem. Phys.152(20), 204110 (2020) https://doi. org/10.1063/5.0006498

work page doi:10.1063/5.0006498 2020
[16]

Vinod, V., Zaspel, P.: Benchmarking data efficiency in ∆-ML and multifidelity models for quantum chemistry. J. Chem. Phys.163(2), 024134 (2025) https: //doi.org/10.1063/5.0272457

work page doi:10.1063/5.0272457 2025
[17]

Vinod, V., Zaspel, P.: Investigating data hierarchies in multifidelity machine learning for excitation energies. J. Chem. Theory Comput.21(6), 3077–3091 (2025) https://doi.org/10.1021/acs.jctc.4c01491

work page doi:10.1021/acs.jctc.4c01491 2025
[18]

Lyu, D., Vinod, V., Holzenkamp, M., Holtkamp, Y.M., Maity, S., Salazar, C.R., Kleinekath¨ ofer, U., Zaspel, P.: Excitation energy transfer between porphyrin dyes on a clay surface: A study employing multifidelity machine learning. Adv. Theory Simul.8(11), 00271 (2025) https://doi.org/10.1002/adts.202500271

work page doi:10.1002/adts.202500271 2025
[19]

ChemRxiv2026(0504) (2026) https://doi.org/10.26434/chemrxiv.15002714/v1

Maity, S., Vinod, V., Zaspel, P., Kleinekath¨ ofer, U.: ∆-machine learning for LC- DFT-level excitation energies of bacteriochlorophyll molecules in a LH2 complex. ChemRxiv2026(0504) (2026) https://doi.org/10.26434/chemrxiv.15002714/v1

work page doi:10.26434/chemrxiv.15002714/v1 2026
[20]

Acta Numerica13, 147–269 (2004) https://doi.org/10.1017/S0962492904000182

Bungartz, H.-J., Griebel, M.: Sparse grids. Acta Numerica13, 147–269 (2004) https://doi.org/10.1017/S0962492904000182

work page doi:10.1017/s0962492904000182 2004
[21]

Vinod, V., Kleinekath¨ ofer, U., Zaspel, P.: Optimized multifidelity machine learn- ing for quantum chemistry. Mach. Learn.: Sci. Technol.5(1), 015054 (2024) https://doi.org/10.1088/2632-2153/ad2cef

work page doi:10.1088/2632-2153/ad2cef 2024
[22]

Zhang, L., Zhang, S., Owens, A., Yurchenko, S.N., Dral, P.O.: VIB5 database with accurate ab initio quantum chemical molecular potential energy surfaces. Sci. Data9(1), 84 (2022) https://doi.org/10.1038/s41597-022-01185-w

work page doi:10.1038/s41597-022-01185-w 2022
[23]

Vinod, V., Zaspel, P.: QeMFi: A multifidelity dataset of quantum chemical prop- erties of diverse molecules. Sci. Data12(1), 202 (2025) https://doi.org/10.1038/ s41597-024-04247-3

2025
[24]

Zenodo (2024) https://doi.org/10

Vinod, V., Zaspel, P.: QeMFi: A multifidelity dataset of quantum chemical prop- erties of diverse molecules (1.1.0) [dataset]. Zenodo (2024) https://doi.org/10. 5281/zenodo.13925688

2024
[25]

Pinheiro Jr, M., Zhang, S., Dral, P.O., Barbatti, M.: WS22 database, Wigner Sam- pling and geometry interpolation for configurationally diverse molecular datasets. Sci. Data10(1), 95 (2023) https://doi.org/10.1038/s41597-023-01998-3 17

work page doi:10.1038/s41597-023-01998-3 2023
[26]

Westermayr, J., Marquetand, P.: Machine learning for electronically excited states of molecules. Chem. Rev.121(16), 9873–9926 (2020) https://doi.org/10.1021/ acs.chemrev.0c00749

2020
[27]

Dral, P.O., Barbatti, M.: Molecular excited states through a machine learn- ing lens. Nat. Rev. Chem.5(6), 388–405 (2021) https://doi.org/10.1038/ s41570-021-00278-1

2021
[28]

Smith, J.S., Zubatyuk, R., Nebgen, B., Lubbers, N., Barros, K., Roitberg, A.E., Isayev, O., Tretiak, S.: The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data7(1), 134 (2020) https://doi.org/10.1038/s41597-020-0473-z

work page doi:10.1038/s41597-020-0473-z 2020
[29]

Bartlett, R.J., Musia l, M.: Coupled-cluster theory in quantum chemistry. Rev. Mod. Phys.79, 291–352 (2007) https://doi.org/10.1103/RevModPhys.79.291

work page doi:10.1103/revmodphys.79.291 2007
[30]

Gao, X., Ramezanghorbani, F., Isayev, O., Smith, J.S., Roitberg, A.E.: TorchANI: A free and open source PyTorch-based deep learning implementation of the ani neural network potentials. J. Chem. Inf. Modeling60(7), 3408–3415 (2020) https: //doi.org/10.1021/acs.jcim.0c00451

work page doi:10.1021/acs.jcim.0c00451 2020
[31]

Vinod, V., Zaspel, P.: LFaB: low fidelity as bias for active learning in the chemical configuration space. J. Chem. Theory Comput. (2026) https://doi.org/10.1021/ acs.jctc.6c00009

2026
[32]

Smith, J.S., Nebgen, B., Lubbers, N., Isayev, O., Roitberg, A.E.: Less is more: Sampling chemical space with active learning. J. Chem. Phys.148(24), 241733 (2018) https://doi.org/10.1063/1.5023802

work page doi:10.1063/1.5023802 2018
[33]

Qu, C., Houston, P.L., Conte, R., Nandi, A., Bowman, J.M.: Breaking the coupled cluster barrier for machine-learned potentials of large molecules: The case of 15- atom acetylacetone. J. Phys. Chem. Lett.12(20), 4902–4909 (2021) https://doi. org/10.1021/acs.jpclett.1c01142

work page doi:10.1021/acs.jpclett.1c01142 2021
[34]

Vinod, V., Zaspel, P.: Assessing non-nested configurations of multifidelity machine learning for quantum-chemical properties. Mach. Learn.: Sci. Technol.5(4), 045005 (2024) https://doi.org/10.1088/2632-2153/ad7f25 18

work page doi:10.1088/2632-2153/ad7f25 2024

[1] [1]

Biometrika87(1), 1–13 (2000) https: //doi.org/10.1093/biomet/87.1.1 15

Kennedy, M., O’Hagan, A.: Predicting the output from a complex computer code when fast approximations are available. Biometrika87(1), 1–13 (2000) https: //doi.org/10.1093/biomet/87.1.1 15

work page doi:10.1093/biomet/87.1.1 2000

[2] [2]

Gratiet, L.L., Garnier, J.: Recursive co-kriging model for design of computer experiments with multiple levels of fidelity. Int. J. Uncertainty Quantif.4(5) (2014) https://doi.org/10.1615/Int.J.UncertaintyQuantification.2014006914

work page doi:10.1615/int.j.uncertaintyquantification.2014006914 2014

[3] [3]

Fern´ andez-Godino, M.G.: Review of multi-fidelity models. Adv. Comput. Sci. Eng.1(4), 351–400 (2023) https://doi.org/10.3934/acse.2023015

work page doi:10.3934/acse.2023015 2023

[4] [4]

Dral, P.O.: Quantum chemistry in the age of machine learning. J. Phys. Chem. Lett.11(6), 2336–2347 (2020) https://doi.org/10.1021/acs.jpclett.9b03664

work page doi:10.1021/acs.jpclett.9b03664 2020

[5] [5]

Westermayr, J., Gastegger, M., Sch¨ utt, K.T., Maurer, R.J.: Perspective on inte- grating machine learning into computational chemistry and materials science. J. of Chem. Phys.154(23), 230903 (2021) https://doi.org/10.1063/5.0047760

work page doi:10.1063/5.0047760 2021

[6] [6]

Crawford, T.D., Schaefer III, H.F.: An Introduction to Coupled Cluster Theory for Computational Chemists, pp. 33–136. John Wiley & Sons, Ltd, (2000). Chap

2000

[7] [7]

https://doi.org/10.1002/9780470125915.ch2

work page doi:10.1002/9780470125915.ch2

[8] [8]

Ramakrishnan, R., Dral, P.O., Rupp, M., Lilienfeld, O.A.: Big data meets quan- tum chemistry approximations: The ∆-machine learning approach. J. Chem. The- ory Comput.11(5), 2087–2096 (2015) https://doi.org/10.1021/acs.jctc.5b00099

work page doi:10.1021/acs.jctc.5b00099 2087

[9] [9]

Pilania, G., Gubernatis, J.E., Lookman, T.: Multi-fidelity machine learning mod- els for accurate bandgap predictions of solids. Comput. Mater. Sci.129, 156–163 (2017) https://doi.org/10.1016/j.commatsci.2016.12.004

work page doi:10.1016/j.commatsci.2016.12.004 2017

[10] [10]

Zaspel, P., Huang, B., Harbrecht, H., Von Lilienfeld, O.A.: Boosting quan- tum machine learning models with a multilevel combination technique: Pople Diagrams revisited. J. Chem. Theory Comput.15(3), 1546–1559 (2019) https: //doi.org/10.1021/acs.jctc.8b00832

work page doi:10.1021/acs.jctc.8b00832 2019

[11] [11]

Vinod, V., Maity, S., Zaspel, P., Kleinekath¨ ofer, U.: Multifidelity machine learning for molecular excitation energies. J. Chem. Theory Comput.19(21), 7658–7670 (2023) https://doi.org/10.1021/acs.jctc.3c00882

work page doi:10.1021/acs.jctc.3c00882 2023

[12] [12]

Ruth, M., Gerbig, D., Schreiner, P.R.: Machine learning for bridging the gap between density functional theory and coupled cluster energies. J. Chem. Theory and Comp.19(15), 4912–4920 (2023) https://doi.org/10.1021/acs.jctc.3c00274

work page doi:10.1021/acs.jctc.3c00274 2023

[13] [13]

Schreiner, P., Kleinekath¨ ofer, U., Zaspel, P.: Pre- dicting molecular energies of small organic molecules with multi-fidelity methods

Vinod, V., Lyu, D., Ruth, M., R. Schreiner, P., Kleinekath¨ ofer, U., Zaspel, P.: Pre- dicting molecular energies of small organic molecules with multi-fidelity methods. J. Comp. Chem.46(6), 70056 (2025) https://doi.org/10.1002/jcc.70056

work page doi:10.1002/jcc.70056 2025

[14] [14]

https://arxiv.org/abs/2604.00069

Sandonas, L.M., Balcells, D., Bochkarev, A., Cole, J.M., Deringer, V.L., Dobrautz, W., Ehrenhofer, A., Frank, T., Friederich, P., Friedrich, R., George, J., Ghiringhelli, L., Caldas, A.H., Juraskova, V., Kneiding, H., Lysogorskiy, Y., Margraf, J.T., T¨ urk, H., Lilienfeld, A., Todorovi´ c, M., Tkatchenko, A., Rossi, M., 16 Cuniberti, G.: Perspective: Towa...

arXiv 2026

[15] [15]

Dral, P.O., Owens, A., Dral, A., Cs´ anyi, G.: Hierarchical machine learning of potential energy surfaces. J. Chem. Phys.152(20), 204110 (2020) https://doi. org/10.1063/5.0006498

work page doi:10.1063/5.0006498 2020

[16] [16]

Vinod, V., Zaspel, P.: Benchmarking data efficiency in ∆-ML and multifidelity models for quantum chemistry. J. Chem. Phys.163(2), 024134 (2025) https: //doi.org/10.1063/5.0272457

work page doi:10.1063/5.0272457 2025

[17] [17]

Vinod, V., Zaspel, P.: Investigating data hierarchies in multifidelity machine learning for excitation energies. J. Chem. Theory Comput.21(6), 3077–3091 (2025) https://doi.org/10.1021/acs.jctc.4c01491

work page doi:10.1021/acs.jctc.4c01491 2025

[18] [18]

Lyu, D., Vinod, V., Holzenkamp, M., Holtkamp, Y.M., Maity, S., Salazar, C.R., Kleinekath¨ ofer, U., Zaspel, P.: Excitation energy transfer between porphyrin dyes on a clay surface: A study employing multifidelity machine learning. Adv. Theory Simul.8(11), 00271 (2025) https://doi.org/10.1002/adts.202500271

work page doi:10.1002/adts.202500271 2025

[19] [19]

ChemRxiv2026(0504) (2026) https://doi.org/10.26434/chemrxiv.15002714/v1

Maity, S., Vinod, V., Zaspel, P., Kleinekath¨ ofer, U.: ∆-machine learning for LC- DFT-level excitation energies of bacteriochlorophyll molecules in a LH2 complex. ChemRxiv2026(0504) (2026) https://doi.org/10.26434/chemrxiv.15002714/v1

work page doi:10.26434/chemrxiv.15002714/v1 2026

[20] [20]

Acta Numerica13, 147–269 (2004) https://doi.org/10.1017/S0962492904000182

Bungartz, H.-J., Griebel, M.: Sparse grids. Acta Numerica13, 147–269 (2004) https://doi.org/10.1017/S0962492904000182

work page doi:10.1017/s0962492904000182 2004

[21] [21]

Vinod, V., Kleinekath¨ ofer, U., Zaspel, P.: Optimized multifidelity machine learn- ing for quantum chemistry. Mach. Learn.: Sci. Technol.5(1), 015054 (2024) https://doi.org/10.1088/2632-2153/ad2cef

work page doi:10.1088/2632-2153/ad2cef 2024

[22] [22]

Zhang, L., Zhang, S., Owens, A., Yurchenko, S.N., Dral, P.O.: VIB5 database with accurate ab initio quantum chemical molecular potential energy surfaces. Sci. Data9(1), 84 (2022) https://doi.org/10.1038/s41597-022-01185-w

work page doi:10.1038/s41597-022-01185-w 2022

[23] [23]

Vinod, V., Zaspel, P.: QeMFi: A multifidelity dataset of quantum chemical prop- erties of diverse molecules. Sci. Data12(1), 202 (2025) https://doi.org/10.1038/ s41597-024-04247-3

2025

[24] [24]

Zenodo (2024) https://doi.org/10

Vinod, V., Zaspel, P.: QeMFi: A multifidelity dataset of quantum chemical prop- erties of diverse molecules (1.1.0) [dataset]. Zenodo (2024) https://doi.org/10. 5281/zenodo.13925688

2024

[25] [25]

Pinheiro Jr, M., Zhang, S., Dral, P.O., Barbatti, M.: WS22 database, Wigner Sam- pling and geometry interpolation for configurationally diverse molecular datasets. Sci. Data10(1), 95 (2023) https://doi.org/10.1038/s41597-023-01998-3 17

work page doi:10.1038/s41597-023-01998-3 2023

[26] [26]

Westermayr, J., Marquetand, P.: Machine learning for electronically excited states of molecules. Chem. Rev.121(16), 9873–9926 (2020) https://doi.org/10.1021/ acs.chemrev.0c00749

2020

[27] [27]

Dral, P.O., Barbatti, M.: Molecular excited states through a machine learn- ing lens. Nat. Rev. Chem.5(6), 388–405 (2021) https://doi.org/10.1038/ s41570-021-00278-1

2021

[28] [28]

Smith, J.S., Zubatyuk, R., Nebgen, B., Lubbers, N., Barros, K., Roitberg, A.E., Isayev, O., Tretiak, S.: The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules. Sci. Data7(1), 134 (2020) https://doi.org/10.1038/s41597-020-0473-z

work page doi:10.1038/s41597-020-0473-z 2020

[29] [29]

Bartlett, R.J., Musia l, M.: Coupled-cluster theory in quantum chemistry. Rev. Mod. Phys.79, 291–352 (2007) https://doi.org/10.1103/RevModPhys.79.291

work page doi:10.1103/revmodphys.79.291 2007

[30] [30]

Gao, X., Ramezanghorbani, F., Isayev, O., Smith, J.S., Roitberg, A.E.: TorchANI: A free and open source PyTorch-based deep learning implementation of the ani neural network potentials. J. Chem. Inf. Modeling60(7), 3408–3415 (2020) https: //doi.org/10.1021/acs.jcim.0c00451

work page doi:10.1021/acs.jcim.0c00451 2020

[31] [31]

Vinod, V., Zaspel, P.: LFaB: low fidelity as bias for active learning in the chemical configuration space. J. Chem. Theory Comput. (2026) https://doi.org/10.1021/ acs.jctc.6c00009

2026

[32] [32]

Smith, J.S., Nebgen, B., Lubbers, N., Isayev, O., Roitberg, A.E.: Less is more: Sampling chemical space with active learning. J. Chem. Phys.148(24), 241733 (2018) https://doi.org/10.1063/1.5023802

work page doi:10.1063/1.5023802 2018

[33] [33]

Qu, C., Houston, P.L., Conte, R., Nandi, A., Bowman, J.M.: Breaking the coupled cluster barrier for machine-learned potentials of large molecules: The case of 15- atom acetylacetone. J. Phys. Chem. Lett.12(20), 4902–4909 (2021) https://doi. org/10.1021/acs.jpclett.1c01142

work page doi:10.1021/acs.jpclett.1c01142 2021

[34] [34]

Vinod, V., Zaspel, P.: Assessing non-nested configurations of multifidelity machine learning for quantum-chemical properties. Mach. Learn.: Sci. Technol.5(4), 045005 (2024) https://doi.org/10.1088/2632-2153/ad7f25 18

work page doi:10.1088/2632-2153/ad7f25 2024