Interpretable Meta-Learning for Multi-Objective Chemical Search

Antonio Varagnolo; Michael G. Taylor; Nicholas E. Lubbers; Rapha\"el Pestourie; Yulia Pimonova

arxiv: 2606.20497 · v1 · pith:P7CFK5APnew · submitted 2026-06-18 · 💻 cs.CE · cond-mat.mtrl-sci

Interpretable Meta-Learning for Multi-Objective Chemical Search

Antonio Varagnolo , Yulia Pimonova , Michael G. Taylor , Rapha\"el Pestourie , Nicholas E. Lubbers This is my paper

Pith reviewed 2026-06-26 15:06 UTC · model grok-4.3

classification 💻 cs.CE cond-mat.mtrl-sci

keywords meta-learningmulti-objective optimizationmolecular discoverysurrogate modelsspin-crossover complexesefficient global optimizationuncertainty quantificationchemical search

0 comments

The pith

Meta-learned linear surrogates acquire transferable chemical knowledge that adapts rapidly to new multi-objective molecular searches from limited data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that training linear meta-learning models across multiple chemical objectives together with inexpensive auxiliary properties produces surrogates that pick up transferable molecular patterns. These patterns let the models adjust quickly to fresh sets of competing objectives even when only small amounts of new data are available. The approach matters because chemical space exploration must balance several properties at once while staying within strict limits on computation and data. The pipeline places these models inside an Efficient Global Optimization loop and adds a dynamic recalibration step for uncertainty estimates that change as the search moves into the tails of the distribution. On a large-scale hunt for spin-crossover metal-organic complexes the meta-learning version reaches markedly better Pareto fronts than a standard baseline, which trails by 78 percent, while the adaptive calibration step improves results over more than half of the static settings tested.

Core claim

By training across chemical objectives and cheap auxiliary properties, the meta-learned surrogates acquire transferable chemical knowledge that adapts rapidly to new objectives from limited data. For the first time linear meta-learning is deployed in a multi-objective chemical search setting inside an Efficient Global Optimization framework. On spin-crossover metal-organic complexes the baseline performs 78 percent worse in Pareto sense than the meta-learning alternative. An adaptive confidence-tuning algorithm dynamically recalibrates the exploration-exploitation trade-off as the search evolves and dominates over 50 percent of the statically calibrated front.

What carries the argument

Linear meta-learning models with adaptive-confidence uncertainty quantification integrated into an Efficient Global Optimization framework for multi-objective molecular discovery.

If this is right

Meta-learned surrogates adapt to new objectives from limited data without extensive retraining.
Dynamic confidence tuning improves Pareto fronts over static calibration during active search.
Interpretable linear models support simultaneous handling of multiple competing objectives under computational constraints.
The pipeline outperforms non-meta baselines by 78 percent in Pareto performance on spin-crossover complexes.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same auxiliary-property strategy could transfer to other data-scarce search domains that share cheap computable descriptors.
Replacing deep models with linear meta-learners may lower the barrier to interpretable closed-loop discovery systems.
Explicit tests on larger distribution shifts between auxiliary and target objectives would bound the transfer range.
Coupling the surrogates with automated synthesis planners could close the loop from prediction to experiment.

Load-bearing premise

Linear models trained on auxiliary properties will reliably capture transferable chemical knowledge without requiring extensive hyperparameter tuning or suffering from distribution shift when applied to a new target objective in the multi-objective setting.

What would settle it

A new target objective where the meta-learned model shows no faster adaptation or smaller Pareto error than a baseline trained only on target data.

Figures

Figures reproduced from arXiv: 2606.20497 by Antonio Varagnolo, Michael G. Taylor, Nicholas E. Lubbers, Rapha\"el Pestourie, Yulia Pimonova.

**Figure 1.** Figure 1: Core components of the proposed multi-objective chemical search framework. (a) The metalearning algorithm builds new task-specific vectors as a sum of a component parallel to the subspace spanned by support-task coefficients and an orthogonal residual. (b) Uncertainty is estimated with Bayesian bootstrap ensembles. The concentration parameter α controls the sparsity of Dirichlet-distributed weights and c… view at source ↗

**Figure 2.** Figure 2: Functionalization-based molecular generation and graphlet featurization. (a) Example complex from the final Pareto front. (b) Four representative functionalizations applied to the complex in (a). (c) Simplified illustration of graphleft featurization for the complex in (a). Each molecule is encoded with minervachem fingerprinting before being passed to the models. regions by proposing functionalizations of… view at source ↗

**Figure 3.** Figure 3: Pareto-front coverage on QM9. (a) C-metric results for the QM9 experiment, averaged over five independent runs. The random baseline selects candidates uniformly at random from the proposals generated by the search procedure, corresponding to pure exploration of the candidate space. Both the meta-learning and base models dominate the random Pareto front from the earliest stages of the search. (b) Illustrati… view at source ↗

**Figure 4.** Figure 4: Relative improvement of optima in the marginal distribution of individual QM9 properties. Improvement at generation (t), computed as the ratio between the best objective value found by generation (t) and the best value for the same property in the initial molecular set. Curves show averages over five random seeds. For the electronic gap, both model-based algorithms converge early to a suboptimal region, wh… view at source ↗

**Figure 5.** Figure 5: Pareto-front dominance in the spin-crossover search. (a) C-metric values for the spin-crossover (SCO) experiment, reported as mean ± standard deviation over five runs. The meta-learning pipeline rapidly reaches C-metric values of 0.7–0.85 against the base-only pipeline and maintains this advantage across most generations. The reverse comparison remains much lower: the base-only pipeline rarely exceeds C ∼ … view at source ↗

**Figure 6.** Figure 6: Pareto-optimal complexes, chemical-space exploration, and model interpretability in the SCO experiments. (a) Two-dimensional projections of the Pareto front from the largest search campaign, with all evaluated molecules colored by generation. The search targets complexes with ∆ESCO < 5, as predicted by Architector, high solvation energy, high dipole moment, and low HOMO-LUMO gap, with all properties comput… view at source ↗

**Figure 7.** Figure 7: Property prediction error on meta-selected SCO candidates over the course of search. Prediction error on generation T +1, evaluated over four independent runs after fitting the meta-learning and base models on all generations ≤ T. Green regions indicate iterations where the meta-learning model has lower error, red regions indicate iterations where the base model has lower error. Across the 480 predictions … view at source ↗

**Figure 8.** Figure 8: Out-of-distribution robustness and uncertainty calibration for the spin-crossover target across training and testing generations. The top-left panel shows column-normalized RMSE, the top-right panel reports the error difference between base and meta-learning models, and the bottom panels show calibration effects. Positive values indicate overconfidence, whereas negative values indicate underconfidence. 16 … view at source ↗

**Figure 9.** Figure 9: (a) Evolution of the evaluated molecule distribution in a two-dimensional PCA embedding of [PITH_FULL_IMAGE:figures/full_fig_p018_9.png] view at source ↗

**Figure 10.** Figure 10: Meta-learning ablation across SCO objectives, showing parameter-vector geometry, coefficient [PITH_FULL_IMAGE:figures/full_fig_p019_10.png] view at source ↗

**Figure 11.** Figure 11: Column-normalized RMSE for each property. [PITH_FULL_IMAGE:figures/full_fig_p025_11.png] view at source ↗

**Figure 12.** Figure 12: RMSE error difference for each property. [PITH_FULL_IMAGE:figures/full_fig_p026_12.png] view at source ↗

**Figure 13.** Figure 13: Mean Calibration for the linear meta model for each property [PITH_FULL_IMAGE:figures/full_fig_p026_13.png] view at source ↗

**Figure 14.** Figure 14: Median Calibration for linear meta model for each property [PITH_FULL_IMAGE:figures/full_fig_p027_14.png] view at source ↗

read the original abstract

Navigating the vast space of synthetically accessible molecules demands surrogate models that are interpretable and capable of handling multiple competing objectives at the same time. Deep learning approaches struggle to satisfy them under the computational constraints of quantum-level chemistry. Here, we introduce a modular pipeline that combines interpretable linear meta-learning models and adaptive-confidence uncertainty quantification into an Efficient Global Optimization (EGO) framework for multi-objective molecular discovery. For the first time, linear meta-learning is deployed in a multi-objective chemical search setting: by training across chemical objectives and cheap auxiliary properties, the meta-learned surrogates acquire transferable chemical knowledge that adapts rapidly to new objectives from limited data. Evaluated empirically on a real large scale search for spin-crossover metal-organic complexes, the baseline performs 78% worse in Pareto sense than the meta-learning alternative. We also address the calibration challenges inherent to active search. Since optimal candidates typically lie precisely in the distributional tails, standard uncertainty estimates fail. We introduce an adaptive confidence-tuning algorithm that dynamically recalibrates the exploration-exploitation trade-off as the molecular search evolves. Empirically, dynamic confidence tuning further dominates over 50% of the statically calibrated front.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Linear meta-learning in multi-objective molecular search is a sensible direction but the 78% Pareto claim rests on thin evidence from the abstract alone.

read the letter

The paper's main move is putting linear meta-learning into an EGO pipeline for multi-objective chemical search, training on auxiliary properties to get fast adaptation to new targets like spin-crossover complexes, plus an adaptive confidence tuner for tail events. That combination has not been tried in this exact setting before, and the focus on interpretability plus data efficiency is a practical response to the limits of deep models under quantum constraints.

They get credit for framing the problem around real constraints in materials discovery and for trying to handle calibration when optima sit in the tails. The modular pipeline idea is straightforward and could be useful to groups already using surrogate-based search.

The soft spots are more basic. The abstract reports a 78% Pareto improvement and 50% dominance from dynamic tuning but gives no dataset size, baseline code or implementation details, exact metric definition, or error analysis. Without those, the numbers are hard to judge. The stress-test concern about distribution shift between auxiliary properties and the target objective also lands: linear models have limited capacity, and if the feature correlations do not carry over, few-shot adaptation will not deliver the claimed edge. No ablations are mentioned that would separate meta-learning from plain multi-task linear regression.

This is for computational chemists or materials people who already run EGO-style searches and want something more interpretable than neural nets. A reader who needs reproducible methods or strong baselines will find the current write-up frustrating.

I would send it to peer review. The idea is worth a proper methods check even if the current evidence looks preliminary.

Referee Report

3 major / 1 minor

Summary. The paper introduces a modular pipeline combining interpretable linear meta-learning models with adaptive-confidence uncertainty quantification inside an Efficient Global Optimization (EGO) framework for multi-objective molecular discovery. It claims that training across chemical objectives and cheap auxiliary properties allows the meta-learned linear surrogates to acquire transferable chemical knowledge that enables rapid adaptation to new objectives from limited data. On a large-scale search for spin-crossover metal-organic complexes, the baseline is reported to perform 78% worse in a Pareto sense than the meta-learning approach, while dynamic confidence tuning is said to dominate over 50% of the statically calibrated Pareto front.

Significance. If the empirical claims hold after proper validation, the work would demonstrate that linear meta-learning can deliver interpretable, data-efficient surrogates for multi-objective chemical search where deep learning is computationally prohibitive. The combination of meta-learning with adaptive uncertainty calibration addresses a recognized challenge in active search (tail calibration), and the emphasis on linear models offers interpretability advantages. However, the absence of ablations and distribution-shift checks limits the ability to credit the claimed gains specifically to meta-learning rather than multi-task linear regression.

major comments (3)

[Abstract] Abstract: The headline claim of a 78% Pareto improvement and 50% dominance of dynamic tuning supplies no details on dataset size, the precise Pareto metric (e.g., hypervolume, coverage), baseline implementation, number of independent runs, or statistical testing. Without these, the central empirical result cannot be evaluated for robustness or reproducibility.
[Abstract] Abstract / Methods (assumed §3): No ablation is presented that isolates meta-learning from plain multi-task linear regression on the same auxiliary properties. The reported advantage could therefore arise from multi-task training alone rather than the meta-learning mechanism, undermining attribution of the 78% gain to transferable knowledge acquisition.
[Abstract] Abstract: The assumption that auxiliary properties lie in the same distribution as the target spin-crossover objective (or induce comparable linear correlations) is stated without validation or feature-space overlap analysis. Linear models lack capacity for nonlinear interactions; if distribution shift exists, few-shot adaptation cannot reliably recover the claimed performance, making this a load-bearing untested premise for the transferability narrative.

minor comments (1)

[Abstract] The abstract refers to 'the first time' linear meta-learning is deployed in this setting; a brief literature comparison table would clarify novelty relative to prior multi-task or meta-learning work in cheminformatics.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive and detailed feedback. We address each major comment point by point below, with proposed revisions where appropriate to strengthen the manuscript.

read point-by-point responses

Referee: [Abstract] Abstract: The headline claim of a 78% Pareto improvement and 50% dominance of dynamic tuning supplies no details on dataset size, the precise Pareto metric (e.g., hypervolume, coverage), baseline implementation, number of independent runs, or statistical testing. Without these, the central empirical result cannot be evaluated for robustness or reproducibility.

Authors: We agree that the abstract would benefit from greater specificity to support immediate evaluation of the claims. The full manuscript reports these details in the Experiments and Results sections (including the spin-crossover dataset, hypervolume as the Pareto metric, baseline implementation, multiple independent runs, and statistical testing). To improve self-containment of the abstract, we will revise it to concisely incorporate the Pareto metric, number of runs, and mention of statistical significance. revision: yes
Referee: [Abstract] Abstract / Methods (assumed §3): No ablation is presented that isolates meta-learning from plain multi-task linear regression on the same auxiliary properties. The reported advantage could therefore arise from multi-task training alone rather than the meta-learning mechanism, undermining attribution of the 78% gain to transferable knowledge acquisition.

Authors: This is a fair observation. While the method is explicitly framed as meta-learning (learning a transferable initialization across objectives for rapid few-shot adaptation), an explicit ablation against standard multi-task linear regression would strengthen attribution. We will add this ablation in the revised manuscript, directly comparing the meta-learned surrogates to a multi-task linear regression baseline trained on the same auxiliary properties. revision: yes
Referee: [Abstract] Abstract: The assumption that auxiliary properties lie in the same distribution as the target spin-crossover objective (or induce comparable linear correlations) is stated without validation or feature-space overlap analysis. Linear models lack capacity for nonlinear interactions; if distribution shift exists, few-shot adaptation cannot reliably recover the claimed performance, making this a load-bearing untested premise for the transferability narrative.

Authors: The auxiliary properties were chosen on the basis of known chemical correlations with spin-crossover behavior and share the identical molecular descriptor feature space. We acknowledge that an explicit validation of distributional overlap would reinforce the premise. In the revision we will add a feature-space overlap and correlation analysis between auxiliary and target properties to support the transferability claims. revision: yes

Circularity Check

0 steps flagged

No significant circularity; empirical results stand independently

full rationale

The paper's central claims rest on empirical comparisons of meta-learned linear surrogates versus baselines in a multi-objective molecular search for spin-crossover complexes, with performance metrics (e.g., 78% Pareto improvement) derived from held-out evaluations rather than any self-referential definitions, fitted inputs renamed as predictions, or load-bearing self-citations. No equations are presented that reduce the claimed transferable knowledge or adaptation to inputs by construction; the pipeline description in the abstract frames results as data-driven outcomes without invoking uniqueness theorems or ansatzes from prior author work. The derivation chain is self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; no explicit free parameters, axioms, or invented entities are stated or derivable. Full manuscript would be required to audit modeling assumptions such as linearity of the surrogate or the form of the adaptive confidence update.

pith-pipeline@v0.9.1-grok · 5755 in / 1069 out tokens · 14490 ms · 2026-06-26T15:06:38.106428+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

61 extracted references · 7 canonical work pages

[1]

The chemical space project

Jean-Louis Reymond. “The chemical space project”. In:Accounts of chemical research48.3 (2015), pp. 722–730

2015
[2]

Accelerating the discovery of materials for clean energy in the era of smart automation

Daniel P Tabor et al. “Accelerating the discovery of materials for clean energy in the era of smart automation”. In:Nature reviews materials3.5 (2018), pp. 5–20

2018
[3]

Deep learning for molecular design—a review of the state of the art

Daniel C Elton et al. “Deep learning for molecular design—a review of the state of the art”. In:Molecular Systems Design & Engineering4.4 (2019), pp. 828–849

2019
[4]

Exploring chemical compound space with quantum-based machine learning

O Anatole von Lilienfeld, Klaus-Robert Müller, and Alexandre Tkatchenko. “Exploring chemical compound space with quantum-based machine learning”. In:Nature Reviews Chemistry4.7 (2020), pp. 347–358

2020
[5]

Hierarchical modeling of molecular energies using a deep neural network

Nicholas Lubbers, Justin S. Smith, and Kipton Barros. “Hierarchical modeling of molecular energies using a deep neural network”. In:The Journal of Chemical Physics148.24 (2018), p. 241715.DOI: 10.1063/1.5011181

work page doi:10.1063/1.5011181 2018
[6]

Chemprop: a machine learning package for chemical property prediction

Esther Heid et al. “Chemprop: a machine learning package for chemical property prediction”. In:Journal of chemical information and modeling64.1 (2023), pp. 9–17

2023
[7]

Designing in the face of uncertainty: exploiting electronic structure and machine learning models for discovery in inorganic chemistry

Jon Paul Janet et al. “Designing in the face of uncertainty: exploiting electronic structure and machine learning models for discovery in inorganic chemistry”. In:Inorganic chemistry58.16 (2019), pp. 10592– 10606

2019
[8]

Machine learning in chemical reaction space

Sina Stocker et al. “Machine learning in chemical reaction space”. In:Nature communications11.1 (2020), p. 5505

2020
[9]

Utilizing deep learning to explore chemical space for drug lead optimization

Rajkumar Chakraborty and Yasha Hasija. “Utilizing deep learning to explore chemical space for drug lead optimization”. In:Expert Systems with Applications229 (2023), p. 120592. 21

2023
[10]

Accurate multiobjective design in a space of millions of transition metal complexes with neural-network-driven efficient global optimization

Jon Paul Janet et al. “Accurate multiobjective design in a space of millions of transition metal complexes with neural-network-driven efficient global optimization”. In:ACS central science6.4 (2020), pp. 513– 524

2020
[11]

Automatic chemical design using a data-driven continuous representa- tion of molecules

Rafael Gómez-Bombarelli et al. “Automatic chemical design using a data-driven continuous representa- tion of molecules”. In:ACS central science4.2 (2018), pp. 268–276

2018
[12]

Davide Rigoni, Nicolò Navarin, and Alessandro Sperduti.Conditional Constrained Graph Variational Autoencoders for Molecule Design. 2020. arXiv: 2009.00725 [cs.LG].URL: https://arxiv.org/ abs/2009.00725

arXiv 2020
[13]

Clement Vignac et al.DiGress: Discrete Denoising diffusion for graph generation. 2023. arXiv: 2209. 14734 [cs.LG].URL:https://arxiv.org/abs/2209.14734

arXiv 2023
[14]

Gang Liu et al.Graph Diffusion Transformers for Multi-Conditional Molecular Generation. 2024. arXiv: 2401.13858 [cs.LG].URL:https://arxiv.org/abs/2401.13858

arXiv 2024
[15]

Generative models as an emerging paradigm in the chemical sciences

Dylan M Anstine and Olexandr Isayev. “Generative models as an emerging paradigm in the chemical sciences”. In:Journal of the American Chemical Society145.16 (2023), pp. 8736–8750

2023
[16]

Deep generative molecular design reshapes drug discovery

Xiangxiang Zeng et al. “Deep generative molecular design reshapes drug discovery”. In:Cell Reports Medicine3.12 (2022)

2022
[17]

A feature-aligned diffusion model for controllable generation of 3D drug-like molecules

Hao Lu et al. “A feature-aligned diffusion model for controllable generation of 3D drug-like molecules”. In:Digital Discovery(2026)

2026
[18]

Harnessing generative AI for efficient organic materials discovery in low-data regimes

Jun Hyeong Kim et al. “Harnessing generative AI for efficient organic materials discovery in low-data regimes”. In:Digital Discovery(2026)

2026
[19]

Foundation models for discovery and exploration in chemical space

Alexius Wadell et al. “Foundation models for discovery and exploration in chemical space”. In:arXiv preprint arXiv:2510.18900(2025)

Pith/arXiv arXiv 2025
[20]

MolE: a molecular foundation model for drug discovery

Oscar Méndez-Lucio, Christos Nicolaou, and Berton Earnshaw. “MolE: a molecular foundation model for drug discovery”. In:arXiv preprint arXiv:2211.02657(2022)

arXiv 2022
[21]

Active learning enables extrapolation in molecular generative models

Evan R Antoniuk et al. “Active learning enables extrapolation in molecular generative models”. In:arXiv preprint arXiv:2501.02059(2025)

arXiv 2025
[22]

Computer-aided multi-objective optimization in small molecule discovery

Jenna C Fromer and Connor W Coley. “Computer-aided multi-objective optimization in small molecule discovery”. In:Patterns4.2 (2023)

2023
[23]

Pareto optimization to accelerate multi-objective virtual screening

Jenna C Fromer, David E Graff, and Connor W Coley. “Pareto optimization to accelerate multi-objective virtual screening”. In:Digital Discovery3.3 (2024), pp. 467–481

2024
[24]

Examining multi-objective deep reinforcement learning frameworks for molecular design

Aws Al-Jumaily et al. “Examining multi-objective deep reinforcement learning frameworks for molecular design”. In:Biosystems232 (2023), p. 104989

2023
[25]

Linear graphlet models for accurate and interpretable cheminformatics

Michael Tynes et al. “Linear graphlet models for accurate and interpretable cheminformatics”. In:Digital Discovery3.10 (2024), pp. –.DOI:10.1039/D4DD00089G

work page doi:10.1039/d4dd00089g 2024
[26]

Meta-Learning Linear Models for Molecular Property Prediction

Yulia Pimonova et al. “Meta-Learning Linear Models for Molecular Property Prediction”. In:arXiv preprint arXiv:2509.13527(2025). 22

arXiv 2025
[27]

A survey of deep meta-learning

Mike Huisman, Jan N Van Rijn, and Aske Plaat. “A survey of deep meta-learning”. In:Artificial Intelligence Review54.6 (2021), pp. 4483–4541

2021
[28]

Advances and challenges in meta-learning: A technical review

Anna Vettoruzzo et al. “Advances and challenges in meta-learning: A technical review”. In:IEEE transactions on pattern analysis and machine intelligence46.7 (2024), pp. 4763–4779

2024
[29]

A comprehensive survey on meta-learning: Applications, advances, and challenges

Jingyao Wang. “A comprehensive survey on meta-learning: Applications, advances, and challenges”. In: Authorea Preprints(2024)

2024
[30]

Learning together: Towards foundation models for machine learning interatomic potentials with meta-learning

Alice EA Allen et al. “Learning together: Towards foundation models for machine learning interatomic potentials with meta-learning”. In:npj computational materials10.1 (2024), p. 154

2024
[31]

Meta-learning accelerates multi-objective Bayesian optimization of chemical reaction: a monoacylation case study

Guihua Luo et al. “Meta-learning accelerates multi-objective Bayesian optimization of chemical reaction: a monoacylation case study”. In:Journal of Industrial and Engineering Chemistry(2026)

2026
[32]

Meta-learning innovates chemical kinetics: An efficient approach for surrogate model construction

Chenyue Tao et al. “Meta-learning innovates chemical kinetics: An efficient approach for surrogate model construction”. In:Proceedings of the Combustion Institute41 (2025), p. 105860

2025
[33]

End-to-end sequence-structure-function meta-learning predicts genome-wide chemical- protein interactions for dark proteins

Tian Cai et al. “End-to-end sequence-structure-function meta-learning predicts genome-wide chemical- protein interactions for dark proteins”. In:PLOS Computational Biology19.1 (2023), e1010851

2023
[34]

Harnessing surrogate models for data-efficient predictive chemistry: descriptors vs. learned hidden representations

Guanming Chen and Thijs Stuyver. “Harnessing surrogate models for data-efficient predictive chemistry: descriptors vs. learned hidden representations”. In:Digital Discovery4.11 (2025), pp. 3227–3237

2025
[35]

Quantum chemistry structures and properties of 134 kilo molecules

Raghunathan Ramakrishnan et al. “Quantum chemistry structures and properties of 134 kilo molecules”. In:Scientific data1.1 (2014), pp. 1–7

2014
[36]

The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules

Justin S. Smith et al. “The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules”. In:Scientific Data7.1 (2020), p. 134.DOI: 10.1038/s41597-020-0473-z

work page doi:10.1038/s41597-020-0473-z 2020
[37]

Quantitative correlation of physical and chemical properties with chemical structure: utility for prediction

Alan R Katritzky et al. “Quantitative correlation of physical and chemical properties with chemical structure: utility for prediction”. In:Chemical reviews110.10 (2010), pp. 5714–5789

2010
[38]

Less is more: Sampling chemical space with active learning

Justin S Smith et al. “Less is more: Sampling chemical space with active learning”. In:The Journal of chemical physics148.24 (2018)

2018
[39]

Uncertainty-driven dynamics for active learning of interatomic potentials

Maksim Kulichenko et al. “Uncertainty-driven dynamics for active learning of interatomic potentials”. In:Nature computational science3.3 (2023), pp. 230–239

2023
[40]

A quantitative uncertainty metric controls error in neural network-driven chemical discovery

Jon Paul Janet et al. “A quantitative uncertainty metric controls error in neural network-driven chemical discovery”. In:Chemical science10.34 (2019), pp. 7913–7922

2019
[41]

Efficient Global Optimization of Expensive Black-Box Functions

Donald R. Jones, Matthias Schonlau, and William J. Welch. “Efficient Global Optimization of Expensive Black-Box Functions”. In:Journal of Global Optimization. V ol. 13. 4. 1998, pp. 455–492.DOI: 10. 1023/A:1008306431147

1998
[42]

John Wiley & Sons, 2013

Malcolm A Halcrow.Spin-crossover materials: properties and applications. John Wiley & Sons, 2013

2013
[43]

How to Switch Spin-Crossover Metal Complexes at Constant Room Tempera- ture

Marat M Khusniyarov. “How to Switch Spin-Crossover Metal Complexes at Constant Room Tempera- ture”. In:Chemistry–A European Journal22.43 (2016), pp. 15178–15191

2016
[44]

Accelerating chemical discovery with machine learning: simulated evolution of spin crossover complexes with an artificial neural network

Jon Paul Janet, Lydia Chan, and Heather J Kulik. “Accelerating chemical discovery with machine learning: simulated evolution of spin crossover complexes with an artificial neural network”. In:The journal of physical chemistry letters9.5 (2018), pp. 1064–1071. 23

2018
[45]

Machine Learning Prediction of the Experimental Transition Temperature of Fe (II) Spin-Crossover Complexes

Vyshnavi Vennelakanti et al. “Machine Learning Prediction of the Experimental Transition Temperature of Fe (II) Spin-Crossover Complexes”. In:The Journal of Physical Chemistry A128.1 (2023), pp. 204– 216

2023
[46]

http : / / www

Greg Landrum et al.RDKit: Open-source cheminformatics. http : / / www . rdkit . org. Accessed: 2026-05-04

2026
[47]

The Annals of Statistics , author =

Donald B. Rubin. “The Bayesian Bootstrap”. In:The Annals of Statistics9.1 (1981), pp. 130–134.DOI: 10.1214/aos/1176345338

work page doi:10.1214/aos/1176345338 1981
[48]

Statistical Improvement Criteria for Use in Multiobjective Design Optimization

A. J. Keane. “Statistical Improvement Criteria for Use in Multiobjective Design Optimization”. In:AIAA Journal44.4 (2006), pp. 879–891.DOI:10.2514/1.16875

work page doi:10.2514/1.16875 2006
[49]

Recent Advances in Surrogate-Based Optimization

Alexander I. J. Forrester and Andy J. Keane. “Recent Advances in Surrogate-Based Optimization”. In: Progress in Aerospace Sciences45.1-3 (2009), pp. 50–79

2009
[50]

Stochastic Multi-Objective Optimization: A Survey on Non-Scalarizing Methods

Walter J. Gutjahr and Alois Pichler. “Stochastic Multi-Objective Optimization: A Survey on Non-Scalarizing Methods”. In:Annals of Operations Research236.2 (2016), pp. 475–499

2016
[51]

Predictive Entropy Search for Multi-objective Bayesian Optimization

Daniel Hernández-Lobato et al. “Predictive Entropy Search for Multi-objective Bayesian Optimization”. In:Proceedings of the 33rd International Conference on Machine Learning (ICML). V ol. 48. Proceedings of Machine Learning Research. 2016, pp. 1492–1501

2016
[52]

A Review of Multi-Objective Optimization: Methods and Its Applications

Nyoman Gunantara. “A Review of Multi-Objective Optimization: Methods and Its Applications”. In: Cogent Engineering5.1 (Jan. 1, 2018). Ed. by Qingsong Ai, p. 1502242.DOI: 10.1080/23311916. 2018.1502242

work page doi:10.1080/23311916 2018
[53]

The Cambridge structural database

Colin R Groom et al. “The Cambridge structural database”. In:Structural Science72.2 (2016), pp. 171– 179

2016
[54]

Architector for high-throughput cross-periodic table 3D complex building

Michael G. Taylor et al. “Architector for high-throughput cross-periodic table 3D complex building”. In: Nature Communications14.1 (2023), p. 2786.DOI:10.1038/s41467-023-38169-2

work page doi:10.1038/s41467-023-38169-2 2023
[55]

Los Alamos National Laboratory (LANL) Innovation Team.Architector: High -throughput 3D metal complex builder.https://github.com/lanl/Architector. 2023

2023
[56]

GFN2-xTB—An accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions

Christoph Bannwarth, Sebastian Ehlert, and Stefan Grimme. “GFN2-xTB—An accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions”. In:Journal of chemical theory and computation15.3 (2019), pp. 1652–1671

2019
[57]

Extended tight-binding quantum chemistry methods

Christoph Bannwarth et al. “Extended tight-binding quantum chemistry methods”. In:Wiley Interdisci- plinary Reviews: Computational Molecular Science11.2 (2021), e1493

2021
[58]

Comparison of multiobjective evolutionary algo- rithms: Empirical results

Eckart Zitzler, Kalyanmoy Deb, and Lothar Thiele. “Comparison of multiobjective evolutionary algo- rithms: Empirical results”. In:Evolutionary computation8.2 (2000), pp. 173–195

2000
[59]

The origins of the Gini index: extracts from Variabilità e Mutabilità (1912) by Corrado Gini

Lidia Ceriani and Paolo Verme. “The origins of the Gini index: extracts from Variabilità e Mutabilità (1912) by Corrado Gini”. In:The Journal of Economic Inequality10.3 (2012), pp. 421–443

1912
[60]

Measurement of diversity

Edward H Simpson. “Measurement of diversity”. In:nature163.4148 (1949), pp. 688–688

1949
[61]

Princeton university press, 2015

Richard E Bellman.Adaptive control processes: a guided tour. Princeton university press, 2015. 24 7 Supplementary Information Results reported in Fig. 3, Fig. 4 are averages and standard deviations for 10 QM9 experiment runs. Fig. 5 and Fig. 7 are ensembles of 4 runs for the SCO experiment experiment. 7.1 Architector Experiment Specifications Pool Ligands...

2015

[1] [1]

The chemical space project

Jean-Louis Reymond. “The chemical space project”. In:Accounts of chemical research48.3 (2015), pp. 722–730

2015

[2] [2]

Accelerating the discovery of materials for clean energy in the era of smart automation

Daniel P Tabor et al. “Accelerating the discovery of materials for clean energy in the era of smart automation”. In:Nature reviews materials3.5 (2018), pp. 5–20

2018

[3] [3]

Deep learning for molecular design—a review of the state of the art

Daniel C Elton et al. “Deep learning for molecular design—a review of the state of the art”. In:Molecular Systems Design & Engineering4.4 (2019), pp. 828–849

2019

[4] [4]

Exploring chemical compound space with quantum-based machine learning

O Anatole von Lilienfeld, Klaus-Robert Müller, and Alexandre Tkatchenko. “Exploring chemical compound space with quantum-based machine learning”. In:Nature Reviews Chemistry4.7 (2020), pp. 347–358

2020

[5] [5]

Hierarchical modeling of molecular energies using a deep neural network

Nicholas Lubbers, Justin S. Smith, and Kipton Barros. “Hierarchical modeling of molecular energies using a deep neural network”. In:The Journal of Chemical Physics148.24 (2018), p. 241715.DOI: 10.1063/1.5011181

work page doi:10.1063/1.5011181 2018

[6] [6]

Chemprop: a machine learning package for chemical property prediction

Esther Heid et al. “Chemprop: a machine learning package for chemical property prediction”. In:Journal of chemical information and modeling64.1 (2023), pp. 9–17

2023

[7] [7]

Designing in the face of uncertainty: exploiting electronic structure and machine learning models for discovery in inorganic chemistry

Jon Paul Janet et al. “Designing in the face of uncertainty: exploiting electronic structure and machine learning models for discovery in inorganic chemistry”. In:Inorganic chemistry58.16 (2019), pp. 10592– 10606

2019

[8] [8]

Machine learning in chemical reaction space

Sina Stocker et al. “Machine learning in chemical reaction space”. In:Nature communications11.1 (2020), p. 5505

2020

[9] [9]

Utilizing deep learning to explore chemical space for drug lead optimization

Rajkumar Chakraborty and Yasha Hasija. “Utilizing deep learning to explore chemical space for drug lead optimization”. In:Expert Systems with Applications229 (2023), p. 120592. 21

2023

[10] [10]

Accurate multiobjective design in a space of millions of transition metal complexes with neural-network-driven efficient global optimization

Jon Paul Janet et al. “Accurate multiobjective design in a space of millions of transition metal complexes with neural-network-driven efficient global optimization”. In:ACS central science6.4 (2020), pp. 513– 524

2020

[11] [11]

Automatic chemical design using a data-driven continuous representa- tion of molecules

Rafael Gómez-Bombarelli et al. “Automatic chemical design using a data-driven continuous representa- tion of molecules”. In:ACS central science4.2 (2018), pp. 268–276

2018

[12] [12]

Davide Rigoni, Nicolò Navarin, and Alessandro Sperduti.Conditional Constrained Graph Variational Autoencoders for Molecule Design. 2020. arXiv: 2009.00725 [cs.LG].URL: https://arxiv.org/ abs/2009.00725

arXiv 2020

[13] [13]

Clement Vignac et al.DiGress: Discrete Denoising diffusion for graph generation. 2023. arXiv: 2209. 14734 [cs.LG].URL:https://arxiv.org/abs/2209.14734

arXiv 2023

[14] [14]

Gang Liu et al.Graph Diffusion Transformers for Multi-Conditional Molecular Generation. 2024. arXiv: 2401.13858 [cs.LG].URL:https://arxiv.org/abs/2401.13858

arXiv 2024

[15] [15]

Generative models as an emerging paradigm in the chemical sciences

Dylan M Anstine and Olexandr Isayev. “Generative models as an emerging paradigm in the chemical sciences”. In:Journal of the American Chemical Society145.16 (2023), pp. 8736–8750

2023

[16] [16]

Deep generative molecular design reshapes drug discovery

Xiangxiang Zeng et al. “Deep generative molecular design reshapes drug discovery”. In:Cell Reports Medicine3.12 (2022)

2022

[17] [17]

A feature-aligned diffusion model for controllable generation of 3D drug-like molecules

Hao Lu et al. “A feature-aligned diffusion model for controllable generation of 3D drug-like molecules”. In:Digital Discovery(2026)

2026

[18] [18]

Harnessing generative AI for efficient organic materials discovery in low-data regimes

Jun Hyeong Kim et al. “Harnessing generative AI for efficient organic materials discovery in low-data regimes”. In:Digital Discovery(2026)

2026

[19] [19]

Foundation models for discovery and exploration in chemical space

Alexius Wadell et al. “Foundation models for discovery and exploration in chemical space”. In:arXiv preprint arXiv:2510.18900(2025)

Pith/arXiv arXiv 2025

[20] [20]

MolE: a molecular foundation model for drug discovery

Oscar Méndez-Lucio, Christos Nicolaou, and Berton Earnshaw. “MolE: a molecular foundation model for drug discovery”. In:arXiv preprint arXiv:2211.02657(2022)

arXiv 2022

[21] [21]

Active learning enables extrapolation in molecular generative models

Evan R Antoniuk et al. “Active learning enables extrapolation in molecular generative models”. In:arXiv preprint arXiv:2501.02059(2025)

arXiv 2025

[22] [22]

Computer-aided multi-objective optimization in small molecule discovery

Jenna C Fromer and Connor W Coley. “Computer-aided multi-objective optimization in small molecule discovery”. In:Patterns4.2 (2023)

2023

[23] [23]

Pareto optimization to accelerate multi-objective virtual screening

Jenna C Fromer, David E Graff, and Connor W Coley. “Pareto optimization to accelerate multi-objective virtual screening”. In:Digital Discovery3.3 (2024), pp. 467–481

2024

[24] [24]

Examining multi-objective deep reinforcement learning frameworks for molecular design

Aws Al-Jumaily et al. “Examining multi-objective deep reinforcement learning frameworks for molecular design”. In:Biosystems232 (2023), p. 104989

2023

[25] [25]

Linear graphlet models for accurate and interpretable cheminformatics

Michael Tynes et al. “Linear graphlet models for accurate and interpretable cheminformatics”. In:Digital Discovery3.10 (2024), pp. –.DOI:10.1039/D4DD00089G

work page doi:10.1039/d4dd00089g 2024

[26] [26]

Meta-Learning Linear Models for Molecular Property Prediction

Yulia Pimonova et al. “Meta-Learning Linear Models for Molecular Property Prediction”. In:arXiv preprint arXiv:2509.13527(2025). 22

arXiv 2025

[27] [27]

A survey of deep meta-learning

Mike Huisman, Jan N Van Rijn, and Aske Plaat. “A survey of deep meta-learning”. In:Artificial Intelligence Review54.6 (2021), pp. 4483–4541

2021

[28] [28]

Advances and challenges in meta-learning: A technical review

Anna Vettoruzzo et al. “Advances and challenges in meta-learning: A technical review”. In:IEEE transactions on pattern analysis and machine intelligence46.7 (2024), pp. 4763–4779

2024

[29] [29]

A comprehensive survey on meta-learning: Applications, advances, and challenges

Jingyao Wang. “A comprehensive survey on meta-learning: Applications, advances, and challenges”. In: Authorea Preprints(2024)

2024

[30] [30]

Learning together: Towards foundation models for machine learning interatomic potentials with meta-learning

Alice EA Allen et al. “Learning together: Towards foundation models for machine learning interatomic potentials with meta-learning”. In:npj computational materials10.1 (2024), p. 154

2024

[31] [31]

Meta-learning accelerates multi-objective Bayesian optimization of chemical reaction: a monoacylation case study

Guihua Luo et al. “Meta-learning accelerates multi-objective Bayesian optimization of chemical reaction: a monoacylation case study”. In:Journal of Industrial and Engineering Chemistry(2026)

2026

[32] [32]

Meta-learning innovates chemical kinetics: An efficient approach for surrogate model construction

Chenyue Tao et al. “Meta-learning innovates chemical kinetics: An efficient approach for surrogate model construction”. In:Proceedings of the Combustion Institute41 (2025), p. 105860

2025

[33] [33]

End-to-end sequence-structure-function meta-learning predicts genome-wide chemical- protein interactions for dark proteins

Tian Cai et al. “End-to-end sequence-structure-function meta-learning predicts genome-wide chemical- protein interactions for dark proteins”. In:PLOS Computational Biology19.1 (2023), e1010851

2023

[34] [34]

Harnessing surrogate models for data-efficient predictive chemistry: descriptors vs. learned hidden representations

Guanming Chen and Thijs Stuyver. “Harnessing surrogate models for data-efficient predictive chemistry: descriptors vs. learned hidden representations”. In:Digital Discovery4.11 (2025), pp. 3227–3237

2025

[35] [35]

Quantum chemistry structures and properties of 134 kilo molecules

Raghunathan Ramakrishnan et al. “Quantum chemistry structures and properties of 134 kilo molecules”. In:Scientific data1.1 (2014), pp. 1–7

2014

[36] [36]

The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules

Justin S. Smith et al. “The ANI-1ccx and ANI-1x data sets, coupled-cluster and density functional theory properties for molecules”. In:Scientific Data7.1 (2020), p. 134.DOI: 10.1038/s41597-020-0473-z

work page doi:10.1038/s41597-020-0473-z 2020

[37] [37]

Quantitative correlation of physical and chemical properties with chemical structure: utility for prediction

Alan R Katritzky et al. “Quantitative correlation of physical and chemical properties with chemical structure: utility for prediction”. In:Chemical reviews110.10 (2010), pp. 5714–5789

2010

[38] [38]

Less is more: Sampling chemical space with active learning

Justin S Smith et al. “Less is more: Sampling chemical space with active learning”. In:The Journal of chemical physics148.24 (2018)

2018

[39] [39]

Uncertainty-driven dynamics for active learning of interatomic potentials

Maksim Kulichenko et al. “Uncertainty-driven dynamics for active learning of interatomic potentials”. In:Nature computational science3.3 (2023), pp. 230–239

2023

[40] [40]

A quantitative uncertainty metric controls error in neural network-driven chemical discovery

Jon Paul Janet et al. “A quantitative uncertainty metric controls error in neural network-driven chemical discovery”. In:Chemical science10.34 (2019), pp. 7913–7922

2019

[41] [41]

Efficient Global Optimization of Expensive Black-Box Functions

Donald R. Jones, Matthias Schonlau, and William J. Welch. “Efficient Global Optimization of Expensive Black-Box Functions”. In:Journal of Global Optimization. V ol. 13. 4. 1998, pp. 455–492.DOI: 10. 1023/A:1008306431147

1998

[42] [42]

John Wiley & Sons, 2013

Malcolm A Halcrow.Spin-crossover materials: properties and applications. John Wiley & Sons, 2013

2013

[43] [43]

How to Switch Spin-Crossover Metal Complexes at Constant Room Tempera- ture

Marat M Khusniyarov. “How to Switch Spin-Crossover Metal Complexes at Constant Room Tempera- ture”. In:Chemistry–A European Journal22.43 (2016), pp. 15178–15191

2016

[44] [44]

Accelerating chemical discovery with machine learning: simulated evolution of spin crossover complexes with an artificial neural network

Jon Paul Janet, Lydia Chan, and Heather J Kulik. “Accelerating chemical discovery with machine learning: simulated evolution of spin crossover complexes with an artificial neural network”. In:The journal of physical chemistry letters9.5 (2018), pp. 1064–1071. 23

2018

[45] [45]

Machine Learning Prediction of the Experimental Transition Temperature of Fe (II) Spin-Crossover Complexes

Vyshnavi Vennelakanti et al. “Machine Learning Prediction of the Experimental Transition Temperature of Fe (II) Spin-Crossover Complexes”. In:The Journal of Physical Chemistry A128.1 (2023), pp. 204– 216

2023

[46] [46]

http : / / www

Greg Landrum et al.RDKit: Open-source cheminformatics. http : / / www . rdkit . org. Accessed: 2026-05-04

2026

[47] [47]

The Annals of Statistics , author =

Donald B. Rubin. “The Bayesian Bootstrap”. In:The Annals of Statistics9.1 (1981), pp. 130–134.DOI: 10.1214/aos/1176345338

work page doi:10.1214/aos/1176345338 1981

[48] [48]

Statistical Improvement Criteria for Use in Multiobjective Design Optimization

A. J. Keane. “Statistical Improvement Criteria for Use in Multiobjective Design Optimization”. In:AIAA Journal44.4 (2006), pp. 879–891.DOI:10.2514/1.16875

work page doi:10.2514/1.16875 2006

[49] [49]

Recent Advances in Surrogate-Based Optimization

Alexander I. J. Forrester and Andy J. Keane. “Recent Advances in Surrogate-Based Optimization”. In: Progress in Aerospace Sciences45.1-3 (2009), pp. 50–79

2009

[50] [50]

Stochastic Multi-Objective Optimization: A Survey on Non-Scalarizing Methods

Walter J. Gutjahr and Alois Pichler. “Stochastic Multi-Objective Optimization: A Survey on Non-Scalarizing Methods”. In:Annals of Operations Research236.2 (2016), pp. 475–499

2016

[51] [51]

Predictive Entropy Search for Multi-objective Bayesian Optimization

Daniel Hernández-Lobato et al. “Predictive Entropy Search for Multi-objective Bayesian Optimization”. In:Proceedings of the 33rd International Conference on Machine Learning (ICML). V ol. 48. Proceedings of Machine Learning Research. 2016, pp. 1492–1501

2016

[52] [52]

A Review of Multi-Objective Optimization: Methods and Its Applications

Nyoman Gunantara. “A Review of Multi-Objective Optimization: Methods and Its Applications”. In: Cogent Engineering5.1 (Jan. 1, 2018). Ed. by Qingsong Ai, p. 1502242.DOI: 10.1080/23311916. 2018.1502242

work page doi:10.1080/23311916 2018

[53] [53]

The Cambridge structural database

Colin R Groom et al. “The Cambridge structural database”. In:Structural Science72.2 (2016), pp. 171– 179

2016

[54] [54]

Architector for high-throughput cross-periodic table 3D complex building

Michael G. Taylor et al. “Architector for high-throughput cross-periodic table 3D complex building”. In: Nature Communications14.1 (2023), p. 2786.DOI:10.1038/s41467-023-38169-2

work page doi:10.1038/s41467-023-38169-2 2023

[55] [55]

Los Alamos National Laboratory (LANL) Innovation Team.Architector: High -throughput 3D metal complex builder.https://github.com/lanl/Architector. 2023

2023

[56] [56]

GFN2-xTB—An accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions

Christoph Bannwarth, Sebastian Ehlert, and Stefan Grimme. “GFN2-xTB—An accurate and broadly parametrized self-consistent tight-binding quantum chemical method with multipole electrostatics and density-dependent dispersion contributions”. In:Journal of chemical theory and computation15.3 (2019), pp. 1652–1671

2019

[57] [57]

Extended tight-binding quantum chemistry methods

Christoph Bannwarth et al. “Extended tight-binding quantum chemistry methods”. In:Wiley Interdisci- plinary Reviews: Computational Molecular Science11.2 (2021), e1493

2021

[58] [58]

Comparison of multiobjective evolutionary algo- rithms: Empirical results

Eckart Zitzler, Kalyanmoy Deb, and Lothar Thiele. “Comparison of multiobjective evolutionary algo- rithms: Empirical results”. In:Evolutionary computation8.2 (2000), pp. 173–195

2000

[59] [59]

The origins of the Gini index: extracts from Variabilità e Mutabilità (1912) by Corrado Gini

Lidia Ceriani and Paolo Verme. “The origins of the Gini index: extracts from Variabilità e Mutabilità (1912) by Corrado Gini”. In:The Journal of Economic Inequality10.3 (2012), pp. 421–443

1912

[60] [60]

Measurement of diversity

Edward H Simpson. “Measurement of diversity”. In:nature163.4148 (1949), pp. 688–688

1949

[61] [61]

Princeton university press, 2015

Richard E Bellman.Adaptive control processes: a guided tour. Princeton university press, 2015. 24 7 Supplementary Information Results reported in Fig. 3, Fig. 4 are averages and standard deviations for 10 QM9 experiment runs. Fig. 5 and Fig. 7 are ensembles of 4 runs for the SCO experiment experiment. 7.1 Architector Experiment Specifications Pool Ligands...

2015