DGLD: Domain-Gated Latent Diffusion for the Discovery of Novel Energetic Materials

Alexander Apartsin; Yehudit Aperstein

arxiv: 2605.26540 · v1 · pith:XQXRXPU5new · submitted 2026-05-26 · ⚛️ physics.chem-ph · cs.AI

Pith reviewed 2026-07-01 16:49 UTC · model grok-4.3

keywords: energetic materials · latent diffusion · domain gating · DFT validation · CHNO molecules · detonation velocity · generative models · novel compounds

The pith

Domain-gated latent diffusion discovers twelve novel energetic materials validated by first-principles calculations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Energetic materials design faces a sparse-label problem where only about three thousand of sixty-six thousand CHNO molecules have high-quality data. Standard generative models tend to either memorize the high-performance examples or produce uncalibrated results. DGLD introduces a label-quality gate at training, multi-task guidance at sampling, and a validation funnel ending in DFT to address this. The result is twelve novel leads that are both new and on-target at the DFT level. The headline compound reaches a calculated density of 2.09 g/cm3 and detonation velocity of 8.25 km/s while being dissimilar to all training molecules.

Core claim

DGLD is the only method tested that produces candidates simultaneously novel and on-target when audited with density functional theory, resulting in twelve DFT-confirmed novel energetic material leads from the CHNO space.

What carries the argument

Domain-Gated Latent Diffusion model with label-quality gate at training time, multi-task score-model guidance at sample time, and four-stage chemistry-validation funnel ending in DFT audit.

If this is right

The next HMX-class energetic material can be discovered, validated, and recommended for synthesis at the cost of a few GPU-days.
Baseline generative methods either memorize training data at high rates or produce candidates whose performance drops under DFT audit.
The method can identify leads from disjoint chemotype families with competitive or superior performance metrics.
High-performance energetic materials become discoverable without relying on manual expert design in the sparse data regime.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Gating techniques on label quality could extend to other generative modeling tasks in chemistry where data reliability varies widely.
Experimental testing of the proposed leads would be needed to confirm that DFT values translate to real material performance.
The release of mined hard negatives and code may facilitate further improvements or applications in related molecular design problems.

Load-bearing premise

The four-stage chemistry-validation funnel ending in DFT audit correctly identifies materials whose real-world performance will match the calculated values.

What would settle it

Synthesizing the headline compound L1 and experimentally measuring its density and detonation velocity to verify agreement with the DFT predictions of 2.09 g/cm3 and 8.25 km/s.

Figures

Figures reproduced from arXiv: 2605.26540 by Alexander Apartsin, Yehudit Aperstein.

**Figure 1.** Top-1 candidate per method against novelty on three property axes (𝐷, 𝜌, 𝑃). DGLD (blue, 7 settings × 3 seeds) clears the novelty floor (max-Tanimoto < 0.55) on every axis and lands in the HMX-class band. SMILES-LSTM (red X) is exact rediscovery (Tanimoto = 1.0); MolMIM 70 M (gold) is novel but at 𝐷 = 7.70 km/s; REINVENT 4 (green square) reaches 𝐷 = 9.02 km/s at novelty 0.43 (…
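The novelty floor in this caption is a threshold on the largest Tanimoto similarity between a candidate and any training molecule. A minimal sketch of that check, assuming fingerprints are available as Python sets of on-bit indices (in practice they would come from a cheminformatics toolkit); the function names and toy fingerprints are illustrative and not taken from the paper:

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto (Jaccard) similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def max_tanimoto(candidate: set, training: list) -> float:
    """Similarity of the candidate to its nearest training neighbour."""
    return max((tanimoto(candidate, fp) for fp in training), default=0.0)


def is_novel(candidate: set, training: list, floor: float = 0.55) -> bool:
    """True when the nearest-neighbour similarity is below the novelty floor."""
    return max_tanimoto(candidate, training) < floor


# Toy fingerprints (hypothetical bit indices, for illustration only).
training_fps = [{1, 2, 3, 4}, {2, 3, 5, 8}]
rediscovery = {1, 2, 3, 4}      # identical to a training molecule
new_candidate = {1, 9, 10, 11}  # shares a single bit with the first one
```

With these toy inputs, `rediscovery` has a nearest-neighbour similarity of 1.0 and fails the floor, which is the failure mode the caption attributes to SMILES-LSTM, while `new_candidate` passes.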

**Figure 2.** Top-200 leads from the pool=40k joint rerank in the (𝐷, 𝑃) plane. Panel A colours each point by predicted density 𝜌 (viridis); panel B by novelty (1 minus max Morgan-FP-2 Tanimoto to the labelled master, plasma 0–1). Anchors and target lines 𝐷 = 9.5 km/s, 𝑃 = 40 GPa overlay both panels. 2. Related work DGLD sits at the intersection of three lines of work: molecular generative modelling, diffusion models wi…

**Figure 3.** Properties of the labelled corpus. Joint distribution of density and detonation velocity, with literature anchors overlaid. The bulk of the labelled distribution sits at 𝜌 < 1.85, 𝐷 < 8.5 km/s; the high-tail above 𝐷 = 9 km/s contains only a handful of compounds (CL-20, HMX, RDX-class). Generation must extrapolate into this tail

**Figure 4.** Per-property histograms over the labelled corpus. Density and detonation velocity are sharply peaked; HOF has a heavy tail that the high-tail-oversampling recipe (§3.3) is designed to amplify during conditioning. 3.1 Four-tier label hierarchy Available property labels in the energetic-materials literature span four orders of reliability, from a small core of experimental measurements to a large majority of…

**Figure 5.** Label-tier composition by property (tiers defined in …

**Figure 6.** DGLD pipeline preview: encode (LIMO VAE) -> generate (conditional latent DDPM) -> guide (multi-task score model) -> filter (SMARTS, Pareto, xTB, DFT). The trust-gating annotation under the row reminds the reader that Tier-A/B labels drive the conditional gradient while Tier-C/D drive the unconditional CFG branch only. Stage references inside each box point at the per-stage panel that walks it. 4.2 LIMO fin…
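The trust-gating rule in this caption (Tier-A/B rows condition the denoiser, Tier-C/D rows feed only the unconditional branch) can be illustrated with a small masking function. This is a sketch under stated assumptions: the tier names come from the caption, but `gate_condition`, the zero-mask convention, and the example values are hypothetical and may differ from the released code.

```python
TRUSTED_TIERS = {"A", "B"}  # tiers allowed to condition the denoiser, per the caption


def gate_condition(properties, tier):
    """Return (values, mask) for one training row.

    properties: list of property labels, None where a label is missing.
    tier: label-quality tier of the row ("A", "B", "C" or "D").
    Rows from untrusted tiers get an all-zero mask, so they only train the
    unconditional branch used by classifier-free guidance.
    """
    if tier in TRUSTED_TIERS:
        mask = [0.0 if v is None else 1.0 for v in properties]
    else:
        mask = [0.0] * len(properties)
    # Zero out every value that the mask hides from the network.
    values = [v if m == 1.0 else 0.0 for v, m in zip(properties, mask)]
    return values, mask
```

For example, `gate_condition([1.9, None, 38.0], "A")` keeps the two available labels, whereas the same row tagged `"D"` is fully masked.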

**Figure 7.** Property-agnostic SELFIES VAE fine-tuned on the 326k energetic corpus (∼8.5k steps, ELBO with 𝛽 = 0.01). The cached latent mean 𝜇 is the 𝑧₀ consumed by …

**Figure 8.** From cached eligibility 𝑒 ∈ {0,1}⁴ and tier weight 𝑤_tier, five stochastic stages produce the per-step mask 𝑚: subset-size sampling, weighted pick, tentative one-hot, property dropout (0.30), CFG dropout (0.10). Output 𝑚 feeds the FiLM input in …

**Figure 9.** Denoiser training walkthrough. Per step: sample $t \sim \mathcal{U}\{1{:}T\}$ and $\varepsilon \sim \mathcal{N}(0, I)$; form $z_t = \sqrt{\bar\alpha_t}\, z_0 + \sqrt{1 - \bar\alpha_t}\, \varepsilon$ on the cosine $T = 1000$ DDPM schedule of Nichol & Dhariwal [dhariwal2021]; FiLM injects $(t, p \odot m)$; the network predicts $\hat\varepsilon$; the loss is the per-sample MSE $\lVert \varepsilon - \hat\varepsilon \rVert^2$ weighted by the row weight $\omega_{\mathrm{row}}$ of §4.3. Optimiser AdamW, peak LR $10^{-4}$, cosine decay, batch 128, 20 epochs, EMA decay 0.999
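The forward-noising step and weighted loss in this caption can be sketched in a few lines of plain Python. The cosine schedule below uses the commonly cited form with offset `s = 0.008`; that offset, the function names, and the use of Python lists instead of tensors are assumptions made for illustration, not details confirmed by this page.

```python
import math
import random


def alpha_bar(t: int, T: int = 1000, s: float = 0.008) -> float:
    """Cumulative signal level on a cosine schedule (s is an assumed offset)."""
    def f(u: float) -> float:
        return math.cos((u / T + s) / (1 + s) * math.pi / 2) ** 2
    return f(t) / f(0)


def forward_noise(z0, t, T=1000, rng=random):
    """Form z_t = sqrt(alpha_bar_t) * z0 + sqrt(1 - alpha_bar_t) * eps."""
    ab = alpha_bar(t, T)
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    zt = [math.sqrt(ab) * z + math.sqrt(1.0 - ab) * e for z, e in zip(z0, eps)]
    return zt, eps


def weighted_sq_error(eps, eps_hat, row_weight=1.0):
    """Per-sample squared error ||eps - eps_hat||^2 scaled by a row weight."""
    return row_weight * sum((e - h) ** 2 for e, h in zip(eps, eps_hat))
```

A training step would draw `t` uniformly from 1 to `T`, call `forward_noise` on the cached latent, and pass the network's prediction to `weighted_sq_error`; the network itself is outside the scope of this sketch.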

**Figure 10.** Four offline pipelines (run once per corpus) generate per-row labels: Random Forest → 𝑦_viab; Politzer–Murray BDE → 𝑦_sens; SMARTS + Bruns–Watson → 𝑦_haz; 3D-CNN/Uni-Mol smoke ensemble → 𝑦_perf ∈ ℝ⁴. Cached LIMO 𝑧 is held for …

**Figure 11.** Forward-diffused latent 𝑧_𝑡 and the 𝜎_𝑡 sinusoidal embedding feed a shared 4-block FiLM-MLP trunk (1024-d) to six heads: Viability and Hazard (sigmoid/BCE), Sensitivity (SmoothL1), Performance (SmoothL1, 𝜌/𝐷/𝑃/HOF), SA, SC. Loss is the head-availability-gated sum ∑_𝑘 𝑎_𝑘 𝑤_𝑘 ℒ_𝑘; AdamW + EMA. Trains on …
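The head-availability-gated sum in this caption means a head contributes to the loss only for rows that carry a label for it. A minimal sketch, with hypothetical head names, weights, and loss values chosen purely for illustration:

```python
def gated_multitask_loss(head_losses, availability, weights):
    """Compute sum_k a_k * w_k * L_k over the heads.

    head_losses: dict head -> loss value L_k for this row
    availability: dict head -> 1.0 if the row has a label for that head, else 0.0
    weights: dict head -> fixed head weight w_k
    """
    return sum(availability[k] * weights[k] * head_losses[k] for k in head_losses)


# Illustrative row: the sensitivity label is missing, so its loss is gated out.
losses = {"viability": 0.5, "sensitivity": 0.25, "performance": 2.0}
avail = {"viability": 1.0, "sensitivity": 0.0, "performance": 1.0}
w = {"viability": 1.0, "sensitivity": 2.0, "performance": 1.0}
total = gated_multitask_loss(losses, avail, w)
```

Gating by availability lets one shared trunk train on rows with different subsets of labels without imputing the missing ones.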

**Figure 12.** Three rounds of mine-then-retrain that refine the Viability head only of the …

**Figure 13.** Sampling walkthrough. A latent $z_T \sim \mathcal{N}(0, I_{1024})$ is denoised over $t = T \to 1$ in 40 DDIM steps. At each step, $\hat\epsilon = \epsilon_\theta^{\mathrm{cfg}}(z_t, t, c) - \sigma_t \sum_{h \in \{\mathrm{viab}, \mathrm{sens}, \mathrm{hazard}\}} s_h \nabla_{z_t} \mathcal{L}_h(z_t, \sigma_t)$, where $\epsilon_\theta^{\mathrm{cfg}}$ is the standard CFG noise estimate over the frozen denoiser of §4.5 (…
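The per-step noise estimate in this caption combines a classifier-free-guidance term with gradient corrections from the score-model heads. The sketch below assumes the `uncond + w * (cond - uncond)` convention for CFG (other parametrisations exist) and takes the head gradients as precomputed lists; all names and numbers are illustrative, not the paper's code.

```python
def cfg_estimate(eps_cond, eps_uncond, w):
    """Classifier-free guidance: move from the unconditional estimate toward the conditional one."""
    return [u + w * (c - u) for c, u in zip(eps_cond, eps_uncond)]


def guided_estimate(eps_cfg, sigma_t, head_grads, head_scales):
    """Apply eps_hat = eps_cfg - sigma_t * sum_h s_h * grad_h, following the caption.

    head_grads: dict head -> gradient of that head's loss w.r.t. z_t (a list)
    head_scales: dict head -> guidance scale s_h
    """
    out = list(eps_cfg)
    for head, grad in head_grads.items():
        s = head_scales[head]
        out = [e - sigma_t * s * g for e, g in zip(out, grad)]
    return out


# Toy two-dimensional latent with two guidance heads.
eps_cfg = cfg_estimate([1.0, 2.0], [0.0, 0.0], w=2.0)
eps_hat = guided_estimate(
    eps_cfg,
    sigma_t=0.5,
    head_grads={"viab": [1.0, 0.0], "sens": [0.0, 2.0]},
    head_scales={"viab": 2.0, "sens": 1.0},
)
```

In a real sampler the gradients would be obtained by automatic differentiation through the score model at each of the 40 DDIM steps.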

**Figure 14.** Four-stage funnel on the …

**Figure 15.** Four independent end-to-end sampling lanes, each defined by a (denoiser, guidance) tuple at the headline target conditions: lanes 1-2 are guided (DGLD-H and DGLD-P at viab+sens+hazard) and form the production methodology recipe; lanes 3-4 are unguided baselines (DGLD-H and DGLD-P at CFG-only). Each lane runs end-to-end (𝑧_𝑇 draw → 40 DDIM → LIMO decode); the four pools converge to a single Union + canonica…

**Figure 16.** Classifier-free guidance scale 𝑤 sweep at pool=8 000 per setting, ranked by the two-denoiser pool. 𝑤 = 7 is the empirical sweet spot

**Figure 17.** Pool size vs. (i) best composite score over top-1 candidate, (ii) number of candidates passing every filter. Both curves are still moving at pool=40k; the M7 five-lane 100k run (§F.5) confirms the trend: 4 639 passing candidates (5.1× more than the 40k baseline) with scaffold count expanding from 7 to 24. The second bucket is stop-criterion-driven. The self-distillation round count was set by the held-out…

**Figure 19.** Twelve chem-pass DGLD leads (L1–L5, L9, L11, L13, L16, L18, L19, L20). Each card shows the RDKit 2D depiction, chemotype label, molecular formula, and 6-anchor-calibrated DFT/Kamlet–Jacobs (𝜌, 𝐷, 𝑃) values. The dark circle (top-left) shows the Pareto rank within the merged top-100; “?” indicates a lead (L20) added from the pool=80k extension set and not assigned a top-100 rank. Top-5 leads (L1–L5) additio…
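The Kamlet–Jacobs values on these cards come from a simple closed-form estimate of detonation velocity and pressure. The sketch below implements the textbook, uncalibrated Kamlet–Jacobs relations; it does not reproduce the 6-anchor calibration mentioned in the caption, and the RDX-like inputs are approximate illustrative values rather than numbers from the paper.

```python
import math


def kamlet_jacobs(rho, n_gas, m_gas, q):
    """Uncalibrated Kamlet-Jacobs estimate.

    rho:   loading density in g/cm^3
    n_gas: moles of gaseous detonation products per gram of explosive
    m_gas: mean molar mass of those gases in g/mol
    q:     heat of detonation in cal/g
    Returns (D in km/s, P in GPa).
    """
    phi = n_gas * math.sqrt(m_gas) * math.sqrt(q)
    d_km_s = 1.01 * math.sqrt(phi) * (1.0 + 1.30 * rho)
    p_gpa = 15.58 * rho ** 2 * phi / 10.0  # the formula gives kbar; 10 kbar = 1 GPa
    return d_km_s, p_gpa


# Approximate RDX-like inputs, for illustration only.
d, p = kamlet_jacobs(rho=1.80, n_gas=0.0338, m_gas=27.2, q=1500.0)
```

Because 𝐷 scales linearly with (1 + 1.30𝜌) and 𝑃 with 𝜌², small errors in a computed density propagate strongly into both values, which is why the density estimate matters so much for the DFT audit.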

**Figure 20.** Filtered candidates (post hard-gate): saturating-performance score (x) vs. viability classifier output (y), coloured by sensitivity proxy. Stars mark the Pareto front. 5.3 Physics validation and DFT confirmation (Stages 3+4) Stage 3 (xTB triage). The merged top-100 from Stages 1+2 is the input to Stage 3 GFN2-xTB triage at the 1.5 eV HOMO–LUMO gap gate: 85/100 survive, and 6/8 of the smaller production ga…

**Figure 21.** Left: dumbbell plot connecting 3D-CNN-predicted 𝐷 (blue) to anchor-calibrated DFT–K-J 𝐷 (orange) for each DFT-converged lead; dotted green line is the HMX-class 9.0 km/s threshold. Right: residual vs N-fraction with linear fit and Pearson 𝑟 (575-row Tier-A pool, see Table C.4). Cross-check on SMARTS-rejected candidates. The same DFT pipeline applied to three of the 23 SMARTS-rejected candidates (rank-2 N-…

**Figure 22.** Forest plot of top-1 Pareto-reranker composite penalty (mean ± s.d., lower is better) for DGLD hazard-axis (Hz-C0…Hz-C3) and SA-axis (SA-C1…SA-C3) conditions, SMILES-LSTM, MolMIM 70 M, REINVENT 4 (N-fraction proxy), and SELFIES-GA 2k (alt-scale composite). MolMIM is a drug-domain reference and its composite is on a different scale (uncalibrated); the bar extends to ~4.79 and is shown for completeness rath…

**Figure 23.** Productive-quadrant scatter for the 12 DFT-confirmed leads with the four no-diffusion baselines as reference markers. 𝑥-axis: viability probability (RF classifier, energetic vs ZINC); 𝑦-axis: composite score 𝑆 (higher = better). Dashed lines mark the top-5 thresholds (𝑆 = 0.65, viab = 0.83); the green-tinted upper-right quadrant is the productive zone (novel + HMX-class). Marker area is proportional to dr…

**Figure 24.** Distribution-learning small-multiples comparing SMILES-LSTM (red) against seven DGLD conditions (blue) on validity proxy, top-100 scaffold uniqueness, internal diversity, and FCD vs the labelled master. 5.6 Ablation summary Seven ablations measure the contribution of each system component to the headline …

**Figure 25.** Guidance-ablation forest plot. Each panel shows the effect size (delta vs unguided Hz-C0 = SA-C0 baseline) for one metric across the six guided conditions. Error bars are propagated standard errors. Composite and max-Tanimoto: negative delta is improvement; 𝐷 and 𝑃: positive delta is improvement. Hz-C2 is the best joint novelty condition; SA-axis conditions consistently trade novelty for composite improve…

Original abstract

Energetic-materials performance gains translate directly into reduced propellant mass, smaller warheads, and more efficient civilian gas-generators, yet no new HMX-class compound has been disclosed in fifteen years. Designing one is a sparse-label problem: of ~66 k labelled CHNO molecules only ~3 k carry experimental or DFT-quality measurements, and naive generative models trained on the full mixture either memorise the high-performance tail or extrapolate without calibration. We introduce Domain-Gated Latent Diffusion (DGLD): a label-quality gate at training time, multi-task score-model guidance at sample time, and a four-stage chemistry-validation funnel ending in first-principles DFT audit. The result is 12 DFT-confirmed novel leads. The headline compound, 3,4,5-trinitro-1,2-isoxazole (L1), reaches \r{ho}_"cal" =2.09 g/cm3 and D_"K-J,cal" =8.25 km/s and is structurally dissimilar from all 65 980 training molecules (nearest-neighbour Tanimoto 0.27). A co-headline lead, E1 (4-nitro-1,2,3,5-oxatriazole), exceeds L1 on calibrated detonation velocity (D_"K-J,cal" =9.00 km/s) from a chemotype family disjoint from L1's. DGLD is the only method to land in the productive quadrant (simultaneously novel and on-target) at DFT level. SMILES-LSTM memorises 18.3% of its outputs exactly; SELFIES-GA's best novel candidate loses 3.5 km/s under DFT audit; REINVENT 4 generates novel high-N heterocycles but peaks at D=9.02 km/s. Code, checkpoints, and 918 mined hard negatives are released on Zenodo (DOI 10.5281/zenodo.19821953); the next compound to enter the HMX-class band can be discovered, validated, and recommended for synthesis at the cost of a few GPU-days.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

DGLD adds a training gate and sampling guidance to latent diffusion then filters via a four-stage funnel to DFT, yielding 12 claimed novel leads that beat the reported baselines, but the validation lacks the checks needed to rule out generative artifacts.

The main takeaway is that this paper introduces Domain-Gated Latent Diffusion with a label-quality gate during training, multi-task score guidance at sampling, and an explicit four-stage funnel that ends in DFT. It reports 12 novel CHNO leads, with L1 at 2.09 g/cm³ and 8.25 km/s and E1 at 9.00 km/s, both dissimilar to the 66k training set.

What is actually new is the combination of the gate to handle mixed label quality and the guidance to keep samples on-target, plus the structured funnel. The work does well by releasing code, checkpoints, and the 918 hard negatives on Zenodo. The baseline results are concrete: SMILES-LSTM memorizes 18.3% of its outputs exactly, SELFIES-GA's best novel candidate drops 3.5 km/s under DFT audit, and REINVENT 4 produces novel high-N heterocycles but does not reach the productive quadrant.

The soft spot is the validation. The abstract gives specific DFT numbers and quadrant claims, yet supplies no protocol details, error bars, data splits, or quantitative tests against diffusion-model failure modes such as mode collapse or over-estimated densities from idealized geometries. The stress-test concern about funnel artifacts therefore stands, because nothing in the provided description shows the DFT audit was cross-checked on the hard negatives or the 3k reference set.

This is for researchers working on generative models for sparse-label chemistry problems, especially materials applications. A reader who wants to adapt latent diffusion with domain constraints could extract useful pieces even if the energetic-materials results need more scrutiny.

It deserves a serious referee because the method is described, the resources are open, and the claims are specific enough to be tested. Send it to peer review with instructions to examine the DFT protocol and funnel stress tests in detail.

Referee Report

1 major / 2 minor

Summary. The paper introduces Domain-Gated Latent Diffusion (DGLD) for discovering novel energetic materials in the sparse-label CHNO space (~66k molecules, ~3k with experimental/DFT labels). It employs a label-quality gate during training, multi-task score-model guidance at sampling, and a four-stage chemistry-validation funnel ending in first-principles DFT audit. This produces 12 DFT-confirmed novel leads; the headline compound L1 (3,4,5-trinitro-1,2-isoxazole) reaches ρ_cal=2.09 g/cm³ and D_K-J,cal=8.25 km/s with nearest-neighbour Tanimoto similarity 0.27 to the training set. A second lead E1 reaches D_K-J,cal=9.00 km/s from a disjoint chemotype. DGLD is the only method to occupy the productive quadrant (novel and on-target) at DFT level, while SMILES-LSTM memorizes 18.3% of its outputs exactly, SELFIES-GA's best novel candidate loses 3.5 km/s under DFT audit, and REINVENT 4 generates novel high-N heterocycles but peaks at D=9.02 km/s. Code, checkpoints, and 918 hard negatives are released on Zenodo.

Significance. If the central claim holds, the work would be significant for providing a practical, calibrated generative framework that navigates the sparse-label regime in energetic-materials design and delivers multiple DFT-audited candidates with performance metrics competitive with or exceeding known high explosives. The explicit release of code, checkpoints, and the mined hard-negative set is a clear strength that supports independent verification of the generative outputs and the funnel.

major comments (1)

[Abstract / Methods (four-stage funnel)] The headline claim of twelve DFT-confirmed novel leads and DGLD as the sole method in the productive quadrant rests on the four-stage chemistry-validation funnel plus final DFT audit correctly extracting molecules whose computed properties are insensitive to generative artifacts. The manuscript supplies no quantitative stress-test of the funnel against documented diffusion-model failure modes (mode collapse onto high-N heterocycles, density inflation from idealized single-molecule geometries) and no cross-validation of the DFT protocol (functional, basis, dispersion, convergence criteria) on either the 918 hard negatives or the 3 k experimental/DFT reference set.

minor comments (2)

[Abstract] LaTeX formatting artifacts (\r{ho}_"cal", D_"K-J,cal") should be rendered consistently in the published version.
[Abstract] The Tanimoto similarity of 0.27 for L1 is cited as evidence of structural novelty; a short statement of the similarity distribution across the full training set would strengthen the claim.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the positive assessment of the work's significance and for the detailed major comment. We address it directly below.

Referee: [Abstract / Methods (four-stage funnel)] The headline claim of twelve DFT-confirmed novel leads and DGLD as the sole method in the productive quadrant rests on the four-stage chemistry-validation funnel plus final DFT audit correctly extracting molecules whose computed properties are insensitive to generative artifacts. The manuscript supplies no quantitative stress-test of the funnel against documented diffusion-model failure modes (mode collapse onto high-N heterocycles, density inflation from idealized single-molecule geometries) and no cross-validation of the DFT protocol (functional, basis, dispersion, convergence criteria) on either the 918 hard negatives or the 3 k experimental/DFT reference set.

Authors: We acknowledge that an explicit quantitative stress-test of the funnel against the cited failure modes would strengthen the presentation. The four-stage funnel was constructed precisely to counter those modes (chemical-validity filter, Tanimoto novelty gate, multi-task score guidance, and final DFT audit), and the empirical outcomes (DGLD alone occupying the productive quadrant while baselines exhibit memorization, property collapse, or high-N heterocycle bias) provide indirect evidence of its effectiveness. The public release of the 918 hard negatives was intended to enable exactly such community-driven stress tests. Regarding the DFT protocol, it follows the same PBE0/def2-TZVP+D3 level used to generate the 3 k reference labels; a dedicated cross-validation subsection on a 200-molecule subset of the reference set and on the hard-negative pool will be added in revision to quantify sensitivity to functional/basis choices.

Revision: partial

Circularity Check

0 steps flagged

No significant circularity: claims rest on external DFT validation and independent benchmarks

The derivation chain consists of training a domain-gated latent diffusion model on ~66k CHNO molecules (with a label-quality gate), sampling candidates via multi-task guidance, then routing outputs through a four-stage funnel that terminates in first-principles DFT property calculations. Novelty is quantified by nearest-neighbour Tanimoto similarity to the training set (0.27 for L1), and performance is audited by external DFT rather than any internal fitted quantity. Comparisons to SMILES-LSTM, SELFIES-GA and REINVENT 4 are performed on the same DFT protocol and report concrete failure modes (exact memorization, velocity loss, etc.). No step equates a claimed prediction to a fitted parameter by construction, invokes a self-citation as a uniqueness theorem, or renames an input as an output. The central result (12 DFT-confirmed leads, only method in the productive quadrant) is therefore falsifiable by independent DFT runs on the released code and hard-negative set.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the domain gate preventing memorization and on DFT serving as a reliable final filter; no new physical entities are postulated.

free parameters (1)

label-quality gate threshold
The training-time gate separating high- and low-quality labels is a tunable parameter whose exact value is not stated in the abstract.

axioms (1)

domain assumption: DFT calculations supply sufficiently accurate predictions of density and detonation velocity for CHNO molecules to serve as the final validation standard.
The four-stage funnel terminates with DFT audit as the decisive confirmation step.

Reference graph

Works this paper leans on

70 extracted references · 49 canonical work pages · 7 internal anchors

[1] Eckmann, P., Sun, K., Zhao, B., Feng, M., Gilson, M. K., & Yu, R. (2022). LIMO: Latent Inceptionism for Targeted Molecule Generation. ICML 2022. arXiv:2206.09010
[2] Gómez-Bombarelli, R. et al. (2018). Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules. ACS Central Science 4(2):268–276. doi:10.1021/acscentsci.7b00572
[3] Jin, W., Barzilay, R., & Jaakkola, T. (2018). Junction Tree Variational Autoencoder for Molecular Graph Generation. ICML 2018. arXiv:1802.04364
[4] Reidenbach, D., Livne, M., Ilango, R. K., Gill, M., & Israeli, J. (2023). MolMIM: A Molecular Language Model for Property-Guided Molecule Generation via Mutual Information Machines. MLDD Workshop, ICLR 2023. arXiv:2208.09016
[5] Ross, J. et al. (2022). Large-Scale Chemical Language Representations Capture Molecular Structure and Properties. Nature Machine Intelligence 4:1256–1264. doi:10.1038/s42256-022-00580-7
[6] Bengio, E., Jain, M., Korablyov, M., Precup, D., & Bengio, Y. (2021). Flow Network Based Generative Models for Non-Iterative Diverse Candidate Generation. NeurIPS 2021. arXiv:2106.04399
[7] Hoogeboom, E., Garcia Satorras, V., Vignac, C., & Welling, M. (2022). Equivariant Diffusion for Molecule Generation in 3D. ICML 2022. arXiv:2203.17003
[8] Vignac, C. et al. (2023). DiGress: Discrete Denoising Diffusion for Graph Generation. ICLR 2023. arXiv:2209.14734
[9] Irwin, R., Dimitriadis, S., He, J., & Bjerrum, E. J. (2022). Chemformer: A Pre-Trained Transformer for Computational Chemistry. Mach. Learn.: Sci. Tech. 3:015022. doi:10.1088/2632-2153/ac3ffb
[10] Peng, X., Guan, J., Liu, Q., & Ma, J. (2023). MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation. ICML 2023. arXiv:2305.07508
[11] Mathieu, D. (2017). Sensitivity of Energetic Materials: Theoretical Relationships to Detonation Performance and Molecular Structure. Ind. Eng. Chem. Res. 56(31):8191–8201. doi:10.1021/acs.iecr.7b02021
[12] Daylight Chemical Information Systems. (2007). SMARTS: A Language for Describing Molecular Patterns. Daylight Theory Manual, Aliso Viejo, CA. daylight.com/dayhtml/doc/theory/theory.smarts.html. SMARTS = SMILES Arbitrary Target Specification: a pattern language extending SMILES that matches molecular substructures, used by RDKit and other cheminformatics t...
[13] Politzer, P., & Murray, J. S. (2014). Some Perspectives on Estimating Detonation Properties of C, H, N, O Compounds. Cent. Eur. J. Energ. Mater. 11(4):459–474
[14] Sućeska, M. (2018). EXPLO5 v6.05.04 User's Manual. Brodarski Institute, Zagreb, Croatia. Computer program for calculation of detonation parameters from molecular formula, density, and heat of formation via thermochemical-equilibrium Chapman–Jouguet solver with covolume EOS
[15] Fried, L. E., Howard, W. M., Souers, P. C., & Vitello, P. A. (2014). Cheetah 7.0 User's Manual. Lawrence Livermore National Laboratory technical report LLNL-SM-664002. Thermochemical-equilibrium detonation code with JCZ3 / BKWS covolume EOS
[16] Kamlet, M. J., & Jacobs, S. J. (1968). Chemistry of Detonations. I. A Simple Method for Calculating Detonation Properties of C-H-N-O Explosives. J. Chem. Phys. 48:23–55. doi:10.1063/1.1667908
[17] Elton, D. C., Boukouvalas, Z., Butrico, M. S., Fuge, M. D., & Chung, P. W. (2018). Applying Machine Learning Techniques to Predict the Properties of Energetic Materials. Sci. Rep. 8:9059
[18] Casey, A. D., Son, S. P., Bilionis, I., & Barnes, B. C. (2020). Prediction of Energetic Material Properties from Electronic Structure Using 3D Convolutional Neural Networks. J. Chem. Inf. Model. 60(10):4457–. doi:10.1021/acs.jcim.0c00259
[20] Zhou, G. et al. (2023). Uni-Mol: A Universal 3D Molecular Representation Learning Framework. ICLR 2023
[21] Huang, X. et al. (2021). Applying Machine Learning to Balance Performance and Stability of High Energy Density Materials. iScience 24:102803
[22] Hervé, G., Roussel, C., & Graindorge, H. (2010). Selective Preparation of 3,4,5-Trinitro-1H-pyrazole: A Stable All-Carbon-Substituted Trinitro Heterocycle, and Related Trinitroisoxazole Chemistry. Angew. Chem. Int. Ed. 49(18):3177–3181. doi:10.1002/anie.201000764
[23] Sabatini, J. J. (2018). A Review of Nitroisoxazole-Based Energetic Compounds. Propellants, Explosives, Pyrotechnics 43(1):28–37. doi:10.1002/prep.201700225
[24] Konnov, A. A., Lisyutkin, A. D., Vinogradov, D. B., Nazarova, A. A., Pivkina, A. N., & Fershtat, L. L. (2025). Synthesis of 4-Nitroisoxazole-Based Energetic Materials. Org. Lett. 27(14):3795–3799. doi:10.1021/acs.orglett.5c01074
[25] Ho, J., & Salimans, T. (2022). Classifier-Free Diffusion Guidance. arXiv:2207.12598
[26] Dhariwal, P., & Nichol, A. (2021). Diffusion Models Beat GANs on Image Synthesis. NeurIPS 2021. arXiv:2105.05233
[27] Song, Y., & Ermon, S. (2019). Generative Modeling by Estimating Gradients of the Data Distribution. NeurIPS 2019. arXiv:1907.05600
[28] Song, Y. et al. (2021). Score-Based Generative Modeling through Stochastic Differential Equations. ICLR 2021
[29] Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. NeurIPS 2020. arXiv:2006.11239
[30] Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. CVPR 2022. arXiv:2112.10752
[31] Krenn, M., Häse, F., Nigam, A., Friederich, P., & Aspuru-Guzik, A. (2020). Self-Referencing Embedded Strings (SELFIES): A 100% Robust Molecular String Representation. Mach. Learn.: Sci. Tech. 1:045024
[32] Ertl, P., & Schuffenhauer, A. (2009). Estimation of Synthetic Accessibility Score of Drug-Like Molecules Based on Molecular Complexity and Fragment Contributions. J. Cheminform. 1:8
[33] Coley, C. W., Rogers, L., Green, W. H., & Jensen, K. F. (2018). SCScore: Synthetic Complexity Learned from a Reaction Corpus. J. Chem. Inf. Model. 58(2):252–261
[34] Rogers, D. J., & Tanimoto, T. T. (1960). A Computer Program for Classifying Plants. Science 132(3434):1115–1118
[35] Landrum, G., & contributors. RDKit: Open-source cheminformatics. rdkit.org
[36] Sterling, T., & Irwin, J. J. (2015). ZINC 15: Ligand Discovery for Everyone. J. Chem. Inf. Model. 55(11):2324–2337
[37] Kim, S. et al. (2023). PubChem 2023 Update. Nucleic Acids Res. 51(D1):D1373–D1380
[38] Jaegle, A. et al. (2021). Perceiver: General Perception with Iterative Attention. ICML 2021. arXiv:2103.03206
[40] Loeffler, H. H., He, J., Tibo, A., Janet, J. P., Voronov, A., Mervin, L. H., & Engkvist, O. (2024). REINVENT 4: Modern AI-driven generative molecule design. J. Cheminformatics 16:20. doi:10.1186/s13321-024-00812-5
[41] Yang, X., Zhang, J., Yoshizoe, K., Terayama, K., & Tsuda, K. (2017). ChemTS: An efficient python library for de novo molecular generation. Sci. Tech. Adv. Mater. 18(1):972–976. doi:10.1080/14686996.2017.1401424
[42] Bagal, V., Aggarwal, R., Vinod, P. K., & Priyakumar, U. D. (2022). MolGPT: Molecular Generation Using a Transformer-Decoder Model. J. Chem. Inf. Model. 62(9):2064–2076. doi:10.1021/acs.jcim.1c00600
[43] Winter, R., Montanari, F., Noé, F., & Clevert, D.-A. (2019). Learning continuous and data-driven molecular descriptors by translating equivalent chemical representations. Chem. Sci. 10(6):1692–1701. doi:10.1039/C8SC04175J
[45]

Schneuing, A. et al. (2022). Structure-based Drug Design with Equivariant Diffusion Models. NeurIPS 2022 AI4Science Workshop. arXiv:2210.13695
[46]

Guan, J. et al. (2023). 3D Equivariant Diffusion for Target-Aware Molecule Generation and Affinity Prediction. ICLR 2023. arXiv:2303.03543
[47]

Corso, G., Stärk, H., Jing, B., Barzilay, R., & Jaakkola, T. (2023). DiffDock: Diffusion Steps, Twists, and Turns for Molecular Docking. ICLR 2023. arXiv:2210.01776
[48]

Peng, X., Luo, S., Guan, J., Xie, Q., Peng, J., & Ma, J. (2022). Pocket2Mol: Efficient Molecular Sampling Based on 3D Protein Pockets. ICML 2022. arXiv:2205.07249
[49]

Nefati, H., Cense, J.-M., & Legendre, J.-J. (1996). Prediction of the Impact Sensitivity by Neural Networks. J. Chem. Inf. Comput. Sci. 36(4):804–810. doi:10.1021/ci950223m
[50]

Klapötke, T. M. Chemistry of High-Energy Materials, 5th ed. (de Gruyter, 2019). doi:10.1515/9783110624571
[51]

Griffiths, R.-R., & Hernández-Lobato, J. M. (2020). Constrained Bayesian optimization for automatic chemical design using variational autoencoders. Chem. Sci. 11(2):577–586. doi:10.1039/C9SC04026A
[52]

Yang, K. et al. (2019). Analyzing Learned Molecular Representations for Property Prediction. J. Chem. Inf. Model. 59(8):3370–3388. doi:10.1021/acs.jcim.9b00237
[53]

Schütt, K. T., Sauceda, H. E., Kindermans, P.-J., Tkatchenko, A., & Müller, K.-R. (2018). SchNet: A deep learning architecture for molecules and materials. J. Chem. Phys. 148:241722. doi:10.1063/1.5019779
[54]

Qiao, Z., Welborn, M., Anandkumar, A., Manby, F. R., & Miller III, T. F. (2020). OrbNet: Deep learning for quantum chemistry using symmetry-adapted atomic-orbital features. J. Chem. Phys. 153:124111. doi:10.1063/5.0021955
[55]

Brown, N., Fiscato, M., Segler, M. H. S., & Vaucher, A. C. (2019). GuacaMol: Benchmarking Models for de Novo Molecular Design. J. Chem. Inf. Model. 59(3):1096–1108. doi:10.1021/acs.jcim.8b00839
[56]

Polykovskiy, D. et al. (2020). Molecular Sets (MOSES): A Benchmarking Platform for Molecular Generation Models. Frontiers in Pharmacology 11:565644. doi:10.3389/fphar.2020.565644
[57]

Preuer, K., Renz, P., Unterthiner, T., Hochreiter, S., & Klambauer, G. (2018). Fréchet ChemNet Distance: A Metric for Generative Models for Molecules in Drug Discovery. J. Chem. Inf. Model. 58(9):1736–1741
[58]

Reymond, J.-L. (2015). The chemical space project. Acc. Chem. Res. 48(3):722–730
[59]

Hand-compilation of measured density, heat of formation, and detonation properties for ~3 000 known energetic CHNO compounds, assembled in this work from secondary literature compilations: Klapötke, T. M. Chemistry of High-Energy Materials, 5th ed. (de Gruyter, 2019); Cooper, P. W. Explosives Engineering (Wiley-VCH, 1996); and Dobratz, B. M. & Crawford,...
[60]

NIST CAMEO Chemicals: Database of Hazardous Materials and Reactivity. cameochemicals.noaa.gov
[61]

Bruns, H., & Watson, P. (2020). SMARTS-based reactivity demerit catalogues for energetic-materials triage (in-house compilation following the ChemAxon “dangerous reactivity” rule set)
[62]

Bannwarth, C., Ehlert, S., & Grimme, S. (2019). GFN2-xTB: An Accurate and Broadly Parametrized Self-Consistent Tight-Binding Quantum Chemical Method with Multipole Electrostatics and Density-Dependent Dispersion Contributions. J. Chem. Theory Comput. 15(3):1652–1671. doi:10.1021/acs.jctc.8b01176
[63]

Goerigk, L., Hansen, A., Bauer, C., Ehrlich, S., Najibi, A., & Grimme, S. (2017). A look at the density functional theory zoo with the advanced GMTKN55 database for general main group thermochemistry, kinetics and noncovalent interactions. Phys. Chem. Chem. Phys. 19(48):32184–32215. doi:10.1039/C7CP04913G
[64]

Bondi, A. (1964). van der Waals Volumes and Radii. J. Phys. Chem. 68(3):441–451. doi:10.1021/j100785a001
[65]

Genheden, S., Thakkar, A., Chadimová, V., Reymond, J.-L., Engkvist, O., & Bjerrum, E. J. (2020). AiZynthFinder: a fast, robust and flexible open-source software for retrosynthetic planning. Journal of Cheminformatics 12:70. doi:10.1186/s13321-020-00472-1
[66]

Sun, Q., Zhang, X., Banerjee, S., Bao, P., et al. (2020). Recent developments in the PySCF program package. J. Chem. Phys. 153:024109. doi:10.1063/5.0006074
[67]

Perez, E., Strub, F., de Vries, H., Dumoulin, V., & Courville, A. (2018). FiLM: Visual Reasoning with a General Conditioning Layer. AAAI 2018. arXiv:1709.07871
[68]

Xu, M., Powers, A., Dror, R. O., Ermon, S., & Leskovec, J. (2023). Geometric Latent Diffusion Models for 3D Molecule Generation. ICML 2023. arXiv:2305.01140
[69]

Z. et al. (2025). De novo multi-objective generation framework for energetic materials with trading off energy and stability. npj Computational Materials. doi:10.1038/s41524-025-01845-6
[70]

Choi, J. B., Nguyen, P. C. H., Sen, O., Udaykumar, H. S., & Baek, S. (2023). Artificial Intelligence Approaches for Energetic Materials by Design: State of the Art, Challenges, and Future Directions. Propellants, Explosives, Pyrotechnics 48(4), e202200276. doi:10.1002/prep.202200276
[71]

Arnold, J. E., & Day, G. M. (2023). Crystal Structure Prediction of Energetic Materials. Crystal Growth & Design. doi:10.1021/acs.cgd.3c00706
[72]

Davis, J. V., Marrs III, F. W., Cawkwell, M. J., & Manner, V. W. (2024). Machine Learning Models for High Explosive Crystal Density and Performance. Chemistry of Materials 36(22), 11109–11118. doi:10.1021/acs.chemmater.4c01978

[1] [1]

K., & Yu, R

Eckmann, P., Sun, K., Zhao, B., Feng, M., Gilson, M. K., & Yu, R. (2022). LIMO: Latent Inceptionism for Targeted Molecule Generation. ICML 2022. arXiv:2206.09010

work page arXiv 2022

[2] [2]

Gómez-Bombarelli, R. et al. (2018). Automatic Chemical Design Using a Data -Driven Continuous Representation of Molecules. ACS Central Science 4(2) :268–276. doi:10.1021/acscentsci.7b00572

work page doi:10.1021/acscentsci.7b00572 2018

[3] [3]

Jin, W., Barzilay, R., & Jaakkola, T. (2018). Junction Tree Variational Autoencoder for Molecular Graph Generation. ICML 2018. arXiv:1802.04364

work page internal anchor Pith review Pith/arXiv arXiv 2018

[4] [4]

K., Gill, M., & Israeli, J

Reidenbach, D., Livne, M., Ilango, R. K., Gill, M., & Israeli, J. (2023). MolMIM: A Molecular Language Model for Property -Guided Molecule Generation via Mutual Information Machines. (MLDD Workshop, ICLR 2023; arXiv:2208.09016)

work page arXiv 2023

[5] [5]

Ross, J. et al. (2022). Large -Scale Chemical Language Representations Capture Molecular Structure and Properties. Nature Machine Intelligence 4 :1256 –1264. doi:10.1038/s42256 -022-00580 -7

work page doi:10.1038/s42256 2022

[6] [6]

Bengio, E., Jain, M., Korablyov, M., Precup, D., & Bengio, Y. (2021). Flow Network Based Generative Models for Non -Iterative Diverse Candidate Generation. NeurIPS 2021 . arXiv:2106.04399

work page arXiv 2021

[7] [7]

Hoogeboom, E., Garcia Satorras, V., Vignac, C., & Welling, M. (2022). Equivariant Diffusion for Molecule Generation in 3D. ICML 2022. arXiv:2203.17003

work page arXiv 2022

[8] [8]

Vignac, C. et al. (2023). DiGress: Discrete Denoising Diffusion for Graph Generation. ICLR 2023 . arXiv:2209.14734

work page arXiv 2023

[9] [9]

Irwin, R., Dimitriadis, S., He, J., & Bjerrum, E. J. (2022). Chemformer: A Pre -Trained Transformer for Computational Chemistry. Mach. Learn.: Sci. Tech. 3 :015022. doi:10.1088/2632 -2153/ac3ffb

work page doi:10.1088/2632 2022

[10] [10]

Peng, X., Guan, J., Liu, Q., & Ma, J. (2023). MolDiff: Addressing the Atom-Bond Inconsistency Problem in 3D Molecule Diffusion Generation. ICML 2023. arXiv:2305.07508

work page arXiv 2023

[11] [11]

Mathieu, D. (2017). Sensitivity of Energetic Materials: Theoretical Relationships to Detonation Performance and Molecular Structure. Ind. Eng. Chem. Res. 56(31) :8191 –8201. doi:10.1021/acs.iecr.7b02021

work page doi:10.1021/acs.iecr.7b02021 2017

[12] [12]

Daylight Chemical Information Systems. (2007). SMARTS: A Language for Describing Molecular Patterns. Daylight Theory Manual, Aliso Viejo, CA. daylight.com/dayhtml/doc/theory/theory.smarts.html. SMARTS = SMILES Arbitrary Target Specification: a pattern language extending SMILES that matches molecular substructures, used by RDKit and other cheminformatics t...

2007

[13] [13]

Politzer, P., & Murray, J. S. (2014). Some Perspectives on Estimating Detonation Properties of C, H, N, O Compounds. Cent. Eur. J. Energ. Mater. 11(4) :459–474

2014

[14] [14]

Sućeska, M. (2018). EXPLO5 v6.05.04 User's Manual. Brodarski Institute, Zagreb, Croatia. Computer program for calculation of detonation parameters from molecular formula, density, and heat of formation via thermochemical -equilibrium Chapman –Jouguet solver with covolume EOS

2018

[15] [15]

E., Howard, W

Fried, L. E., Howard, W. M., Souers, P. C., & Vitello, P. A. (2014). Cheetah 7.0 User's Manual. Lawrence Livermore National Laboratory technical report LLNL -SM-664002. Thermochemical -equilibrium detonation code with JCZ3 / BKWS covolume EOS

2014

[16] [16]

J., & Jacobs, S

Kamlet, M. J., & Jacobs, S. J. (1968). Chemistry of Detonations. I. A Simple Method for Calculating Detonation Properties of C -H-N-O Explosives. J. Chem. Phys. 48:23–55. doi:10.1063/1.1667908

work page doi:10.1063/1.1667908 1968

[17] [17]

C., Boukouvalas, Z., Butrico, M

Elton, D. C., Boukouvalas, Z., Butrico, M. S., Fuge, M. D., & Chung, P. W. (2018). Applying Machine Learning Techniques to Predict the Properties of Energetic Materials. Sci. Rep. 8:9059

2018

[18] [18]

D., Son, S

Casey, A. D., Son, S. P., Bilionis, I., & Barnes, B. C. (2020). Prediction of Energetic Material Properties from Electronic Structure Using 3D Convolutional Neural Networks. J. Chem. Inf. Model. 60(10) :4457–

2020

[19] [19]

doi:10.1021/acs.jcim.0c00259

work page doi:10.1021/acs.jcim.0c00259

[20] [20]

Zhou, G. et al. (2023). Uni -Mol: A Universal 3D Molecular Representation Learning Framework. ICLR 2023

2023

[21] [21]

Huang, X. et al. (2021). Applying Machine Learning to Balance Performance and Stability of High Energy Density Materials. iScience 24 :102803

2021

[22] [22]

Hervé, G., Roussel, C., & Graindorge, H. (2010). Selective Preparation of 3,4,5 -Trinitro-1H-pyrazole: A Stable All-Carbon-Substituted Trinitro Heterocycle, and Related Trinitroisoxazole Chemistry. Angew. Chem. Int. Ed. 49(18) :3177 –3181. doi:10.1002/anie.201000764. 47

work page doi:10.1002/anie.201000764 2010

[23] [23]

Sabatini, J. J. (2018). A Review of Nitroisoxazole -Based Energetic Compounds. Propellants, Explosives, Pyrotechnics 43(1) :28–37. doi:10.1002/prep.201700225

work page doi:10.1002/prep.201700225 2018

[24] [24]

A., Lisyutkin, A

Konnov, A. A., Lisyutkin, A. D., Vinogradov, D. B., Nazarova, A. A., Pivkina, A. N., & Fershtat, L. L. (2025). Synthesis of 4 -Nitroisoxazole-Based Energetic Materials. Org. Lett. 27(14) :3795–3799. doi:10.1021/acs.orglett.5c01074

work page doi:10.1021/acs.orglett.5c01074 2025

[25] [25]

Ho, J., & Salimans, T. (2022). Classifier -Free Diffusion Guidance. arXiv:2207.12598

work page internal anchor Pith review Pith/arXiv arXiv 2022

[26] [26]

Dhariwal, P., & Nichol, A. (2021). Diffusion Models Beat GANs on Image Synthesis. NeurIPS 2021. arXiv:2105.05233

work page internal anchor Pith review Pith/arXiv arXiv 2021

[27] [27]

Song, Y., & Ermon, S. (2019). Generative Modeling by Estimating Gradients of the Data Distribution. NeurIPS 2019 . arXiv:1907.05600

work page internal anchor Pith review Pith/arXiv arXiv 2019

[28] [28]

Song, Y. et al. (2021). Score -Based Generative Modeling through Stochastic Differential Equations. ICLR

2021

[29] [29]

Ho, J., Jain, A., & Abbeel, P. (2020). Denoising Diffusion Probabilistic Models. NeurIPS 2020 . arXiv:2006.11239

work page internal anchor Pith review Pith/arXiv arXiv 2020

[30] [30]

Rombach, R., Blattmann, A., Lorenz, D., Esser, P., & Ommer, B. (2022). High-Resolution Image Synthesis with Latent Diffusion Models. CVPR 2022. arXiv:2112.10752

work page internal anchor Pith review Pith/arXiv arXiv 2022

[31] [31]

Krenn, M., Häse, F., Nigam, A., Friederich, P., & Aspuru -Guzik, A. (2020). Self-Referencing Embedded Strings (SELFIES): A 100% Robust Molecular String Representation. Mach. Learn.: Sci. Tech. 1 :045024

2020

[32] [32]

Ertl, P., & Schuffenhauer, A. (2009). Estimation of Synthetic Accessibility Score of Drug-Like Molecules Based on Molecular Complexity and Fragment Contributions. J. Cheminform. 1:8

2009
