pith. machine review for the scientific record.

arxiv: 2605.02657 · v1 · submitted 2026-05-04 · 💻 cs.LG

Recognition: unknown

CARD: Coarse-to-fine Autoregressive Modeling with Radix-based Decomposition for Transferable Free Energy Estimation

Wenbing Huang, Wen Yan, Yang Liu, Yi He, Ziyang Yu

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 16:02 UTC · model grok-4.3

classification 💻 cs.LG
keywords free energy estimation · autoregressive modeling · radix decomposition · generative models · molecular interactions · transferable models · drug discovery · thermodynamics

The pith

CARD learns a zero-free-energy distribution over molecular coordinates, enabling absolute free energy estimation for arbitrary systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops CARD to overcome the high cost of molecular dynamics simulations and the poor generalization of existing deep learning methods for free energy differences. By decomposing 3D coordinates into sequences via a radix-based approach, it enables an autoregressive model that learns structure from coarse to fine detail. This yields a generative distribution with zero free energy that can propose configurations for any molecular system, allowing direct absolute free energy computation without alchemical pathways or per-system retraining. The approach matches the accuracy of classical methods on diverse unseen molecules while running roughly 40 times faster.

Core claim

CARD uses a novel radix-based decomposition to bijectively map 3D molecular coordinates to mixed discrete-continuous sequences. This enables coarse-to-fine autoregressive modeling whose resulting distribution has exactly zero free energy. Such a distribution provides a universal proposal for computing absolute free energies of arbitrary systems without dependence on alchemical transformations between states.

What carries the argument

Radix-based bijective decomposition of 3D coordinates into sequences for coarse-to-fine autoregressive density estimation that enforces zero free energy.
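The abstract does not spell the decomposition out, so as a purely illustrative sketch (not the authors' construction): a one-dimensional radix bijection might split a coordinate in [0, 1) into coarse-to-fine base-R digits plus a continuous residual, invertibly.

```python
def radix_encode(x, R=4, K=3):
    """Bijectively split x in [0, 1) into K coarse-to-fine base-R digits
    plus a continuous residual in [0, 1)."""
    digits = []
    for _ in range(K):
        x *= R
        d = int(x)        # coarse digit at this resolution level
        digits.append(d)
        x -= d            # keep only the finer-scale remainder
    return digits, x      # (discrete part, continuous residual)

def radix_decode(digits, residual, R=4):
    """Exact inverse: fold the digits and residual back into a coordinate."""
    x = residual
    for d in reversed(digits):
        x = (d + x) / R
    return x
```

The digits play the role of coarse discrete tokens and the residual the fine continuous tail, so an autoregressive model can consume them in coarse-to-fine order; the actual method operates on full 3D molecular coordinates, which this toy does not attempt.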

If this is right

  • Enables absolute free energy computation for arbitrary systems without alchemical pathways.
  • Achieves accuracy comparable to classical methods on unseen systems with diverse topologies.
  • Delivers approximately 40-fold speedup in inference compared to simulation-based approaches.
  • Overcomes constraints of system-specific input dimensions in prior deep learning methods.
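Why a normalized proposal matters here: if a reference distribution q integrates to 1, its own free energy is zero, and the importance-sampling identity Z = E_{x~q}[exp(-βU(x))/q(x)] yields the absolute free energy F = -β⁻¹ ln Z directly, with no alchemical pathway. A minimal sketch on a 1D harmonic toy (a hypothetical stand-in for a molecular system; the Gaussian proposal here is not the CARD model):

```python
import math, random

def absolute_free_energy(U, q_sample, q_logpdf, n=200_000, beta=1.0):
    """Estimate F = -(1/beta) ln Z of a target with energy U by importance
    sampling from a *normalized* proposal q. Because q integrates to 1,
    the estimate is an absolute free energy, not a difference."""
    random.seed(0)
    total = 0.0
    for _ in range(n):
        x = q_sample()
        # exp(-beta*U(x)) / q(x), computed in log space for stability
        total += math.exp(-beta * U(x) - q_logpdf(x))
    return -math.log(total / n) / beta

# Toy check: U(x) = x^2/2 at beta = 1 has Z = sqrt(2*pi),
# so the exact answer is F = -0.5 * ln(2*pi) ≈ -0.9189.
sigma = 1.5  # deliberately mismatched Gaussian proposal
F_hat = absolute_free_energy(
    U=lambda x: 0.5 * x * x,
    q_sample=lambda: random.gauss(0.0, sigma),
    q_logpdf=lambda x: -0.5 * (x / sigma) ** 2
                       - math.log(sigma * math.sqrt(2 * math.pi)),
)
```

The estimator is exact in expectation for any normalized q; the practical question the paper's claims turn on is whether the learned q is in fact normalized and overlaps the Boltzmann target well enough to keep the variance low.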

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • The zero free energy property could simplify calculations of other thermodynamic quantities like entropy or enthalpy.
  • Applying the same trained model across many different molecules might accelerate high-throughput screening in drug design.
  • The decomposition technique may generalize to other 3D structure modeling tasks beyond free energy.

Load-bearing premise

The combination of radix-based bijective decomposition and coarse-to-fine autoregressive modeling produces a distribution with exactly zero free energy that generalizes accurately to unseen molecular systems with diverse topologies.

What would settle it

Computing the free energy of samples drawn from the CARD model and finding it is not zero, or observing large errors in free energy estimates for a new molecular topology not seen during training.

original abstract

Estimating free energy differences quantifies thermodynamic preferences in molecular interactions, which is central to chemistry and drug discovery. Despite fruitful progress, existing methods still face key limitations: classical computational approaches remain prohibitively expensive due to their reliance on extensive molecular dynamics simulations, while deep learning-based methods are constrained by either less-expressive generative models or input dimensions tied to a specific system, resulting in negligible generalization. To address these challenges, we propose CARD, a generative framework that employs a novel radix-based decomposition to bijectively convert 3D coordinates into mixed discrete-continuous sequences, enabling coarse-to-fine autoregressive modeling with enhanced expressiveness. Notably, the model corresponds to a distribution with zero free energy, serving as a proposal for absolute free energy computation of arbitrary systems without relying on alchemical pathways. Experiments across diverse tasks demonstrate that CARD matches the accuracy of classical computational methods on unseen systems with diverse topologies, while achieving an approximately 40-fold speedup in inference.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper introduces CARD, a generative framework for transferable absolute free energy estimation in molecular systems. It proposes a radix-based decomposition that bijectively maps continuous 3D atomic coordinates to mixed discrete-continuous sequences, which are then modeled autoregressively in a coarse-to-fine manner. The central claim is that this construction yields a reference distribution with exactly zero free energy, enabling direct absolute free energy computation for arbitrary systems without alchemical pathways or system-specific training. Experiments across diverse tasks report accuracy matching classical computational methods on unseen systems with varied topologies, alongside an approximately 40-fold inference speedup.

Significance. If the zero-free-energy property is rigorously established and the generalization holds, CARD would offer a significant advance for computational chemistry and drug discovery by providing a fast, transferable alternative to expensive MD-based free energy calculations that avoids alchemical transformations and input-dimension constraints of prior deep learning methods.

major comments (1)
  1. [Abstract and §3 (Model formulation)] The claim that the model 'corresponds to a distribution with zero free energy' is load-bearing for the absolute free energy proposal. The radix-based bijective map converts Euclidean 3D coordinates to a mixed discrete-continuous sequence; the autoregressive product of conditionals then defines a density on sequence space. For the induced density q(x) on coordinate space to integrate to 1 (required for F_ref = 0 by construction), the change-of-variables formula must explicitly include log|det J|, where J is the Jacobian of the inverse mapping. The manuscript does not derive or correct for this term in the mixed discrete-continuous setting, so the normalization (and thus zero free energy) is not guaranteed.
minor comments (2)
  1. [§4 (Experiments)] The abstract states 'matching accuracy' and '40-fold speedup' but supplies no quantitative tables, error bars, or explicit baseline descriptions (e.g., which classical methods and system sizes). Adding these would improve clarity.
  2. [Notation throughout] Define the precise radix decomposition function and its inverse more formally, including how discrete and continuous components are handled in the density.
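The normalization requirement in the major comment can be written out explicitly. Writing the decomposition as x ↦ (d(x), r(x)), with discrete digits d and continuous residual r (notation assumed here, not taken from the paper), the induced coordinate-space density and the zero-free-energy condition are:

```latex
q(x) \;=\; p_\theta\big(d(x)\big)\,
           p_\theta\big(r(x)\,\big|\,d(x)\big)\,
           \left|\det \frac{\partial r(x)}{\partial x}\right|,
\qquad
F_{\mathrm{ref}} \;=\; -\beta^{-1}\ln\!\int q(x)\,\mathrm{d}x .
```

Autoregressive modeling guarantees normalization on sequence space, \(\sum_d p_\theta(d)\int p_\theta(r\mid d)\,\mathrm{d}r = 1\), but \(F_{\mathrm{ref}} = 0\) requires \(\int q(x)\,\mathrm{d}x = 1\) on coordinate space, which holds only once the Jacobian factor is carried through — trivially so if the residual map is volume-preserving, \(|\det \partial r/\partial x| = 1\).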

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their thorough review and for identifying a key point regarding the normalization of the reference distribution. We address the major comment below and will incorporate clarifications in the revised manuscript.

point-by-point responses
  1. Referee: [Abstract and §3 (Model formulation)] The claim that the model 'corresponds to a distribution with zero free energy' is load-bearing for the absolute free energy proposal. The radix-based bijective map converts Euclidean 3D coordinates to a mixed discrete-continuous sequence; the autoregressive product of conditionals then defines a density on sequence space. For the induced density q(x) on coordinate space to integrate to 1 (required for F_ref = 0 by construction), the change-of-variables formula must explicitly include log|det J|, where J is the Jacobian of the inverse mapping. The manuscript does not derive or correct for this term in the mixed discrete-continuous setting, so the normalization (and thus zero free energy) is not guaranteed.

    Authors: We appreciate the referee's precise identification of this technical requirement. The radix decomposition is constructed to be bijective, with the continuous residuals mapped in a volume-preserving manner (Jacobian determinant of 1) and discrete indices handled via summation over the finite radix choices. This ensures the induced density q(x) on coordinate space integrates to 1 by construction. While §3 presents the overall autoregressive factorization and bijectivity, we acknowledge that an explicit change-of-variables derivation including the Jacobian term for the mixed discrete-continuous case was omitted. In the revised manuscript we will add this derivation in §3, verifying ∫ q(x) dx = 1 and thereby rigorously confirming the zero free energy property. revision: yes
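The rebuttal's volume-preservation claim can at least be sanity-checked in a toy setting: with a per-cell shift map x ↦ (d, r) = (⌊Rx⌋, x − d/R), the Jacobian of the continuous part is exactly 1, so a model normalized on (d, r) induces a density on x that integrates to 1 with no correction term. The per-cell densities below are hypothetical, not the paper's:

```python
R = 4  # number of coarse cells (the radix)

# A toy normalized model on the decomposed representation: digit d with
# probabilities P, and for each d a normalized linear density on the
# residual interval [0, 1/R).
P = [0.1, 0.4, 0.3, 0.2]

def q_residual(r, d):
    w = 1.0 / R
    slope = d + 1.0                      # arbitrary per-cell shape
    norm = w + slope * w * w / 2.0       # integral of (1 + slope*r) over [0, w)
    return (1.0 + slope * r) / norm

def q_coord(x):
    """Density induced on x in [0, 1): the encode map x -> (d, r) with
    d = floor(R*x), r = x - d/R is a per-cell shift with unit Jacobian,
    so no |det J| factor appears -- the rebuttal's claim in miniature."""
    d = min(int(R * x), R - 1)
    r = x - d / R
    return P[d] * q_residual(r, d)

# Midpoint Riemann-sum check that q_coord integrates to 1 over [0, 1)
n = 200_000
Z = sum(q_coord((i + 0.5) / n) for i in range(n)) / n
```

If the encode map instead rescaled the residual (Jacobian ≠ 1), the same sum would drift away from 1 unless a determinant correction were added, which is precisely the derivation the revision promises for §3.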

Circularity Check

1 step flagged

Zero free energy asserted by construction of radix decomposition plus autoregressive density on sequence space

specific steps
  1. self-definitional [Abstract]
    "Notably, the model corresponds to a distribution with zero free energy, serving as a proposal for absolute free energy computation of arbitrary systems without relying on alchemical pathways."

    The zero-free-energy property is presented as an automatic consequence of the radix-based bijective decomposition combined with coarse-to-fine autoregressive modeling. The autoregressive product of conditionals normalizes the density on the decomposed sequence space by construction; the claim that the corresponding q(x) on original 3D coordinates also integrates to 1 (hence F_ref = 0) therefore collapses to the modeling choice itself unless the Jacobian of the inverse mapping is separately shown to preserve the required measure.

full rationale

The paper's central claim is that the CARD model induces a reference distribution with exactly zero free energy, enabling absolute FE estimation without alchemical paths. This property is stated as following directly from the bijective radix decomposition to mixed discrete-continuous sequences and the subsequent coarse-to-fine autoregressive factorization. The autoregressive construction guarantees a normalized density on the sequence space by definition, but the induced density q(x) on Euclidean coordinate space requires an explicit log|det J| term from the change-of-variables formula. Because the paper presents zero free energy as an inherent feature without demonstrating that the Jacobian correction is either zero or included, the claimed property reduces to a definitional consequence of the generative construction rather than an independent result.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract provides insufficient technical detail to identify specific free parameters, axioms, or invented entities. The central innovation is the radix-based decomposition and the zero free energy correspondence, but their mathematical foundations are not elaborated.

pith-pipeline@v0.9.0 · 5476 in / 1246 out tokens · 37982 ms · 2026-05-09T16:02:56.988852+00:00 · methodology


Reference graph

Works this paper leans on

74 extracted references · 9 canonical work pages · 5 internal anchors
