pith. machine review for the scientific record.

arxiv: 2604.24245 · v1 · submitted 2026-04-27 · ⚛️ physics.chem-ph

Recognition: unknown

A Machine-Learned Symbolic Committor for a Chemical Reaction: Retinal Isomerization

Bettina G. Keller, Florian Renner, Gianluca Lattanzi, Gianmarco Lazzeri, Kai Töpfer, Roberto Covino, Vittoria Ossanna


Pith reviewed 2026-05-07 17:45 UTC · model grok-4.3

classification ⚛️ physics.chem-ph
keywords retinal isomerization · committor · symbolic regression · molecular dynamics · reaction coordinate · machine learning · cis-trans isomerization · dihedral angles

The pith

A machine-learned symbolic committor for retinal isomerization reveals an S-shaped reaction path from nonlinear dihedral coupling that the free-energy surface misses.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to extract an interpretable analytical expression for the committor of retinal cis-trans isomerization directly from unbiased molecular dynamics trajectories. Parametrizing the logit of the committor lets the network resolve the full transition region, after which holdback randomization selects four proper dihedrals and symbolic regression yields a compact nonlinear expression. This expression reproduces the S-shaped, stepwise trajectory observed in the transition path ensemble. The shape is absent from the minimum free-energy path because short non-equilibrium transition events of about 0.13 ps interact with mass asymmetry between heavy-atom and hydrogen dihedrals. A reader would care because the workflow needs no assumed reaction coordinate and demonstrates that equilibrium free-energy surfaces can miss essential dynamical features of fast barrier crossings.

Core claim

We apply Artificial Intelligence for Molecular Mechanism Discovery (AIMMD) to N-retinylidene-lysine in vacuum, learning the committor from unbiased trajectories generated by two-way shooting. Parametrizing the logit resolves the coordinate across the transition region. Holdback input randomization identifies four proper dihedrals around the reactive bond as informative, while the improper dihedrals at C13 and C14 are unsuitable. Symbolic regression then distills the network into compact analytical expressions, showing that nonlinear coupling of all four dihedrals is required to reproduce the S-shaped, stepwise pathway seen in the transition path ensemble, a feature arising from the non-equilibrium dynamics.
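Why the logit helps can be illustrated with a minimal Python sketch. It assumes the standard binomial log-likelihood for shooting outcomes, which may differ in detail from the loss used in the paper; the function names (`sigmoid`, `logit`, `shooting_nll`) and all counts are illustrative.

```python
import math

def sigmoid(q):
    """Map a logit q to a committor value p_B in (0, 1), numerically stable."""
    if q >= 0.0:
        return 1.0 / (1.0 + math.exp(-q))
    e = math.exp(q)
    return e / (1.0 + e)

def logit(p):
    """Inverse of sigmoid: q = ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def softplus(z):
    """Stable evaluation of ln(1 + exp(z))."""
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))

def shooting_nll(q, n_a, n_b):
    """Binomial negative log-likelihood of one shooting point (assumed loss form).

    n_a and n_b count the trial trajectories that ended in states A and B.
    """
    return n_a * softplus(q) + n_b * softplus(-q)

# Two committor values that are nearly identical on the probability scale
# are well separated on the logit scale.
p_lo, p_hi = 0.999, 0.9999
gap_p = p_hi - p_lo                  # about 0.0009
gap_q = logit(p_hi) - logit(p_lo)    # about 2.3

# For a single shooting point the loss is minimized at q = logit(n_b / (n_a + n_b)).
q_star = logit(3 / 4)
```

On the probability scale the two values near 1 differ by less than a thousandth, while their logits differ by more than two units, which is the sense in which a logit output can distinguish configurations away from the 0.5 surface.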

What carries the argument

The logit-parametrized neural committor distilled by symbolic regression into a nonlinear function of the four proper dihedrals, which encodes the reaction coordinate.
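The distillation idea can be sketched on a toy problem. The example below is not symbolic regression proper, which searches an open space of expressions; it only ranks a short, hand-written list of candidate forms by fit error plus a complexity penalty. The teacher function, the candidate list, and the penalty value are all hypothetical stand-ins.

```python
import math

# Hypothetical "teacher" standing in for the trained network's logit output.
def teacher_logit(t1, t2):
    return 3.0 * math.sin(t1) * math.cos(t2)

# Candidate one-term expressions: (label, function, complexity).
CANDIDATES = [
    ("t1 + t2", lambda t1, t2: t1 + t2, 1),
    ("sin(t1)", lambda t1, t2: math.sin(t1), 1),
    ("sin(t1 + t2)", lambda t1, t2: math.sin(t1 + t2), 2),
    ("sin(t1)*cos(t2)", lambda t1, t2: math.sin(t1) * math.cos(t2), 2),
]

def fit_scale(values, targets):
    """Closed-form least-squares coefficient c for targets ~ c * values."""
    return sum(v * y for v, y in zip(values, targets)) / sum(v * v for v in values)

def distill(points, targets, penalty=0.01):
    """Return (score, label, coefficient) minimizing MSE + penalty * complexity."""
    best = None
    for label, func, complexity in CANDIDATES:
        values = [func(t1, t2) for t1, t2 in points]
        coef = fit_scale(values, targets)
        err = sum((coef * v - y) ** 2 for v, y in zip(values, targets)) / len(points)
        score = err + penalty * complexity
        if best is None or score < best[0]:
            best = (score, label, coef)
    return best

# Evaluate the teacher on a regular grid of two angles.
angles = [-math.pi + 2.0 * math.pi * i / 12 for i in range(12)]
points = [(a, b) for a in angles for b in angles]
targets = [teacher_logit(t1, t2) for t1, t2 in points]
score, label, coef = distill(points, targets)
```

In this toy setting only the product term reproduces the teacher, so the additive and single-angle candidates lose despite their lower complexity. The paper's claim about the four dihedrals has the same logical shape, but with a real network and a real search.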

If this is right

  • The reaction coordinate consists of nonlinear coupling among exactly four proper dihedrals.
  • The S-shaped path is a direct consequence of non-equilibrium dynamics during the short transition events combined with mass asymmetry.
  • The minimum-free-energy path does not capture the observed mechanism.
  • The workflow applies without prior assumptions to other isomerizations and chemical reactions.
  • An interpretable committor can expose dynamical mechanism details invisible to equilibrium free-energy surfaces.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same distillation step could be applied to reactions in explicit solvent to test whether solvent degrees of freedom enter the symbolic expression.
  • Mass asymmetry effects on short-time torsional dynamics may appear in any system containing both heavy-atom and hydrogen-bearing dihedrals.
  • The identified four-dihedral form could be used as a collective variable in enhanced sampling to accelerate rare-event calculations for similar isomerizations.

Load-bearing premise

Holdback input randomization reliably isolates the four proper dihedrals and symbolic regression faithfully preserves the essential nonlinear coupling without neural-network artifacts.
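The kind of input-importance test this premise relies on can be sketched as follows. The sketch uses a permutation variant (shuffling one input column across samples and measuring the loss increase); whether the paper shuffles or redraws values is an assumption here, and `toy_logit_model` is a hypothetical stand-in for the trained network.

```python
import random

def toy_logit_model(x):
    """Hypothetical stand-in for a trained network; ignores x[2] by design."""
    return 3.0 * x[0] - 1.0 * x[1]

def mse(pred, ref):
    """Mean-squared error between two equally long sequences."""
    return sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)

def randomization_importance(model, samples, targets, seed=0):
    """Normalized loss increase when one input column is shuffled across samples."""
    rng = random.Random(seed)
    base = mse([model(x) for x in samples], targets)
    scores = []
    for j in range(len(samples[0])):
        column = [x[j] for x in samples]
        rng.shuffle(column)
        # Rebuild every sample with only column j replaced by its shuffled value.
        perturbed = [x[:j] + [v] + x[j + 1:] for x, v in zip(samples, column)]
        scores.append(mse([model(x) for x in perturbed], targets) - base)
    total = sum(scores)
    return [s / total for s in scores] if total > 0 else scores

rng = random.Random(1)
samples = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(200)]
targets = [toy_logit_model(x) for x in samples]
importance = randomization_importance(toy_logit_model, samples, targets)
```

The ignored input receives zero importance and the strongly weighted input dominates. Shuffling also breaks any correlation between inputs, which is one reason a ranking of this kind can become unreliable when the dihedrals are strongly correlated.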

What would settle it

Running an independent set of transition-path simulations with the four-dihedral nonlinear expression removed and checking whether the S-shaped trajectories disappear or the committor values deviate systematically from 0.5 at the expected surface.
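Whether a committor estimate "deviates systematically from 0.5" is a statistical question. A minimal sketch, assuming independent trial trajectories and a simple binomial model (the function names and the threshold `z_max` are illustrative choices, not the paper's):

```python
import math

def committor_estimate(n_b, n_total):
    """Fraction of trial trajectories that reach B, with binomial standard error."""
    p_hat = n_b / n_total
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / n_total)
    return p_hat, std_err

def z_score_vs_half(n_b, n_total):
    """z-score of the estimate under the null hypothesis p_B = 0.5."""
    null_err = math.sqrt(0.25 / n_total)
    return (n_b / n_total - 0.5) / null_err

def consistent_with_half(n_b, n_total, z_max=2.0):
    """True if the estimate lies within z_max null standard errors of 0.5."""
    return abs(z_score_vs_half(n_b, n_total)) <= z_max
```

For example, 50 of 100 trajectories reaching B is consistent with 0.5, whereas 80 of 100 gives a z-score of about 6 and would count as a systematic deviation.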

Figures

Figures reproduced from arXiv: 2604.24245 by Bettina G. Keller, Florian Renner, Gianluca Lattanzi, Gianmarco Lazzeri, Kai Töpfer, Roberto Covino, Vittoria Ossanna.

Figure 1: (a.) Lewis structure of retinal covalently linked to a lysine via a protonated Schiff base (N-retinylidene-lysine). The reaction center around the C13=C14 double-bond is shown in red with atom labeling. (b.) Cis-trans isomerization around the C13=C14 double-bond, with θ1 as naive reaction coordinate marked in orange. (c.) Minimum energy path (black) and logarithmic transition path ensemble (green) of react…

Figure 2: AIMMD. a) Neural-network representation of the logit-committor as a function…

Figure 3: (a.) Correlation between sampled committor values and NN committor predictions of shooting point conformations (blue dots). (b.) Normalized absolute importance of the top 8 features from the HIPR analysis and the most important features of improper angles χ1 and χ2. Dihedral angles ζ1 and ζ2 include atoms not included in the atomic sketches, respectively. The black line indicates the standard deviation of…

Figure 4: (a.) TPE density profile in the θ1, θ2 dihedral angle space with sketches of Newman projections of all 4 θi angles labeled in the projection of the TS (top left). (b.) Isocontour lines of the logit committor q(x) projected in θ1, θ2 dihedral angle space with $\theta_3 = \theta_4 = \pi - (\theta_1 + \theta_2)/2$ as $\sum_{i=1}^{4} \theta_i = 2\pi$. The colormap and colored solid lines depict the isocontour lines of the logit committor according to the…

Figure 5: (a.) FES from a series of umbrella simulation and the minimum free energy path (MFEP, dotted light green line) from NEB on the FES between (A) cis and (B) trans configuration. (b.) TPE average dihedral angle velocities (left) $v_1 = \langle \dot{\theta}_1 \rangle_{\mathrm{TPE}}$ and (right) $v_2 = \langle \dot{\theta}_2 \rangle_{\mathrm{TPE}}$ in frames with the respective (θ1, θ2) configurations. The solid black line marks the zero logit committor estimated by Eq. 4.c. Note the dif…

Figure 6: (a.) cis-trans isomerization of N-retinylidene-lysine (b.) Topology of the molecule with reaction center highlighted

Figure 7: A total of 145 retinal conformations with predicted committor values near 0…

Figure 8: Pairwise correlations between the sampled logit committor values and our 5 logit…
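The constraint used for the projection in the Figure 4 caption follows in one step from the stated sum rule. Assuming, as the caption does, that the four dihedrals sum to 2π and that θ3 and θ4 are set equal:

```latex
\begin{align}
  \theta_1 + \theta_2 + \theta_3 + \theta_4 &= 2\pi \\
  \theta_3 = \theta_4 \;\Rightarrow\; 2\theta_3 &= 2\pi - (\theta_1 + \theta_2) \\
  \theta_3 = \theta_4 &= \pi - \frac{\theta_1 + \theta_2}{2}
\end{align}
```

This is what allows a function of four dihedrals to be drawn as contours in the two-dimensional (θ1, θ2) plane.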
Original abstract

The thermal cis-trans isomerization around the C$_{13}$=C$_{14}$ double bond of retinal is a prototypical high-barrier reaction whose mechanism hinges on subtle out-of-plane bending motions. We apply Artificial Intelligence for Molecular Mechanism Discovery (AIMMD) to N-retinylidene-lysine in vacuum, learning the committor from unbiased molecular dynamics trajectories generated by two-way shooting. Parametrizing the logit of the committor, rather than the committor itself, allows the neural network to resolve the reaction coordinate across the full transition region, not only at the isocommittor surface $p_B(\mathbf{x}) = 0.5$. Holdback input randomization identifies four proper dihedrals around the reactive bond as the informative coordinates, while the improper dihedrals at C$_{13}$ and C$_{14}$ prove unsuitable because reactant, transition, and product states share the same values. Symbolic regression then distills the network into compact analytical expressions and shows that a nonlinear coupling of all four dihedrals is required to reproduce the S-shaped, stepwise pathway seen in the transition path ensemble. This S-shape is absent from the minimum-free-energy path: it arises from the non-equilibrium dynamics of the short ($\sim 0.13$ ps) transition events combined with the mass asymmetry between heavy-atom and hydrogen-bearing dihedrals. An interpretable, machine-learned committor thus exposes dynamical features of the mechanism to which the free-energy surface is blind. The workflow requires no prior assumptions about the reaction coordinate and extends naturally to other isomerizations and to chemical reactions more broadly.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper applies a machine-learning workflow (AIMMD) to learn the committor for retinal cis-trans isomerization from unbiased two-way shooting MD trajectories. A neural network is trained on the logit of the committor to resolve the full transition region; holdback input randomization identifies four proper dihedrals as the key coordinates; symbolic regression then distills the network into a compact analytical expression. The resulting committor exhibits an S-shaped, stepwise pathway absent from the minimum free-energy path, which the authors attribute to non-equilibrium dynamics during the short (~0.13 ps) transition events combined with mass asymmetry between the dihedrals.

Significance. If the symbolic expression faithfully reproduces the neural-network committor without introducing fitting artifacts, the work offers a concrete example of how data-driven distillation can expose dynamical features of a reaction mechanism that are invisible on the equilibrium free-energy surface. The unbiased trajectory generation, logit parametrization, and extension to symbolic forms are strengths that could be adopted more broadly for isomerizations and other activated processes.

major comments (3)
  1. [Symbolic regression] Symbolic regression section: the central claim that nonlinear coupling among all four proper dihedrals is required to recover the S-shaped committor pathway lacks a quantitative benchmark. No mean-squared error, correlation coefficient, or other metric is reported between the neural-network output and the distilled symbolic expression, nor are reduced models (linear combinations or ablated cross terms) compared to demonstrate that the nonlinearity is essential rather than an artifact of the network approximation.
  2. [Holdback input randomization] Holdback input randomization subsection: the identification of the four proper dihedrals as the informative coordinates does not include sensitivity checks for inter-dihedral correlations or variability across independent network trainings. Without such analysis, it remains unclear whether the ranking is robust or whether the subsequent symbolic regression embeds spurious couplings that drive the reported S-shape.
  3. [Results] Results on the transition path ensemble: the attribution of the S-shape to non-equilibrium dynamics plus mass asymmetry would be strengthened by explicit error bars on the committor values along the pathway and by cross-validation of the neural network (e.g., train/test splits or multiple random seeds). These are absent, weakening the distinction from possible equilibrium features or fitting bias.
minor comments (2)
  1. [Introduction/Methods] Notation for the committor p_B(x) and its logit is introduced without a clear equation reference in the main text; a single defining equation would improve readability.
  2. [Figures] Figure captions for the committor surfaces and dihedral projections should explicitly state the number of trajectories or frames used to generate each panel.
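The benchmark requested in major comment 1 is cheap to compute. A minimal sketch, using a hypothetical reference logit with one cross term in place of the network and two hypothetical symbolic candidates (none of these functions come from the paper):

```python
import math

def mse(a, b):
    """Mean-squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def pearson(a, b):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

# Hypothetical reference logit with a cross term, standing in for the network.
def reference_logit(t1, t2):
    return t1 + 0.5 * t1 * t2

def full_model(t1, t2):      # keeps the cross term
    return t1 + 0.5 * t1 * t2

def ablated_model(t1, t2):   # cross term removed
    return t1

grid = [(-1.0 + 0.25 * i, -1.0 + 0.25 * j) for i in range(9) for j in range(9)]
ref = [reference_logit(t1, t2) for t1, t2 in grid]
report = {
    name: (mse([f(t1, t2) for t1, t2 in grid], ref),
           pearson([f(t1, t2) for t1, t2 in grid], ref))
    for name, f in (("full", full_model), ("ablated", ablated_model))
}
```

In this toy case the ablated model still correlates strongly with the reference even though its error is nonzero, which suggests that a correlation coefficient alone may hide a missing coupling and that an error metric and an ablation comparison are both worth reporting.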

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major point below and indicate the revisions we will make to strengthen the manuscript.

Point-by-point responses
  1. Referee: Symbolic regression section: the central claim that nonlinear coupling among all four proper dihedrals is required to recover the S-shaped committor pathway lacks a quantitative benchmark. No mean-squared error, correlation coefficient, or other metric is reported between the neural-network output and the distilled symbolic expression, nor are reduced models (linear combinations or ablated cross terms) compared to demonstrate that the nonlinearity is essential rather than an artifact of the network approximation.

    Authors: We agree that quantitative benchmarks would strengthen the validation of the symbolic regression. In the revised manuscript we will report the mean-squared error and Pearson correlation coefficient between the neural-network committor and the symbolic expression on held-out transition-path data. We will also add direct comparisons to a linear model of the four dihedrals and to ablated versions lacking selected cross terms, showing that the full nonlinear coupling is required to match the neural-network accuracy and to reproduce the observed S-shape. revision: yes

  2. Referee: Holdback input randomization subsection: the identification of the four proper dihedrals as the informative coordinates does not include sensitivity checks for inter-dihedral correlations or variability across independent network trainings. Without such analysis, it remains unclear whether the ranking is robust or whether the subsequent symbolic regression embeds spurious couplings that drive the reported S-shape.

    Authors: Holdback randomization was repeated on several independently trained networks, yielding consistent top ranking of the four proper dihedrals. To address the referee's concern we will add, in the revision, both the ranking variability across those runs and the pairwise correlation matrix of the dihedral inputs, together with a brief check that removing the most correlated pair does not alter the identified set or the resulting symbolic form. revision: partial

  3. Referee: Results on the transition path ensemble: the attribution of the S-shape to non-equilibrium dynamics plus mass asymmetry would be strengthened by explicit error bars on the committor values along the pathway and by cross-validation of the neural network (e.g., train/test splits or multiple random seeds). These are absent, weakening the distinction from possible equilibrium features or fitting bias.

    Authors: We will include standard-error bars on the committor values along the reported pathway, obtained by averaging over the full transition-path ensemble. We will also add a cross-validation section reporting performance metrics on independent train/test splits and across multiple random seeds for the neural-network training, confirming that the S-shape is reproducible and not an artifact of a single fit. revision: yes
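The train/test splits promised in the third response can be generated with a few lines. This is a generic k-fold sketch, not the authors' protocol; `k_fold_splits` and its parameters are illustrative.

```python
import random

def k_fold_splits(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for k disjoint test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test

splits = list(k_fold_splits(n_samples=10, k=5, seed=42))
```

One caveat: shooting points drawn from the same sampling chain are likely correlated, so splitting by trajectory or by chain rather than by individual frame may give a more honest test error.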

Circularity Check

0 steps flagged

No circularity: committor learned from unbiased MD and distilled via symbolic regression without reduction to inputs by construction

full rationale

The derivation proceeds from unbiased two-way shooting MD trajectories to a neural network trained on the logit of the committor, followed by holdback input randomization to rank coordinates and symbolic regression to obtain an analytical expression. None of these steps reduce by the paper's own equations to a self-definition, a fitted parameter renamed as prediction, or a load-bearing self-citation chain; the S-shaped pathway and its attribution to non-equilibrium dynamics are outputs of the data-driven model compared against an independently computed MFEP. The workflow is self-contained against external benchmarks (MD data and path ensembles) with no invoked uniqueness theorems or smuggled ansatzes.

Axiom & Free-Parameter Ledger

1 free parameter · 2 axioms · 0 invented entities

The central claim rests on standard assumptions of transition path sampling and neural network approximation of the committor; no new physical entities are introduced.

free parameters (1)
  • neural network parameters
    Weights and biases of the network trained to approximate the committor from trajectory data.
axioms (2)
  • domain assumption: Unbiased molecular dynamics trajectories generated by two-way shooting adequately sample the transition region for committor learning.
    Invoked when stating that the committor is learned from such trajectories.
  • domain assumption: Holdback input randomization correctly ranks the importance of input coordinates for the committor.
    Used to identify the four proper dihedrals as informative.
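The first assumption can be made concrete with a toy model. The sketch below estimates a committor by shooting from a fixed point in a one-dimensional double well under overdamped Langevin dynamics. Everything in it is an assumption for illustration: the potential, the temperature, the state boundaries at ±0.8, and the use of two independent trajectories per trial in place of forward and backward propagation with reversed momenta.

```python
import math
import random

def force(x):
    """Force for the toy double well V(x) = (x^2 - 1)^2."""
    return -4.0 * x * (x * x - 1.0)

def run_until_state(x, rng, dt=1e-3, kT=0.3, max_steps=200_000):
    """Overdamped Langevin dynamics until state A (x <= -0.8) or B (x >= 0.8)."""
    noise = math.sqrt(2.0 * kT * dt)
    for _ in range(max_steps):
        if x <= -0.8:
            return "A"
        if x >= 0.8:
            return "B"
        x += force(x) * dt + noise * rng.gauss(0.0, 1.0)
    return None  # did not commit within max_steps

def two_way_shots(x0, n_shots, seed=0):
    """Shoot two independent trajectories per trial from x0.

    Returns (committor estimate, fraction of trials connecting A and B).
    """
    rng = random.Random(seed)
    n_b = n_ends = n_reactive = 0
    for _ in range(n_shots):
        ends = (run_until_state(x0, rng), run_until_state(x0, rng))
        n_b += ends.count("B")
        n_ends += sum(e is not None for e in ends)
        n_reactive += set(ends) == {"A", "B"}
    return n_b / n_ends, n_reactive / n_shots

p_top, reactive_top = two_way_shots(0.0, n_shots=100)     # barrier top
p_side, reactive_side = two_way_shots(-0.5, n_shots=100)  # reactant side
```

Shots from the barrier top should commit to B roughly half the time, while shots from the reactant side rarely do. Whether sampling of this kind covers the transition region adequately in the real 0.13 ps events is exactly what the axiom takes for granted.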

pith-pipeline@v0.9.0 · 5628 in / 1362 out tokens · 62997 ms · 2026-05-07T17:45:41.011865+00:00 · methodology

