
arxiv: 2606.06866 · v1 · pith:PWTVUNYWnew · submitted 2026-06-05 · 💻 cs.LG · nucl-th

Product units in gated recurrent units improve nuclear-mass prediction

Pith reviewed 2026-06-27 22:29 UTC · model grok-4.3

classification 💻 cs.LG nucl-th
keywords nuclear mass prediction · gated recurrent units · complex-valued networks · product units · machine learning · atomic nuclei · extrapolation · sequence modeling

The pith

A complex additive-multiplicative product-unit GRU achieves the lowest errors in nuclear mass prediction.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper integrates multiplicative product-unit transformations and complex-domain operations into gated recurrent units to model sequences of atomic nuclear masses. It demonstrates that the resulting AM-PU-GRU architecture produces lower root-mean-square errors than real-valued GRU baselines and other machine-learning methods on both interpolation within the AME2016 data and temporal extrapolation to AME2020 values. A sympathetic reader would care because more accurate mass predictions help map poorly known regions of the nuclear chart and work alongside existing theoretical models such as WS4 and SEMF. The approach treats mass values as ordered sequences whose long-range dependencies benefit from joint amplitude and phase dynamics.

Core claim

By integrating multiplicative interactions and product-unit transformations within recurrent units and performing computations in the complex domain to jointly capture amplitude and phase dynamics, the complex additive-multiplicative product-unit gated recurrent unit (AM-PU-GRU) model consistently achieves the lowest prediction errors, with an interpolation RMSE of 0.227 ± 0.004 MeV and an extrapolation RMSE of 0.179 ± 0.015 MeV on tasks based on the atomic mass evaluation (AME2016 and AME2020). These results surpass other state-of-the-art machine learning models, outperform the real-valued GRU baseline and product-unit ablation variants, and remain robust to different theoretical priors, including WS4 and SEMF.

What carries the argument

The complex additive-multiplicative product-unit gated recurrent unit (AM-PU-GRU), which embeds product-unit transformations and additive-multiplicative interactions inside GRU cells operating in the complex domain to model long-range dependencies in nuclear mass sequences.
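To make the term "product unit" concrete, here is a minimal sketch of the classical product unit of Durbin and Rumelhart, evaluated through the complex logarithm. It is an illustration only: the function name `product_unit` is hypothetical, and the page does not give the exact form the paper embeds in the AM-PU-GRU cell, so this should not be read as the authors' implementation.

```python
import cmath


def product_unit(inputs, weights):
    """Classical product unit: prod_i x_i ** w_i.

    Computed as exp(sum_i w_i * log(x_i)). The complex logarithm keeps the
    unit defined for negative inputs, where a real-valued log would fail.
    Inputs must be non-zero (log(0) is undefined).
    """
    log_sum = sum(w * cmath.log(x) for x, w in zip(inputs, weights))
    return cmath.exp(log_sum)


# Positive inputs: behaves like an ordinary weighted product, 2**2 * 3**1.
positive = product_unit([2.0, 3.0], [2.0, 1.0])

# Negative input with a fractional exponent: the result is complex, so it
# carries both an amplitude (modulus) and a phase (argument).
negative = product_unit([-1.0], [0.5])

print(abs(positive), abs(negative), cmath.phase(negative))
```

The second call shows why a complex-valued formulation is a natural fit for product units: a fractional power of a negative number has no real value, but it has a well-defined modulus and phase.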

If this is right

  • The AM-PU-GRU establishes complex-valued product-unit recurrent networks as a new benchmark for sequence-based nuclear-mass prediction.
  • The model outperforms real-valued GRU baselines and product-unit ablation variants on the reported tasks.
  • Prediction accuracy remains robust when different theoretical priors such as WS4 and SEMF are supplied as input features.
  • The approach can complement theoretical nuclear models when exploring regions of the nuclear chart with limited experimental data.
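As an illustration of what a "theoretical prior" such as SEMF provides, the sketch below evaluates the semi-empirical mass formula (Weizsäcker) for the binding energy of a nucleus. The coefficients are one common textbook parameter set and are an assumption here; they are not taken from the paper, and the function name `semf_binding_energy` is hypothetical.

```python
def semf_binding_energy(Z, N):
    """Semi-empirical mass formula: binding energy in MeV for Z protons, N neutrons.

    Coefficients (MeV) are an assumed textbook set, not the paper's values.
    """
    A = Z + N
    a_v, a_s, a_c, a_a, a_p = 15.75, 17.8, 0.711, 23.7, 11.18

    volume = a_v * A
    surface = a_s * A ** (2.0 / 3.0)
    coulomb = a_c * Z * (Z - 1) / A ** (1.0 / 3.0)
    asymmetry = a_a * (A - 2 * Z) ** 2 / A

    # Pairing term: positive for even-even, negative for odd-odd, zero otherwise.
    if Z % 2 == 0 and N % 2 == 0:
        pairing = a_p / A ** 0.5
    elif Z % 2 == 1 and N % 2 == 1:
        pairing = -a_p / A ** 0.5
    else:
        pairing = 0.0

    return volume - surface - coulomb - asymmetry + pairing


# Example: iron-56 (Z=26, N=30).
print(round(semf_binding_energy(26, 30), 1))
```

A learned model can take such a value as an extra input or as a baseline to correct; which of the two the paper does is not specified on this page.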

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same multiplicative complex-domain structure might transfer to sequence modeling of other nuclear observables such as radii or binding energies.
  • Further tests on additional mass-evaluation releases or on different extrapolation time horizons would strengthen the temporal-generalization result.
  • The balance between additive and multiplicative pathways inside the recurrent cell could be tuned as a hyperparameter in related physics sequence tasks.

Load-bearing premise

Nuclear mass values can be usefully represented as ordered sequences whose long-range dependencies are better captured by product-unit multiplicative interactions in the complex domain than by standard real-valued GRU operations.

What would settle it

An experiment in which any other recurrent or feed-forward model, trained and evaluated on identical AME2016-to-AME2020 splits, reports a lower interpolation RMSE than 0.227 MeV or a lower extrapolation RMSE than 0.179 MeV would falsify the central performance claim.
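For reference, the quantity being compared is the root-mean-square error between predicted and measured masses, reported as a mean with a spread. The sketch below shows one way such numbers could be computed. The helper names are hypothetical, the inputs are toy placeholders rather than nuclear data, and the assumption that the ± value is a standard deviation over repeated runs is not confirmed by the page.

```python
import math
import statistics


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences (same units, e.g. MeV)."""
    if len(predicted) != len(measured):
        raise ValueError("predicted and measured must have the same length")
    squared = [(p - m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def summarize_runs(rmse_per_run):
    """Mean and sample standard deviation of RMSE over independent training runs."""
    return statistics.mean(rmse_per_run), statistics.stdev(rmse_per_run)


# Toy placeholder values, not real masses or real results.
toy_error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
toy_mean, toy_std = summarize_runs([0.2, 0.4])
print(toy_error, toy_mean, toy_std)
```

A competing model would need to beat the reported means on identical splits, and the comparison is only meaningful if its spread is computed the same way.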

Figures

Figures reproduced from arXiv: 2606.06866 by Babette Dellen, John W. Clark, Paulo S.A. Freitas, Ziyuan Li.

Figure 1. Internal structure of a standard GRU cell. The diagram is consistent with the above equations: the input x_t and previous hidden state h_{t-1} are jointly used to compute the reset gate r_t and update gate z_t. The reset gate r_t controls how much of h_{t-1} contributes to the candidate activation.
Figure 2. Computational graph of the MI-PU-GRU cell.
Figure 3. Computational graph of the AM-PU-GRU cell.
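The standard GRU update described in the Figure 1 caption can be sketched for a hidden state of size one as follows. This is the textbook real-valued GRU, not the MI-PU-GRU or AM-PU-GRU variant; the parameter names in `params` are hypothetical, and the update-gate convention (z_t weighting the candidate rather than the previous state) is an assumption, since both conventions appear in the literature.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def gru_step(x_t, h_prev, params):
    """One step of a scalar (hidden size 1) standard GRU cell."""
    # Reset and update gates are both computed from x_t and h_{t-1}.
    r_t = sigmoid(params["W_r"] * x_t + params["U_r"] * h_prev + params["b_r"])
    z_t = sigmoid(params["W_z"] * x_t + params["U_z"] * h_prev + params["b_z"])

    # The reset gate scales how much of h_{t-1} enters the candidate activation.
    h_cand = math.tanh(
        params["W_h"] * x_t + params["U_h"] * (r_t * h_prev) + params["b_h"]
    )

    # The update gate interpolates between the old state and the candidate.
    return (1.0 - z_t) * h_prev + z_t * h_cand


# With all parameters zero, both gates equal 0.5 and the candidate is 0,
# so the new state is half of the previous one.
zero_params = {k: 0.0 for k in
               ("W_r", "U_r", "b_r", "W_z", "U_z", "b_z", "W_h", "U_h", "b_h")}
print(gru_step(x_t=1.0, h_prev=0.8, params=zero_params))
```

Every operation in this cell is additive inside the nonlinearities; the variants in Figures 2 and 3 are described as adding multiplicative product-unit pathways to it.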
Original abstract

The prediction of masses of atomic nuclei using machine learning can complement theoretical models and advance the exploration of poorly known domains of the nuclear chart. We propose a machine learning technique based on gated recurrent units (GRU), which have demonstrated competitive performance in nuclear-mass prediction by exploiting long-term dependencies. By integrating multiplicative interactions and product-unit transformations within recurrent units, we report significant improvements in nuclear-mass prediction. Computations are performed in the complex domain to jointly capture amplitude and phase dynamics. For interpolation and temporal-extrapolation tasks based on the atomic mass evaluation (AME2016 and AME2020), the complex additive-multiplicative product-unit gated recurrent unit (AM-PU-GRU) model consistently achieves the lowest prediction errors, with an interpolation RMSE of 0.227 ± 0.004 MeV and an extrapolation RMSE of 0.179 ± 0.015 MeV. These results surpass other state-of-the-art machine learning models and also outperform the real-valued GRU baseline and product-unit ablation variants, while remaining robust to different theoretical priors, including WS4 and SEMF. Our findings establish complex-valued product-unit recurrent networks as a new benchmark for sequence-based nuclear-mass prediction.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the complex additive-multiplicative product-unit gated recurrent unit (AM-PU-GRU) for nuclear mass prediction. It integrates multiplicative product-unit transformations and complex-domain computations into GRUs to capture amplitude and phase dynamics in sequences of atomic masses from the AME2016 and AME2020 datasets. The central claim is that AM-PU-GRU achieves the lowest prediction errors on interpolation (RMSE 0.227 ± 0.004 MeV) and temporal-extrapolation (RMSE 0.179 ± 0.015 MeV) tasks, outperforming real-valued GRUs, product-unit ablations, and other ML models, while remaining robust to priors such as WS4 and SEMF.

Significance. If the reported performance gains are verified under controlled experimental conditions, this work would provide a new benchmark for sequence modeling in nuclear physics, demonstrating the value of complex-valued multiplicative interactions for long-range dependencies in nuclear data. The concrete RMSE values with uncertainties and comparisons to baselines and theoretical models strengthen the empirical contribution.

major comments (2)
  1. [Abstract] The central performance claims report an extrapolation RMSE (0.179 ± 0.015 MeV) lower than the interpolation RMSE (0.227 ± 0.004 MeV), which contradicts the typical expectation that extrapolation is more challenging; this requires explicit discussion of test-set characteristics, such as number of nuclei, mass range, or potential data overlap, to substantiate the superiority claim.
  2. [Results, and abstract] The reported RMSE values with uncertainties are presented without details on data splits for the interpolation vs. temporal-extrapolation tasks, training protocol, hyperparameter search, ablation controls, or statistical testing; these omissions are load-bearing because they prevent verification of the claimed outperformance over the real-valued GRU baseline and other models.
minor comments (1)
  1. [Abstract] A brief reference to the specific equations defining the product-unit integration within the GRU update and reset gates would clarify the architectural modification.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript introducing the AM-PU-GRU model. We address each major comment below and will revise the manuscript to incorporate the requested clarifications and details.

point-by-point responses
  1. Referee: [Abstract] The central performance claims report an extrapolation RMSE (0.179 ± 0.015 MeV) lower than the interpolation RMSE (0.227 ± 0.004 MeV), which contradicts the typical expectation that extrapolation is more challenging; this requires explicit discussion of test-set characteristics, such as number of nuclei, mass range, or potential data overlap, to substantiate the superiority claim.

    Authors: We agree that the lower extrapolation RMSE requires explicit justification, as it deviates from typical expectations. This outcome stems from the distinct characteristics of the two tasks: the temporal-extrapolation test set consists of nuclei newly measured in AME2020 (absent from AME2016), which occupy specific regions of the nuclear chart where the complex multiplicative interactions in AM-PU-GRU capture long-range dependencies effectively. The interpolation task, by contrast, uses held-out nuclei within the AME2016 distribution that may include more challenging or sparsely sampled cases. We will revise the abstract and results sections to include a dedicated discussion of test-set characteristics, specifying the number of nuclei, mass ranges (A and Z), and any data overlap or distributional differences between the sets. revision: yes

  2. Referee: [Results, and abstract] The reported RMSE values with uncertainties are presented without details on data splits for the interpolation vs. temporal-extrapolation tasks, training protocol, hyperparameter search, ablation controls, or statistical testing; these omissions are load-bearing because they prevent verification of the claimed outperformance over the real-valued GRU baseline and other models.

    Authors: We acknowledge that the current version omits key experimental details needed for full verification and reproducibility. In the revised manuscript, we will expand the methods and results sections to provide: precise definitions of the data splits (including exact counts of training/validation/test nuclei for each task and how temporal extrapolation is constructed from AME2016 to AME2020), the complete training protocol (optimizer, batch size, epochs, regularization), the hyperparameter search procedure (grid or random search ranges and selection criteria), full ablation controls with all variants and their configurations, and the statistical methods (e.g., how uncertainties are computed via multiple runs or bootstrapping, and any significance testing against baselines). These additions will directly support verification of the outperformance claims. revision: yes
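The temporal-extrapolation split described in the simulated rebuttal (test on nuclei present in AME2020 but absent from AME2016) can be sketched as a set difference over nuclide keys. This is an assumption about how such a split might be built, not the paper's documented procedure. The function name `temporal_split`, the `(Z, N)` keying, and the toy tables are hypothetical, and the numbers are placeholders rather than evaluated masses.

```python
def temporal_split(older_table, newer_table):
    """Train on every nuclide in the older release; test on nuclides that
    appear only in the newer release.

    Both tables map a (Z, N) tuple to a mass value.
    """
    train = dict(older_table)
    test = {
        nuclide: mass
        for nuclide, mass in newer_table.items()
        if nuclide not in older_table
    }
    return train, test


# Toy placeholder tables (values are NOT real mass data).
older_toy = {(8, 8): 1.0, (20, 20): 2.0}
newer_toy = {(8, 8): 1.1, (20, 20): 2.0, (28, 50): 3.0}

train, test = temporal_split(older_toy, newer_toy)
print(sorted(train), sorted(test))
```

Building the split this way guarantees that no test nuclide was seen in training, which addresses the referee's data-overlap question. It also shows why the two test sets can differ in size and in which regions of the nuclear chart they cover, so their RMSE values are not directly comparable.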

Circularity Check

0 steps flagged

No circularity: empirical ML results on public datasets

full rationale

The paper reports an empirical ML architecture (AM-PU-GRU) trained on AME2016/AME2020 nuclear mass data and evaluated via RMSE on interpolation and temporal-extrapolation splits. No derivation, uniqueness theorem, or ansatz is invoked that reduces the reported performance numbers to fitted inputs by construction. All load-bearing claims are direct numerical outcomes on external public benchmarks; no self-citation chain or self-definitional step appears in the provided text.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

Abstract-only review yields no explicit free parameters, invented physical entities, or non-standard axioms beyond the implicit domain assumption that nuclear masses form sequences amenable to recurrent modeling. The model architecture itself constitutes the primary addition.

axioms (1)
  • domain assumption: Nuclear masses can be represented as ordered sequences whose long-term dependencies are learnable by recurrent units.
    Stated in the abstract's description of GRU application to AME data.


