Recognition: 2 theorem links
FePySR: A Neural Feature Extraction Framework for Efficient and Scalable Symbolic Regression
Pith reviewed 2026-05-14 20:46 UTC · model grok-4.3
The pith
A neural network first extracts candidate features to shrink the search space for symbolic regression, recovering more complex equations than direct search.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
By first constraining observational data to a set of valid candidate expressions with a heterogeneous neural network, and then optimizing equation structure inside that refined space with PySR, FePySR recovers 36 of 75 complex synthesized equations and identifies governing equations in 24 of 100 biological ODE tests where PySR recovers none.
What carries the argument
Heterogeneous neural network that extracts a constrained set of candidate expressions to reduce the symbolic regression search space before PySR optimization.
If this is right
- Higher equation recovery rates on five standard benchmarks than state-of-the-art methods.
- Substantially smaller mean squared errors on the unrecovered complex equations.
- Reduced computation time relative to PySR alone.
- Consistent recovery performance under varying numbers of selected top features and increasing noise levels.
- Successful identification of governing equations for biological ODE systems where direct symbolic search fails.
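The two-stage pipeline described above can be mimicked in a few lines. This is a hedged stand-in, not the paper's method: the heterogeneous network is replaced by correlation ranking over a hand-written candidate library, PySR's structural search by an ordinary least-squares fit over the selected features, and the target equation, library contents, and top-k value are all invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented target: y = 3*sin(x1) + 0.5*x2^2 (not from the paper).
X = rng.uniform(-2.0, 2.0, size=(500, 2))
y = 3.0 * np.sin(X[:, 0]) + 0.5 * X[:, 1] ** 2

# Stage-1 stand-in: a fixed candidate library ranked by absolute
# correlation with the target (the paper uses a heterogeneous NN here).
library = {
    "x1": X[:, 0], "x2": X[:, 1],
    "sin(x1)": np.sin(X[:, 0]), "sin(x2)": np.sin(X[:, 1]),
    "x1^2": X[:, 0] ** 2, "x2^2": X[:, 1] ** 2,
    "exp(x1)": np.exp(X[:, 0]),
}
score = {name: abs(np.corrcoef(f, y)[0, 1]) for name, f in library.items()}
top_k = sorted(score, key=score.get, reverse=True)[:4]

# Stage-2 stand-in: "structural search" reduced to a linear fit over the
# shrunken feature space (the paper runs PySR here).
A = np.column_stack([library[name] for name in top_k])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
mse = float(np.mean((A @ coef - y) ** 2))

# Both true modules survive the ranking, so the reduced space still
# contains an exact representation of the target.
assert "sin(x1)" in top_k and "x2^2" in top_k
assert mse < 1e-10
```

Note that redundant correlated features (x1, exp(x1)) also survive the ranking here, which is why selection quality, and not just downstream MSE, determines whether the refined search space stays useful.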
Where Pith is reading between the lines
- The neural extraction stage could be paired with symbolic regression solvers other than PySR to improve their scalability on complex problems.
- The approach may extend to real experimental data from physics or chemistry domains beyond the synthesized benchmarks and biological ODEs tested.
- Further tuning of the number of top features extracted could optimize performance for specific scientific domains.
Load-bearing premise
Observational data can be reliably constrained by the neural network to a useful set of valid candidate expressions without systematically excluding critical nonlinear modules.
What would settle it
On an independent collection of 75 highly complex equations, FePySR would recover no more equations and would show no reduction in mean squared error or runtime compared with PySR.
Original abstract
A fundamental challenge in symbolic regression (SR) is efficiently recovering complex mathematical expressions from observational data. Although this problem is NP-hard, many expressions of practical interest decompose naturally into combinations of nonlinear feature modules, concentrating structural complexity into a small number of reusable components. Here, we introduce FePySR, a two-stage framework that reduces the SR search space by extracting valid features prior to equation search. FePySR first employs a heterogeneous neural network to constrain observational data to a set of candidate expressions, then performs structural optimization within this refined expression space using PySR. Across five standard benchmarks, FePySR outperforms state-of-the-art methods by achieving higher equation recovery rates. On a set of 75 highly complex synthesized equations, FePySR recovers 36 equations, while producing substantially smaller mean squared errors on the remaining unrecovered cases, with reduced computation time compared to PySR. FePySR's first stage also maintains consistent performance under varying numbers of selected top features and increasing levels of noise in the observational data. Applied to ordinary differential equations governing biological systems, FePySR successfully identifies governing equations in 24 out of 100 tests where PySR recovers none. Taken together, FePySR is a generalizable framework that can enhance the SR solvers, enabling the efficient and reliable recovery of symbolic expressions across scientific domains.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces FePySR, a two-stage framework for symbolic regression. A heterogeneous neural network first extracts a set of candidate nonlinear features from observational data; PySR then performs structural search within this reduced expression space. The paper claims higher equation recovery rates than state-of-the-art methods on five standard benchmarks, recovery of 36 out of 75 highly complex synthesized equations (with lower MSE on the remainder and reduced runtime versus PySR), robustness to noise and feature count, and recovery of governing ODEs in 24/100 biological-system tests where PySR recovers none.
Significance. If the empirical claims are reproducible, FePySR would offer a practical route to scaling symbolic regression to more complex expressions by using neural feature extraction to prune the search space, with demonstrated gains on both synthetic benchmarks and real ODE identification tasks.
major comments (3)
- [Abstract and §4, Experiments] The headline recovery numbers (36/75 complex equations, 24/100 ODE cases) and MSE/runtime improvements are stated without any description of the heterogeneous NN architecture, training procedure, loss function, hyperparameter selection, or how the top-k features are converted into the refined grammar for PySR. These details are load-bearing for assessing whether the reported gains are supported by the data.
- [§3.2, Feature Extraction] The central assumption that the NN stage produces a candidate pool containing every necessary nonlinear module (e.g., multiplicative or compositional terms such as x*sin(y*z)) is not tested. If the network's heterogeneity is limited to additive or low-order combinations, the subsequent PySR search operates on an incomplete grammar; the reported MSE improvement on unrecovered cases does not rule out systematic exclusion of critical terms.
- [§4.3, Biological ODE experiments] The claim that FePySR recovers governing equations in 24/100 tests while PySR recovers none requires the exact definition of the 100 test cases, the noise model, the integration method used to generate data, and the precise success criterion (exact symbolic match versus numerical tolerance). Without these, the 24/100 figure cannot be interpreted.
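The incompleteness risk raised in the feature-extraction comment can be made concrete with a small numerical check. The pools and target below are invented for illustration: if the candidate pool carries only univariate features, no weighting of them can match a compositional target like x1*sin(x2), whereas adding the one missing module drives the fit error to machine precision.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented compositional target: y = x1 * sin(x2).
X = rng.uniform(-3.0, 3.0, size=(1000, 2))
y = X[:, 0] * np.sin(X[:, 1])

# A pool limited to univariate ("additive") modules, versus the same pool
# extended with the compositional module the target actually needs.
univariate = [X[:, 0], X[:, 1], np.sin(X[:, 0]), np.sin(X[:, 1]),
              X[:, 0] ** 2, X[:, 1] ** 2]
compositional = univariate + [X[:, 0] * np.sin(X[:, 1])]

def best_linear_mse(features):
    """MSE of the best linear combination of the given features."""
    A = np.column_stack(features)
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.mean((A @ coef - y) ** 2))

mse_uni = best_linear_mse(univariate)      # stuck near Var(y): module missing
mse_comp = best_linear_mse(compositional)  # essentially exact: module present
assert mse_uni > 0.1
assert mse_comp < 1e-16
```

This is exactly why a low-but-nonzero MSE on unrecovered cases cannot, on its own, rule out systematic exclusion of critical terms.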
minor comments (2)
- [Figure 2 and §4.1] The caption and text should explicitly state the number of independent runs, random seeds, and whether error bars represent standard deviation or standard error.
- [§3.1] Notation: The manuscript uses “top features” without defining the selection threshold or ranking criterion; a short paragraph in §3.1 would remove ambiguity.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and will revise the manuscript accordingly to improve reproducibility and address the concerns raised.
Point-by-point responses
-
Referee: [Abstract and §4, Experiments] The headline recovery numbers (36/75 complex equations, 24/100 ODE cases) and MSE/runtime improvements are stated without any description of the heterogeneous NN architecture, training procedure, loss function, hyperparameter selection, or how the top-k features are converted into the refined grammar for PySR. These details are load-bearing for assessing whether the reported gains are supported by the data.
Authors: We agree these details are essential for reproducibility. In the revised manuscript we will expand Section 3.1 with a full description of the heterogeneous NN architecture (layer types and connectivity), training procedure, loss function, hyperparameter selection, and the exact procedure for converting the top-k extracted features into the refined grammar passed to PySR. revision: yes
-
Referee: [§3.2, Feature Extraction] The central assumption that the NN stage produces a candidate pool containing every necessary nonlinear module (e.g., multiplicative or compositional terms such as x*sin(y*z)) is not tested. If the network's heterogeneity is limited to additive or low-order combinations, the subsequent PySR search operates on an incomplete grammar; the reported MSE improvement on unrecovered cases does not rule out systematic exclusion of critical terms.
Authors: The heterogeneous architecture is explicitly constructed with dedicated branches for multiplicative, compositional, and higher-order nonlinearities. Nevertheless, we acknowledge that an explicit verification is valuable. We will add an analysis (new figure or table in §3.2) showing the distribution of feature types recovered on the benchmarks, confirming that multiplicative and compositional terms are present in the candidate pool. revision: partial
-
Referee: [§4.3, Biological ODE experiments] The claim that FePySR recovers governing equations in 24/100 tests while PySR recovers none requires the exact definition of the 100 test cases, the noise model, the integration method used to generate data, and the precise success criterion (exact symbolic match versus numerical tolerance). Without these, the 24/100 figure cannot be interpreted.
Authors: We agree that these experimental details must be provided. In the revised §4.3 we will specify the exact 100 test cases (including source and generation protocol), the noise model, the numerical integration method, and the precise success criterion (symbolic equivalence within a stated numerical tolerance). revision: yes
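The success criterion requested above (symbolic equivalence within a stated numerical tolerance) can be given an operational shape. The right-hand sides below are invented, not taken from the paper; the sketch only shows one plausible form of such a criterion: a candidate counts as recovered when it agrees with the ground truth, within a relative tolerance, on randomly sampled states.

```python
import random

random.seed(0)

# Invented ground-truth ODE right-hand side and an algebraically
# rearranged candidate; neither comes from the paper.
def truth(x, y):
    return 0.5 * x - 0.02 * x * y

def candidate(x, y):
    return x * (0.5 - 0.02 * y)  # same function, different written form

def recovered(f, g, n=200, tol=1e-9):
    """Numerical stand-in for symbolic equivalence: agreement within a
    relative tolerance on n randomly sampled states."""
    for _ in range(n):
        x, y = random.uniform(0.0, 10.0), random.uniform(0.0, 10.0)
        if abs(f(x, y) - g(x, y)) > tol * (1.0 + abs(g(x, y))):
            return False
    return True

assert recovered(candidate, truth)
assert not recovered(lambda x, y: 0.5 * x, truth)  # missing term is rejected
```

A purely numerical check of this kind is looser than exact symbolic matching, which is precisely why the revised §4.3 needs to say which of the two the 24/100 figure uses.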
Circularity Check
No circularity in derivation chain
full rationale
The paper describes an empirical two-stage engineering framework: a heterogeneous neural network extracts candidate nonlinear features from data, after which PySR performs symbolic search in the reduced space. No equations, fitted parameters, or predictions are presented that reduce to their own inputs by construction. No self-citation chains, uniqueness theorems, or ansatzes are invoked as load-bearing steps. Performance numbers (recovery rates, MSE, runtime) are reported from external benchmarks and are not statistically forced by the method's own definitions. The work is therefore self-contained against external benchmarks with no detectable circular reduction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Expressions of practical interest decompose naturally into combinations of nonlinear feature modules, concentrating structural complexity into a small number of reusable components.
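The single axiom above is the decomposition assumption, and it can be illustrated directly. The module library and target equation here are invented: a superficially complex expression collapses into three reusable unary modules plus arithmetic, which is the structure FePySR's first stage is meant to exploit.

```python
import math

# Invented reusable nonlinear modules.
modules = {
    "sq":  lambda v: v * v,
    "neg": lambda v: -v,
    "sin": math.sin,
    "exp": math.exp,
}

def flat(x1, x2, x3):
    # The target written out "flat": exp(-x1^2) * sin(x2) + x3.
    return math.exp(-(x1 ** 2)) * math.sin(x2) + x3

def modular(x1, x2, x3):
    # The same target as a composition of library modules:
    # exp(neg(sq(x1))) * sin(x2) + x3.
    m = modules
    return m["exp"](m["neg"](m["sq"](x1))) * m["sin"](x2) + x3

# The two forms agree everywhere sampled, so the expression's complexity
# is concentrated in a handful of reusable modules, not spread over the tree.
for p in [(0.3, 1.1, -0.5), (1.7, -2.0, 0.25), (-0.9, 0.4, 2.0)]:
    assert abs(flat(*p) - modular(*p)) < 1e-12
```

If the assumption fails for a target, no ranking of extracted modules can make the reduced search space complete, which is the load-bearing premise the review flags.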
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "heterogeneous neural network ... HAU ... library of candidates {(·)², sin(·), cos(·), exp(·), +, ×} ... L = L2 + Lsparse + Lcontrast"
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "two-stage framework that reduces the SR search space by extracting valid features prior to equation search"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] Marco Virgolin and Solon P. Pissis. Symbolic regression is NP-hard. Transactions on Machine Learning Research, 2022.
- [2] Jinglu Song, Qiang Lu, Bozhou Tian, Jingwen Zhang, Jake Luo, and Zhiguang Wang. Prove symbolic regression is NP-hard by symbol graph. arXiv preprint arXiv:2404.13820, 2024.
- [3] John R. Koza. Genetic Programming II: Automatic Discovery of Reusable Programs. MIT Press, 1994. ISBN 978-0-262-11189-8.
- [4] Michael Schmidt and Hod Lipson. Distilling free-form natural laws from experimental data. Science, 324(5923):81–85, 2009.
- [5] William G. La Cava, Lee Spector, and Kourosh Danai. Epsilon-lexicase selection for regression. In Genetic and Evolutionary Computation Conference, pages 741–748, 2016.
- [6] Marco Virgolin, Tanja Alderliesten, and Peter A. N. Bosman. Surrogate modeling for genetic programming by evolving model complexity. In Genetic Programming Theory and Practice XIV, pages 217–236, 2017.
- [7] Miles Cranmer. Interpretable machine learning for science with PySR and SymbolicRegression.jl. arXiv preprint arXiv:2305.01582, 2023.
- [8] Arya Grayeli, Atharva Sehgal, Omar Costilla-Reyes, Miles D. Cranmer, and Swarat Chaudhuri. Symbolic regression with a learned concept library. In Advances in Neural Information Processing Systems, 2024.
- [9] Brenden K. Petersen, Mikel Landajuela, T. Nathan Mundhenk, Cláudio Prata Santiago, Sookyung Kim, and Joanne Taery Kim. Deep symbolic regression: Recovering mathematical expressions from data via risk-seeking policy gradients. In International Conference on Learning Representations, 2021.
- [10] Mikel Landajuela, Brenden K. Petersen, Soo K. Kim, Claudio P. Santiago, Ruben Glatt, T. Nathan Mundhenk, Jacob F. Pettit, and Daniel M. Faissol. Improving exploration in policy gradient search: Application to symbolic optimization. In Mathematical Reasoning in General Artificial Intelligence Workshop, 2021.
- [11] Terrell Mundhenk, Mikel Landajuela, Ruben Glatt, Claudio P. Santiago, Daniel M. Faissol, and Brenden K. Petersen. Symbolic regression via neural-guided genetic programming population seeding. In Advances in Neural Information Processing Systems, volume 34, pages 24912–24923, 2021.
- [12] Mikel Landajuela, Chak Shing Lee, Jiachen Yang, Ruben Glatt, Cláudio P. Santiago, Ignacio Aravena, Terrell Nathan Mundhenk, Garrett Mulcahy, and Brenden K. Petersen. A unified framework for deep symbolic regression. In Advances in Neural Information Processing Systems, 2022.
- [13] Brenden K. Petersen, Claudio Santiago, and Mikel Landajuela. Incorporating domain knowledge into neural-guided search via in situ priors and constraints. In International Conference on Machine Learning. PMLR, 2021.
- [14] Jacob F. Pettit, Chak Shing Lee, Jiachen Yang, Alex Ho, Daniel M. Faissol, Brenden K. Petersen, and Mikel Landajuela. DisCo-DSO: Coupling discrete and continuous optimization for efficient generative design in hybrid spaces. In AAAI Conference on Artificial Intelligence, pages 27117–27125, 2025.
- [15] Hengzhe Zhang and Aimin Zhou. RL-GEP: Symbolic regression via gene expression programming and reinforcement learning. In International Joint Conference on Neural Networks, pages 1–8, 2021.
- [16] Luca Biggio, Tommaso Bendinelli, Alexander Neitz, Aurélien Lucchi, and Giambattista Parascandolo. Neural symbolic regression that scales. In International Conference on Machine Learning, pages 936–945, 2021.
- [17] Juho Lee, Yoonho Lee, Jungtaek Kim, Adam R. Kosiorek, Seungjin Choi, and Yee Whye Teh. Set Transformer: A framework for attention-based permutation-invariant neural networks. In International Conference on Machine Learning, pages 3744–3753, 2019.
- [18] Wenqiang Li, Weijun Li, Linjun Sun, Min Wu, Lina Yu, Jingyi Liu, Yanjie Li, and Songsong Tian. Transformer-based model for symbolic regression via joint supervised learning. In International Conference on Learning Representations, 2023.
- [19] Mojtaba Valipour, Bowen You, Maysum Panju, and Ali Ghodsi. SymbolicGPT: A generative transformer model for symbolic regression. arXiv preprint arXiv:2106.14131, 2021.
- [20] Guillaume Lample and François Charton. Deep learning for symbolic mathematics. In International Conference on Learning Representations, 2020.
- [21] Pierre-Alexandre Kamienny, Stéphane d'Ascoli, Guillaume Lample, and François Charton. End-to-end symbolic regression with transformers. In Advances in Neural Information Processing Systems, 2022.
- [22] Yuan Tian, Wenqi Zhou, Michele Viscione, Hao Dong, David S. Kammer, and Olga Fink. Interactive symbolic regression with co-design mechanism through offline reinforcement learning. Nature Communications, 16(1):3930, 2025.
- [23] Steven L. Brunton, Joshua L. Proctor, and J. Nathan Kutz. Discovering governing equations from data by sparse identification of nonlinear dynamical systems. Proceedings of the National Academy of Sciences, 113(15):3932–3937, 2016.
- [24] Niall M. Mangan, Steven L. Brunton, Joshua L. Proctor, and J. Nathan Kutz. Inferring biological networks by sparse identification of nonlinear dynamics. IEEE Transactions on Molecular, Biological and Multi-Scale Communications, 2(1):52–63, 2016.
- [25] Kadierdan Kaheman, J. Nathan Kutz, and Steven L. Brunton. SINDy-PI: A robust algorithm for parallel implicit sparse identification of nonlinear dynamics. Proceedings of the Royal Society A, 476(2242):20200279, 2020.
- [26] Siva Viknesh, Younes Tatari, Chase Christenson, and Amirhossein Arzani. ADAM-SINDy: An efficient optimization framework for parameterized nonlinear dynamical system identification. arXiv preprint arXiv:2410.16528, 2024.
- [27] Subham S. Sahoo, Christoph H. Lampert, and Georg Martius. Learning equations for extrapolation and control. In International Conference on Machine Learning, pages 4439–4447, 2018.
- [28] Samuel Kim, Peter Y. Lu, Srijon Mukherjee, Michael Gilbert, Li Jing, Vladimir Ceperic, and Marin Soljacic. Integration of neural network-based symbolic regression in deep learning for scientific discovery. IEEE Transactions on Neural Networks and Learning Systems, 32(9):4166–4177, 2021.
- [29] Silviu-Marian Udrescu and Max Tegmark. AI Feynman: A physics-inspired method for symbolic regression. Science Advances, 6(16):eaay2631, 2020.
- [30] Silviu-Marian Udrescu, Andrew K. Tan, Jiahai Feng, Orisvaldo Neto, Tailin Wu, and Max Tegmark. AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity. In Advances in Neural Information Processing Systems, 2020.
- [31] John R. Koza. Genetic Programming: On the Programming of Computers by Means of Natural Selection. MIT Press, 1993.
- [32] Edward Gu, Simon Alford, Omar Costilla-Reyes, Miles Cranmer, and Kevin Ellis. InceptionSR: Recursive symbolic regression for equation synthesis. In AAAI Conference on Artificial Intelligence, 2025.
- [33] Marco Virgolin, Tanja Alderliesten, Arjan Bel, Cees Witteveen, and Peter A. N. Bosman. Symbolic regression and feature construction with GP-GOMEA applied to radiotherapy dose reconstruction of childhood cancer survivors. In Genetic and Evolutionary Computation Conference, pages 1395–1402, 2018.
- [34] Marco Virgolin, Tanja Alderliesten, Cees Witteveen, and Peter A. N. Bosman. Scalable genetic programming by gene-pool optimal mixing and input-space entropy-based building-block learning. In Genetic and Evolutionary Computation Conference, pages 1041–1048, 2017.
- [35] Nitin Bansal, Xiaohan Chen, and Zhangyang Wang. Can we gain more from orthogonality regularizations in training deep networks? In Advances in Neural Information Processing Systems, 2018.
- [36] Nguyen Quang Uy, Nguyen Xuan Hoai, Michael O'Neill, Robert I. McKay, and Edgar Galván López. Semantically-based crossover in genetic programming: Application to real-valued symbolic regression. Genetic Programming and Evolvable Machines, 12(2):91–119, 2011.
- [37] Yanjie Li, Jingyi Liu, Min Wu, Lina Yu, Weijun Li, Xin Ning, Wenqiang Li, Meilan Hao, Yusong Deng, and Shu Wei. MMSR: Symbolic regression is a multi-modal information fusion task. Information Fusion, 114:102681, 2025.
- [38] Krzysztof Krawiec and Tomasz Pawlak. Approximating geometric crossover by semantic backpropagation. In Genetic and Evolutionary Computation Conference, pages 941–948, 2013.
- [39] John J. Tyson, Réka Albert, Albert Goldbeter, Peter Ruoff, and Jill C. Sible. Functional motifs in biochemical reaction networks. Annual Review of Physical Chemistry, 61:219–240, 2010.
- [40] Sungsoo Ahn, Junsu Kim, Hankook Lee, and Jinwoo Shin. Guiding deep molecular optimization with genetic exploration. In Advances in Neural Information Processing Systems, 2020.
- [41] Kalervo Järvelin and Jaana Kekäläinen. Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4):422–446, 2002.
- [42] Patrick A. K. Reinbold, Logan M. Kageorge, Michael F. Schatz, and Roman O. Grigoriev. Robust learning from noisy, incomplete, high-dimensional experimental data via physically constrained symbolic regression. Nature Communications, 12(1):3219, 2021.
- [43] Chenglu Sun, Shuo Shen, Wenzhi Tao, Deyi Xue, and Zixia Zhou. Noise-resilient symbolic regression with dynamic gating reinforcement learning. In Proceedings of the AAAI Conference on Artificial Intelligence, 2025.
discussion (0)