Recognition: 2 theorem links
TransGP: Task-Conditioned Transformer-Guided Genetic Programming for Multitask Dynamic Flexible Job Shop Scheduling
Pith reviewed 2026-05-13 17:13 UTC · model grok-4.3
The pith
A task-conditioned Transformer guides genetic programming to evolve better heuristics for multiple dynamic job shop scheduling tasks.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TransGP integrates a task-conditioned Transformer into the genetic programming loop so that the model both captures the distribution of elite heuristics across tasks and produces new heuristics conditioned on the specific task at hand, thereby directing the evolutionary population toward higher-quality regions of the heuristic space and enabling simultaneous optimization of multiple DFJSS instances.
What carries the argument
Task-conditioned Transformer that learns the distribution of elite GP heuristics and performs conditional generation to bias the evolutionary search toward task-specific promising structures.
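As a rough illustration of this guidance mechanism, one generation of such a loop might look like the sketch below. The `ToyGenerator` class stands in for the task-conditioned Transformer, heuristics are plain strings, and the 20% injection ratio is an assumed setting for illustration, not the paper's actual interface.

```python
import random

# Toy sketch: a task-conditioned generator biases one GP generation.
# "Distribution learning" is reduced to archiving elites per task, and
# "conditional generation" to resampling from that archive.

class ToyGenerator:
    def __init__(self):
        self.memory = {}                      # task_id -> archived elites

    def fit(self, elites, task_id):
        # "Learn the distribution": here, just archive elites per task.
        self.memory.setdefault(task_id, []).extend(elites)

    def sample(self, task_id, rng):
        # "Conditional generation": resample from that task's archive.
        pool = self.memory.get(task_id) or ["PT + WINQ"]
        return rng.choice(pool)

def crossover(a, b, rng):
    return a if rng.random() < 0.5 else b     # placeholder subtree swap

def mutate(h, rng):
    return h + " + SLACK" if rng.random() < 0.1 else h

def next_generation(population, fitness, generator, task_id, rng,
                    inject_ratio=0.2):
    n = len(population)
    n_inject = int(n * inject_ratio)
    ranked = sorted(population, key=fitness)  # lower fitness = better
    generator.fit(ranked[: max(1, n // 10)], task_id)
    children = []
    while len(children) < n - n_inject:
        a, b = rng.sample(ranked[: max(2, n // 2)], 2)
        children.append(mutate(crossover(a, b, rng), rng))
    # Bias the search: inject task-conditioned samples into the offspring.
    children += [generator.sample(task_id, rng) for _ in range(n_inject)]
    return children
```

Run over many generations and tasks, the injected samples are what would steer each population toward regions the shared model considers promising for that particular task.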
If this is right
- GP populations converge in fewer generations when guided by the Transformer-generated heuristics.
- The resulting heuristics achieve lower makespan and higher robustness than both handcrafted rules and pure Transformer outputs.
- Knowledge transfers across related scheduling tasks through the shared Transformer model without explicit block swapping.
- The same evolutionary loop can handle a variable number of tasks without redesigning the fitness function or representation.
Where Pith is reading between the lines
- The same conditioning mechanism could be applied to other hyper-heuristic domains such as vehicle routing or bin packing where GP already evolves operators.
- Online deployment might allow the Transformer to be fine-tuned incrementally as new shop-floor data arrives, reducing the need for full retraining.
- The approach suggests a general template for embedding generative models inside evolutionary algorithms to shrink the effective search space.
Load-bearing premise
The Transformer can reliably learn the distribution of elite heuristics from training tasks and generate effective new heuristics for unseen tasks without overfitting or introducing search bias.
What would settle it
Train TransGP on a collection of DFJSS task instances, then test it on a fresh set of task instances drawn from the same distribution and measure whether it still shows faster convergence and lower makespan than multitask GP baselines.
Original abstract
Hyper-heuristics have become a popular approach for solving dynamic flexible job shop scheduling (DFJSS) problems. They use gradient-free optimization techniques like Genetic Programming (GP) to evolve non-differentiable heuristics. However, conventional GP methods tend to converge slowly because they rely solely on evolutionary search to find good heuristics. Existing multitask GP methods can solve multiple tasks simultaneously and speed up the search by transferring knowledge across similar tasks. But they mostly exchange heuristic building blocks without truly generating heuristics conditioned on task information. In this paper, we aim to accelerate convergence and enable task-specific heuristic generation by incorporating a task-conditioned Transformer model. The Transformer works in two ways. First, it learns the distribution of elite heuristics, biasing the search toward promising regions of the heuristic space. Second, through conditional generation, it produces heuristics tailored to specific tasks, allowing the model to handle multiple scheduling tasks at once and improving overall optimization efficiency. Based on these ideas, we propose TransGP, a Task-Conditioned Transformer-Guided GP framework. This evolutionary paradigm integrates generative modeling with GP, enabling efficient multitask heuristic learning and knowledge transfer. We evaluate TransGP on a range of DFJSS scenarios. Experimental results show that TransGP consistently outperforms multitask GP baselines, widely used handcrafted heuristics, and the pure Transformer model, achieving faster convergence, superior solution quality, and enhanced robustness.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes TransGP, a hybrid framework that integrates a task-conditioned Transformer model with Genetic Programming (GP) for multitask dynamic flexible job shop scheduling (DFJSS). The Transformer is claimed to learn the distribution of elite heuristics to bias GP search and to perform conditional generation of task-specific heuristics, enabling knowledge transfer across tasks and yielding faster convergence, higher solution quality, and greater robustness than multitask GP baselines, handcrafted heuristics, and a pure Transformer model.
Significance. If the reported outperformance is reproducible and statistically supported, the work would offer a concrete advance in hyper-heuristic design by showing how a generative model can usefully condition evolutionary search without replacing it. The two-way use of the Transformer (distribution learning plus conditional generation) is a natural extension of existing multitask GP ideas and could generalize to other domains where heuristic spaces are large and task similarity can be exploited.
Major comments (2)
- [Experimental Results] The central empirical claim (consistent outperformance with faster convergence and superior solution quality) is load-bearing for the paper's contribution, yet the abstract and description supply no information on the number of independent runs, statistical tests, baseline re-implementations, or effect-size reporting. Without these details the data-to-claim link cannot be evaluated and the result remains unverifiable.
- [Method] The description of the Transformer-GP interface (learning elite-heuristic distributions and performing conditional generation) is presented at a high level with no equations, pseudocode, or architectural diagram showing how task conditioning is injected into the GP population or fitness evaluation. This omission makes it impossible to assess whether the claimed two-way interaction is implemented without introducing harmful bias or overfitting to the training task set.
Minor comments (2)
- [Abstract] The abstract states that TransGP is evaluated 'on a range of DFJSS scenarios' but does not enumerate the exact problem instances, dynamic event types, or objective functions used; this information should appear in the experimental setup section.
- [Method] Notation for the task-conditioning mechanism and the elite-heuristic distribution is introduced without a clear table or figure summarizing the input/output shapes or the loss functions employed for Transformer training.
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive report. We address each major comment below and will revise the manuscript accordingly to improve clarity and verifiability.
Point-by-point responses
-
Referee: [Experimental Results] The central empirical claim (consistent outperformance with faster convergence and superior solution quality) is load-bearing for the paper's contribution, yet the abstract and description supply no information on the number of independent runs, statistical tests, baseline re-implementations, or effect-size reporting. Without these details the data-to-claim link cannot be evaluated and the result remains unverifiable.
Authors: We agree that the abstract and high-level description do not provide these experimental details. In the revised manuscript we will explicitly report the number of independent runs (30 per scenario), the statistical tests employed (Wilcoxon rank-sum test with Holm-Bonferroni correction at p < 0.05), confirmation that baselines were re-implemented from the original sources, and effect-size reporting (Cohen's d). These additions will be placed in both the abstract and the experimental results section to make the empirical claims fully verifiable. Revision: yes.
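For concreteness, the protocol the authors describe corresponds to a standard recipe like the sketch below. The makespan samples are synthetic placeholders, not results from the paper, and the rank-sum test uses the large-sample normal approximation with no ties assumed.

```python
import math
import random
import statistics

# Hedged sketch of the stated analysis: Wilcoxon rank-sum (normal
# approximation), Holm-Bonferroni correction, and Cohen's d, all on
# synthetic placeholder data (30 runs per method).

def ranksum_pvalue(x, y):
    n1, n2 = len(x), len(y)
    order = sorted(x + y)
    r1 = sum(order.index(v) + 1 for v in x)          # rank sum of sample x
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = abs(r1 - mu) / sigma
    return math.erfc(z / math.sqrt(2))               # two-sided p-value

def cohens_d(x, y):
    pooled = math.sqrt((statistics.variance(x) + statistics.variance(y)) / 2)
    return (statistics.mean(x) - statistics.mean(y)) / pooled

rng = random.Random(0)
transgp = [rng.gauss(100, 5) for _ in range(30)]     # 30 independent runs
baselines = {"MTGP": [rng.gauss(108, 5) for _ in range(30)],
             "SPT":  [rng.gauss(115, 5) for _ in range(30)]}

effect = {name: cohens_d(transgp, runs) for name, runs in baselines.items()}
pvals = sorted((ranksum_pvalue(transgp, runs), name)
               for name, runs in baselines.items())

# Holm-Bonferroni step-down: test ordered p-values against alpha / (m - i);
# once one comparison fails, all remaining ones are not rejected.
alpha, m, reject = 0.05, len(pvals), {}
for i, (p, name) in enumerate(pvals):
    if p < alpha / (m - i):
        reject[name] = True
    else:
        reject.update((rest, False) for _, rest in pvals[i:])
        break
```

With real data, `reject` and `effect` are exactly the quantities the revised results section would need to report per scenario and baseline.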
-
Referee: [Method] The description of the Transformer-GP interface (learning elite-heuristic distributions and performing conditional generation) is presented at a high level with no equations, pseudocode, or architectural diagram showing how task conditioning is injected into the GP population or fitness evaluation. This omission makes it impossible to assess whether the claimed two-way interaction is implemented without introducing harmful bias or overfitting to the training task set.
Authors: We acknowledge that the current presentation of the Transformer-GP interface is high-level. In the revision we will add: (i) the mathematical formulation of task-conditioned attention and the loss used to learn the elite-heuristic distribution, (ii) pseudocode for the full TransGP loop including how task embeddings condition both the Transformer generator and the GP population initialization/fitness, and (iii) an architectural diagram illustrating the data flow. These additions will allow readers to evaluate the precise mechanism of the two-way interaction and any risk of bias or overfitting. Revision: yes.
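For reference, a generic form of these promised components, standard in conditional sequence modeling and offered here only as a plausible shape rather than the paper's actual equations, is:

```latex
% Task-conditioned attention: one common conditioning choice is to
% concatenate a learned task embedding e_t to the hidden state h
% before forming the queries.
\mathrm{Attn}(Q, K, V) = \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,
\qquad Q = [h;\, e_t]\, W_Q .

% Distribution learning over elite heuristics: maximize the conditional
% likelihood of elite token sequences x given their task embedding e_t.
\mathcal{L}(\theta) = -\sum_{(x,\, t) \in \mathcal{E}} \sum_{i=1}^{|x|}
\log p_{\theta}\!\left(x_i \mid x_{<i},\, e_t\right).
```

Here $\mathcal{E}$ denotes the archive of elite heuristics collected across tasks, serialized as token sequences; the paper's actual formulation may differ.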
Circularity Check
No significant circularity detected
Full rationale
The paper presents TransGP as an empirical hybrid evolutionary framework that integrates a task-conditioned Transformer for distribution learning and conditional heuristic generation with standard GP search. No equations, derivations, or self-referential definitions appear in the abstract or description that would reduce claimed performance gains to fitted parameters by construction, self-citation chains, or renamed inputs. The central claims rest on experimental comparisons against baselines, which are independent of any internal reduction to the method's own outputs. This is a standard empirical proposal with no load-bearing circular steps.
Lean theorems connected to this paper
-
IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "The Transformer works in two ways. First, it learns the distribution of elite heuristics... Second, through conditional generation, it produces heuristics tailored to specific tasks"
-
IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
The relation between the paper passage and the cited Recognition theorem is unclear.
Passage: "We propose TransGP, a Task-Conditioned Transformer-Guided GP framework"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
- [1] J. Ding, Z. Lü, C.-M. Li, L. Shen, L. Xu, and F. Glover, "A two-individual based evolutionary algorithm for the flexible job shop scheduling problem," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, no. 01, 2019, pp. 2262–2271.
- [2] M. Xu, Y. Mei, F. Zhang, and M. Zhang, "Genetic programming for dynamic flexible job shop scheduling: Evolution with single individuals and ensembles," IEEE Transactions on Evolutionary Computation, vol. 28, no. 6, pp. 1761–1775, 2023.
- [3] M. Đurasević and D. Jakobović, "Heuristic and metaheuristic methods for the parallel unrelated machines scheduling problem: a survey," Artificial Intelligence Review, vol. 56, no. 4, pp. 3181–3289, 2023.
- [4] A. Corsini, A. Porrello, S. Calderara, and M. Dell'Amico, "Self-labeling the job shop scheduling problem," Proceedings of the Advances in Neural Information Processing Systems, vol. 37, pp. 105528–105551, 2024.
- [5] R. Wang, Z. Hua, G. Liu, J. Zhang, J. Yan, F. Qi, S. Yang, J. Zhou, and X. Yang, "A bi-level framework for learning to solve combinatorial optimization on graphs," Proceedings of the Advances in Neural Information Processing Systems, vol. 34, pp. 21453–21466, 2021.
- [6] M. Xu, Y. Mei, F. Zhang, and M. Zhang, "Learn to optimise for job shop scheduling: a survey with comparison between genetic programming and reinforcement learning," Artificial Intelligence Review, vol. 58, no. 6, pp. 1–53, 2025.
- [7] W. Yi, N. Chen, Y. Chen, and Z. Pei, "An improved deep q-network for dynamic flexible job shop scheduling with limited maintenance resources," International Journal of Production Research, vol. 63, no. 23, pp. 9112–9133, 2025.
- [8] M. Xu, F. Neumann, A. Neumann, and Y. S. Ong, "Quality diversity genetic programming for learning scheduling heuristics," in Proceedings of the Genetic and Evolutionary Computation Conference, 2025, pp. 1090–1098.
- [9] Y. Mei, Q. Chen, A. Lensen, B. Xue, and M. Zhang, "Explainable artificial intelligence by genetic programming: A survey," IEEE Transactions on Evolutionary Computation, vol. 27, no. 3, pp. 621–641, 2022.
- [10] J. Zhong, L. Feng, W. Cai, and Y.-S. Ong, "Multifactorial genetic programming for symbolic regression problems," IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 50, no. 11, pp. 4492–4505, 2018.
- [11] Y. Zhou, J. Yang, and Z. Huang, "Automatic design of scheduling policies for dynamic flexible job shop scheduling via surrogate-assisted cooperative co-evolution genetic programming," International Journal of Production Research, vol. 58, no. 9, pp. 2561–2580, 2020.
- [12] R. Guidotti, A. Monreale, M. Setzu, and G. Volpi, "Generative model for decision trees," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 38, no. 19, 2024, pp. 21116–21124.
- [13] S. Zhang, S. Liu, N. Lu, J. Wu, J. Liu, Y.-S. Ong, and K. Tang, "Llm-driven instance-specific heuristic generation and selection," arXiv preprint arXiv:2506.00490, 2025.
- [14] C. Luo, X. Li, L. Gao, Q. Liu, and Q. Fan, "A knowledge-enhanced evolutionary multitasking memetic algorithm for multimodal multiobjective flexible job shop scheduling considering speed," IEEE Transactions on Cybernetics, 2026.
- [15] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, "Attention is all you need," Proceedings of the Advances in Neural Information Processing Systems, vol. 30, 2017.
- [16] J. Zhang, G. Ding, Y. Zou, S. Qin, and J. Fu, "Review of job shop scheduling research and its new perspectives under industry 4.0," Journal of Intelligent Manufacturing, vol. 30, no. 4, pp. 1809–1830, 2019.
- [17] K. Lei, P. Guo, Y. Wang, J. Zhang, X. Meng, and L. Qian, "Large-scale dynamic scheduling for flexible job-shop with random arrivals of new jobs by hierarchical reinforcement learning," IEEE Transactions on Industrial Informatics, vol. 20, no. 1, pp. 1007–1018, 2023.
- [18] X. Chen, J. Li, Z. Wang, Q. Chen, K. Gao, and Q. Pan, "Optimizing dynamic flexible job shop scheduling using an evolutionary multi-task optimization framework and genetic programming," IEEE Transactions on Evolutionary Computation, 2025.
- [19] F. Zhang, Y. Mei, S. Nguyen, K. C. Tan, and M. Zhang, "Task relatedness-based multitask genetic programming for dynamic flexible job shop scheduling," IEEE Transactions on Evolutionary Computation, vol. 27, no. 6, pp. 1705–1719, 2022.
- [20] J. Chen, Y. Jia, Y. Bi, and W. Chen, "Generate a single heuristic for multiple dynamic flexible job shop scheduling tasks by genetic programming," in Proceedings of the IEEE Congress on Evolutionary Computation. IEEE, 2024, pp. 1–8.
- [21] F. Zhang, S. Nguyen, Y. Mei, and M. Zhang, "Surrogate-assisted multitask genetic programming for learning scheduling heuristics," Genetic Programming for Production Scheduling: An Evolutionary Learning Approach, pp. 291–311, 2021.
- [22] C. Rajendran and O. Holthaus, "A comparative study of dispatching rules in dynamic flowshops and jobshops," European Journal of Operational Research, vol. 116, no. 1, pp. 156–170, 1999.
- [23] A. Teymourifar, J. Li, D. Li, and T. Zheng, "A comparison between linear and non-linear combinations of priority rules for solving flexible job shop scheduling problem," in Global Joint Conference on Industrial Engineering and Its Application Areas. Springer, 2022, pp. 105–117.
- [24] B. Chen and T. I. Matis, "A flexible dispatching rule for minimizing tardiness in job shop scheduling," International Journal of Production Economics, vol. 141, no. 1, pp. 360–365, 2013.
- [25] R. Braune, F. Benda, K. F. Doerner, and R. F. Hartl, "A genetic programming learning approach to generate dispatching rules for flexible shop scheduling problems," International Journal of Production Economics, vol. 243, p. 108342, 2022.
- [26] Y. Zeiträg, J. Rui Figueira, and G. Figueira, "A cooperative coevolutionary hyper-heuristic approach to solve lot-sizing and job shop scheduling problems using genetic programming," International Journal of Production Research, vol. 62, no. 16, pp. 5850–5877, 2024.
- [27] S. Shady, T. Kaihara, N. Fujii, and D. Kokuryo, "Automatic design of dispatching rules with genetic programming for dynamic job shop scheduling," in Proceedings of the International Conference on Advances in Production Management Systems. Springer, 2020, pp. 399–407.
- [28] J. R. Koza, Genetic Programming III: Darwinian Invention and Problem Solving. Morgan Kaufmann, 1999, vol. 3.
- [29] S. Shady, T. Kaihara, N. Fujii, and D. Kokuryo, "A novel feature selection for evolving compact dispatching rules using genetic programming for dynamic job shop scheduling," International Journal of Production Research, vol. 60, no. 13, pp. 4025–4048, 2022.
- [30] S. Nguyen, M. Zhang, M. Johnston, and K. C. Tan, "A computational study of representations in genetic programming to evolve dispatching rules for the job shop scheduling problem," IEEE Transactions on Evolutionary Computation, vol. 17, no. 5, pp. 621–639, 2012.
- [31] H. Guo, J. Liu, Y. Wang, and C. Zhuang, "An improved genetic programming hyper-heuristic for the dynamic flexible job shop scheduling problem with reconfigurable manufacturing cells," Journal of Manufacturing Systems, vol. 74, pp. 252–263, 2024.
- [32] A. Gupta, Y.-S. Ong, and L. Feng, "Multifactorial evolution: Toward evolutionary multitasking," IEEE Transactions on Evolutionary Computation, vol. 20, no. 3, pp. 343–357, 2015.
- [33] C. Zhang, W. Song, Z. Cao, J. Zhang, P. S. Tan, and X. Chi, "Learning to dispatch for job shop scheduling via deep reinforcement learning," Proceedings of the Advances in Neural Information Processing Systems, vol. 33, pp. 1621–1632, 2020.
- [34] C. Zhang, Z. Cao, W. Song, Y. Wu, and J. Zhang, "Deep reinforcement learning guided improvement heuristic for job shop scheduling," in Proceedings of the International Conference on Learning Representations, 2024.
- [35] J. Kotary, F. Fioretto, and P. Van Hentenryck, "Fast approximations for job shop scheduling: A lagrangian dual deep learning method," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 36, no. 7, 2022, pp. 7239–7246.
- [36] X. Chen, R. Qu, J. Dong, R. Bai, and Y. Jin, "Genetic programming with reinforcement learning trained transformer for real-world dynamic scheduling problems," arXiv preprint arXiv:2504.07779, 2025.
- [37] B. Romera-Paredes, M. Barekatain, A. Novikov, M. Balog, M. P. Kumar, E. Dupont, F. J. Ruiz, J. S. Ellenberg, P. Wang, O. Fawzi et al., "Mathematical discoveries from program search with large language models," Nature, vol. 625, no. 7995, pp. 468–475, 2024.
- [38] F. Liu, X. Tong, M. Yuan, X. Lin, F. Luo, Z. Wang, Z. Lu, and Q. Zhang, "Evolution of heuristics: Towards efficient automatic algorithm design using large language model," arXiv preprint arXiv:2401.02051, 2024.
- [39] P. V. T. Dat, L. Doan, and H. T. T. Binh, "Hsevo: Elevating automatic heuristic design with diversity-driven harmony search and genetic algorithm using llms," in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 39, no. 25, 2025, pp. 26931–26938.
- [40] A. Novikov, N. Vũ, M. Eisenberger, E. Dupont, P.-S. Huang, A. Z. Wagner, S. Shirobokov, B. Kozlovskii, F. J. Ruiz, A. Mehrabian et al., "AlphaEvolve: A coding agent for scientific and algorithmic discovery," arXiv preprint arXiv:2506.13131, 2025.
- [41] H. Ye, J. Wang, Z. Cao, F. Berto, C. Hua, H. Kim, J. Park, and G. Song, "Reevo: Large language models as hyper-heuristics with reflective evolution," Advances in Neural Information Processing Systems, vol. 37, pp. 43571–43608, 2024.
- [42] N. Tao, A. Ventresque, and T. Saber, "Program synthesis with generative pre-trained transformers and grammar-guided genetic programming grammar," in Proceedings of the IEEE Latin American Conference on Computational Intelligence. IEEE, 2023, pp. 1–6.
- [43] N. Gontier, K. Sinha, S. Reddy, and C. Pal, "Measuring systematic generalization in neural proof generation with transformers," Proceedings of the Advances in Neural Information Processing Systems, vol. 33, pp. 22231–22242, 2020.
- [44] V. Bagal, R. Aggarwal, P. Vinod, and U. D. Priyakumar, "Molgpt: molecular generation using a transformer-decoder model," Journal of Chemical Information and Modeling, vol. 62, no. 9, pp. 2064–2076, 2021.
- [45] K. Han, A. Xiao, E. Wu, J. Guo, C. Xu, and Y. Wang, "Transformer in transformer," Proceedings of the Advances in Neural Information Processing Systems, vol. 34, pp. 15908–15919, 2021.
- [46] T. Hildebrandt, J. Heger, and B. Scholz-Reiter, "Towards improved dispatching rules for complex shop floor scenarios: a genetic programming approach," in Proceedings of the Conference on Genetic and Evolutionary Computation, 2010, pp. 257–264.
- [47] M. Đurasević, D. Jakobović, and K. Knežević, "Adaptive scheduling on unrelated machines with genetic programming," Applied Soft Computing, vol. 48, pp. 419–430, 2016.
- [48] F. Zhang, Y. Mei, S. Nguyen, and M. Zhang, "Survey on genetic programming and machine learning techniques for heuristic design in job shop scheduling," IEEE Transactions on Evolutionary Computation, vol. 28, no. 1, pp. 147–167, 2023.
- [49] B. Rosner, R. J. Glynn, and M.-L. Ting Lee, "Incorporation of clustering effects for the wilcoxon rank sum test: a large-sample approach," Biometrics, vol. 59, no. 4, pp. 1089–1098, 2003.
- [50] S. Niwattanakul, J. Singthongchai, E. Naenudorn, and S. Wanapu, "Using of jaccard coefficient for keywords similarity," in Proceedings of the International MultiConference of Engineers and Computer Scientists, vol. 1, no. 6, 2013, pp. 380–384.