pith. machine review for the scientific record. sign in

hub

Neural Combinatorial Optimization with Reinforcement Learning

15 Pith papers cite this work. Polarity classification is still indexing.

15 Pith papers citing it
abstract

This paper presents a framework to tackle combinatorial optimization problems using neural networks and reinforcement learning. We focus on the traveling salesman problem (TSP) and train a recurrent network that, given a set of city coordinates, predicts a distribution over different city permutations. Using negative tour length as the reward signal, we optimize the parameters of the recurrent network using a policy gradient method. We compare learning the network parameters on a set of training graphs against learning them on individual test graphs. Despite the computational expense, without much engineering and heuristic designing, Neural Combinatorial Optimization achieves close to optimal results on 2D Euclidean graphs with up to 100 nodes. Applied to the KnapSack, another NP-hard problem, the same method obtains optimal solutions for instances with up to 200 items.

hub tools

years

2026 14 2024 1

representative citing papers

Linear Decision Tree Policies for Integer Linear Programs

math.OC · 2026-05-04 · unverdicted · novelty 7.0

Linear decision trees can represent optimal solution policies for families of integer linear programs, enabling polynomial-time queries after offline synthesis for fixed feasible sets.

ARMATA: Auto-Regressive Multi-Agent Task Assignment

cs.MA · 2026-05-05 · unverdicted · novelty 5.0

ARMATA is a new end-to-end autoregressive model with multi-stage decoding that unifies allocation and routing for multi-agent systems and reports up to 20% better solutions than OR-Tools, CPLEX, and LKH-3 in seconds instead of hours.

citing papers explorer

Showing 15 of 15 citing papers.