FrontierOR benchmark shows frontier LLMs outperform Gurobi on solution quality and efficiency in only 31% of one-shot cases and 50% with test-time evolution on hard large-scale optimization tasks.
arXiv preprint arXiv:2505.16952 (2025)
4 Pith papers cite this work. Polarity classification is still indexing.
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: dataset (1)
polarities: background (1)
representative citing papers
NLCO benchmark shows LLMs achieve reasonable feasibility on small natural-language CO tasks but degrade on larger instances, with set-based problems easier than graph-structured or bottleneck-objective ones.
CAM is an unsupervised training method for discrete diffusion models on combinatorial optimization problems that uses discrete adjoint dynamics to supply low-variance trajectory-level signals.
A survey compiling roles, applications, benchmarks, challenges, and future directions for large language models in operations research.
citing papers explorer
- FrontierOR: Benchmarking LLMs' Capacity for Efficient Algorithm Design in Large-Scale Optimization
  FrontierOR benchmark shows frontier LLMs outperform Gurobi on solution quality and efficiency in only 31% of one-shot cases and 50% with test-time evolution on hard large-scale optimization tasks.
- Reasoning in a Combinatorial and Constrained World: Benchmarking LLMs on Natural-Language Combinatorial Optimization
  NLCO benchmark shows LLMs achieve reasonable feasibility on small natural-language CO tasks but degrade on larger instances, with set-based problems easier than graph-structured or bottleneck-objective ones.
- Unsupervised Diffusion Solver for Combinatorial Optimization via Combinatorial Adjoint Matching
  CAM is an unsupervised training method for discrete diffusion models on combinatorial optimization problems that uses discrete adjoint dynamics to supply low-variance trajectory-level signals.
- Large Language Models for Operations Research: A Comprehensive Survey
  A survey compiling roles, applications, benchmarks, challenges, and future directions for large language models in operations research.