arxiv: 2605.14141 · v1 · pith:MV3FB4SEnew · submitted 2026-05-13 · 💻 cs.AI

Distribution-Aware Algorithm Design with LLM Agents

Saharsh Koganti , Priyadarsi Mishra , Pierfrancesco Beneventano , Tomer Galanti This is my paper

Pith reviewed 2026-05-15 04:59 UTC · model grok-4.3

classification 💻 cs.AI

keywords solver hintsdistribution-aware algorithm designLLM code agentscombinatorial optimizationruntime generalizationsample-consistent solversexecutable code learningalgorithm synthesis

0 comments

The pith

The empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime to new instances from the same distribution.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that learning executable solver code, rather than predictors, allows specialization to an unknown task distribution where both solution quality and execution time matter. Given samples, the approach infers reusable solver hints and compiles them into a library of specialized code. A central proof shows that the fastest solver consistent with the samples will remain correct and fast on fresh instances from the deployment distribution. This is instantiated with LLM code agents across 21 combinatorial optimization distributions, producing solvers that match or exceed heuristic quality while running hundreds of times faster.

Core claim

We prove that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime, and that statistically identifiable hints can be recovered and compiled from polynomially many samples. Empirically, LLM agents on 21 target distributions yield solvers with mean normalized quality 0.971 that improve over the average heuristic pool by 0.224 and over the highest-quality heuristic by 0.098, while running 336.9 times faster than the quality-best heuristic and 16.1 times faster than a time-limited exact backend. On PACE 2025 private instances the solver is valid on all graphs and two orders of magnitude faster than top competition entries, with gains,

What carries the argument

Solver hint: reusable structure inferred from samples and compiled into specialized solver code that replaces general-purpose search with distribution-specific computation.

If this is right

Synthesized solvers reach 0.971 mean normalized quality and improve over the best heuristic by 0.098 while being over 300 times faster.
On PACE 2025 Dominating Set private instances the solver is valid on all 100 graphs and runs about two orders of magnitude faster than top competition solvers.
Many performance gains arise from replacing ambient exponential search or general-purpose optimization with compiled distribution-specific computation.
The approach applies across seven problem classes with polynomially many samples sufficing to recover the hints.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If solver hints are not statistically identifiable for a given distribution, the polynomial-sample guarantee would not hold and the library might need to be rebuilt.
The method could be applied to any domain in which executable procedures must trade off correctness against runtime on a recurring distribution.
Replacing the LLM agent with another code-generation technique would preserve the generalization proof as long as the generated solvers remain sample-consistent.
The framework suggests that runtime-aware generalization can be achieved without explicit modeling of the distribution, provided hints are recoverable.

Load-bearing premise

Statistically identifiable solver hints exist for the target distributions and can be recovered from polynomially many samples to form a library whose fastest consistent member generalizes in runtime.

What would settle it

An experiment showing that the empirically fastest sample-consistent solver from the recovered library fails to match the runtime or correctness of a different library member on a fresh, independent test set drawn from the same distribution.

Figures

Figures reproduced from arXiv: 2605.14141 by Pierfrancesco Beneventano, Priyadarsi Mishra, Saharsh Koganti, Tomer Galanti.

**Figure 1.** Figure 1: Three access models for designing solvers against a distribution D. Worst-case design assumes no distributional information, while average-case complexity assumes an analytic specification of D. We study the intermediate sample-access regime: from S ∼ Dn, the learner infers a solver hint hˆ S ∈ H and compiles it into a deployed solver bcS = Comp(hˆ S). Thus the effective search space is the structured subf… view at source ↗

**Figure 2.** Figure 2: Method pipeline. Each candidate contains a hypothesis Hc, an analysis program Ac, and a deployment solver sc. Executing Ac on the public training sample produces a summary ac = Ac(S pub tr ), which serves as the recovered hint. The solver sc(·, ac) is evaluated on public splits, ranked by validation quality, optimality, and runtime, and refined in a diversity-preserving beam. the sample-access regime: give… view at source ↗

read the original abstract

We study learning when the learned object is executable solver code rather than a predictor. In this setting, correctness is not enough: two solvers may both return valid solutions on the deployment distribution while differing substantially in runtime. Given samples from an unknown task distribution, the learner returns code evaluated on fresh instances by both solution quality and execution time. Our central abstraction is a \emph{solver hint}: reusable structure inferred from samples and compiled into specialized solver code. We prove that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime, and that statistically identifiable hints can be recovered and compiled from polynomially many samples. Empirically, we instantiate the framework with LLM code agents on \(21\) structured combinatorial-optimization target distributions across seven problem classes. The synthesized solvers reach mean normalized quality \(0.971\), improve by \(+0.224\) over the average heuristic pool and by \(+0.098\) over the highest-quality heuristic, and are \(336.9\times\), \(342.8\times\), and \(16.1\times\) faster than the quality-best heuristic, Gurobi, and the selected time-limited exact backend, respectively. On released PACE 2025 Dominating Set private instances, the synthesized solver is valid on all \(100\) graphs and runs about two orders of magnitude faster than top competition solvers, with a moderate quality gap. Inspection shows that many gains come from changing the computational scale: replacing ambient exponential search or general-purpose optimization with compiled distribution-specific computation.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

LLM-synthesized solvers deliver real speedups on structured distributions, but the generalization proof needs to address the data-dependent library issue.

read the letter

The main thing to know is that this paper shows LLM agents can turn samples from a target distribution into specialized solver code that keeps high solution quality while running orders of magnitude faster than general-purpose solvers on combinatorial problems. The empirical numbers are the strongest part: across 21 distributions they hit 0.971 normalized quality, beat the best heuristic by 0.098, and deliver 300x+ speedups over Gurobi and heuristics in many cases. The PACE 2025 dominating-set results add credibility, with valid solutions on all 100 graphs and roughly 100x faster runtime than top entries, even with a moderate quality trade-off. The solver-hint abstraction is genuinely new; it separates reusable structure from the code that exploits it, and the claim that polynomially many samples suffice to recover and compile such hints is a clean framing not in the prior LLM code-generation literature. The proof that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime is the theoretical contribution, and if the library really is fixed and independent of the samples, the uniform-convergence argument looks plausible. The experiments also make a useful point about shifting from ambient exponential search to distribution-specific computation. The soft spot is exactly the one the stress test flags. The library is produced by LLM synthesis on the same samples used to pick and evaluate the fastest solver, so the hypothesis class itself depends on the training data. That breaks the standard PAC-style guarantee unless the paper supplies a separate argument for why the data dependence does not inflate the reported gains. Post-hoc selection of the fastest member also needs clearer controls; the abstract acknowledges it but does not quantify its effect. The work is aimed at people doing automated algorithm design or LLM-assisted optimization. It has enough concrete results and a coherent framework to deserve a serious referee, even though the theory-experiment connection will probably need tightening in revision.

Referee Report

3 major / 2 minor

Summary. The paper studies learning executable solver code (rather than predictors) for unknown task distributions using LLM agents. It introduces 'solver hints' as reusable structure inferred from samples and compiled into specialized code. The central theoretical claim is a proof that the empirically fastest sample-consistent solver from a fixed library generalizes in both correctness and runtime, with statistically identifiable hints recoverable from polynomially many samples. Empirically, LLM-synthesized solvers are evaluated on 21 structured combinatorial optimization distributions across seven problem classes, reporting mean normalized quality of 0.971, speedups of 336.9× over the quality-best heuristic, 342.8× over Gurobi, and 16.1× over a time-limited exact backend, plus strong performance on PACE 2025 Dominating Set private instances.

Significance. If the generalization result holds for data-dependent libraries, the work would provide a principled foundation for automated, distribution-aware algorithm design that optimizes both solution quality and runtime. The empirical speedups on held-out instances and PACE benchmarks, achieved by shifting from general-purpose search to compiled distribution-specific computation, indicate potential practical impact in combinatorial optimization. The framework's emphasis on executable code and runtime-aware evaluation distinguishes it from standard learning approaches.

major comments (3)

[§3] §3 (Theoretical Analysis), central theorem: The proof assumes a fixed, pre-specified library independent of the training samples to invoke uniform convergence or PAC-style generalization for the fastest sample-consistent solver. However, the empirical instantiation (§5) uses LLM agents to synthesize solver code directly from the same samples used for selection and evaluation, rendering the hypothesis class data-dependent. This appears to invalidate the stated generalization argument; a revised proof or reduction that accounts for sample-dependent library construction is needed.
[§5.2] §5.2 (Empirical Evaluation), Table 2 and PACE results: The reported speedups (e.g., 336.9× over quality-best heuristic) rely on post-hoc selection of the fastest solver per distribution. While the abstract acknowledges this, the effect size on the claimed gains is not quantified via ablation or cross-validation; without it, the generalization claim from sample-consistent selection to held-out runtime cannot be fully assessed.
[§4] §4 (Solver Hint Recovery): The claim that hints are 'statistically identifiable' and recoverable from polynomially many samples is stated as a theorem, but the proof sketch does not specify the sample complexity bound, the precise notion of identifiability, or how it interacts with the LLM synthesis procedure. This is load-bearing for the polynomial-sample guarantee.

minor comments (2)

[§2] The notation for 'solver hint' is introduced informally in §2; a boxed formal definition would improve clarity when referenced in the theorems.
[Figure 3] Figure 3 (runtime distributions): Axis scaling and legend placement make it difficult to compare the synthesized solver against the exact backend across all 21 distributions.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We are grateful to the referee for the thorough review and valuable suggestions. We address the major comments point by point below, providing clarifications and outlining planned revisions to strengthen the manuscript.

read point-by-point responses

Referee: [§3] §3 (Theoretical Analysis), central theorem: The proof assumes a fixed, pre-specified library independent of the training samples to invoke uniform convergence or PAC-style generalization for the fastest sample-consistent solver. However, the empirical instantiation (§5) uses LLM agents to synthesize solver code directly from the same samples used for selection and evaluation, rendering the hypothesis class data-dependent. This appears to invalidate the stated generalization argument; a revised proof or reduction that accounts for sample-dependent library construction is needed.

Authors: We acknowledge that the central theorem in §3 assumes a fixed, pre-specified library to apply uniform convergence results. The empirical work in §5 uses LLM agents to generate the solver library from samples, making it data-dependent. This is a valid concern. To bridge this, we will add a subsection discussing how the LLM synthesis can be modeled as a fixed (non-adaptive) procedure in the theoretical framework, allowing the generalization bounds to apply to the synthesized solvers. We will also note the limitations and suggest future work on fully adaptive libraries. This constitutes a partial revision as a full new proof for arbitrary data-dependent classes may require additional technical machinery. revision: partial
Referee: [§5.2] §5.2 (Empirical Evaluation), Table 2 and PACE results: The reported speedups (e.g., 336.9× over quality-best heuristic) rely on post-hoc selection of the fastest solver per distribution. While the abstract acknowledges this, the effect size on the claimed gains is not quantified via ablation or cross-validation; without it, the generalization claim from sample-consistent selection to held-out runtime cannot be fully assessed.

Authors: The speedups reported do rely on selecting the empirically fastest solver per distribution after synthesis. While this is acknowledged in the abstract, we agree that quantifying the impact via ablation is important for assessing generalization. In the revision, we will include results from a cross-validation protocol where solver selection is performed on a held-out portion of the samples, and report the resulting speedups on the test instances. This will provide a more conservative estimate of the gains. revision: yes
Referee: [§4] §4 (Solver Hint Recovery): The claim that hints are 'statistically identifiable' and recoverable from polynomially many samples is stated as a theorem, but the proof sketch does not specify the sample complexity bound, the precise notion of identifiability, or how it interacts with the LLM synthesis procedure. This is load-bearing for the polynomial-sample guarantee.

Authors: We appreciate the feedback on §4. The proof sketch indeed omits explicit details. We will expand the section to provide the sample complexity bound derived from standard learning theory (polynomial in the dimension of the hint space and 1/ε), specify that identifiability means recovering the true hint with probability 1-δ, and explain that the LLM procedure serves as a practical heuristic for hint recovery that empirically succeeds with the given sample sizes. This will make the theoretical claim more precise and its interaction with the empirical method explicit. revision: yes

Circularity Check

0 steps flagged

No significant circularity; proof applies to fixed library with independent empirical validation

full rationale

The central theorem is stated for a fixed library of solvers and uses standard generalization arguments on fresh instances drawn from the target distribution. Empirical instantiation via LLM synthesis is presented separately and evaluated on held-out instances (including external PACE 2025 data), without any equation or claim reducing the generalization result to a fit performed on the same samples used to build the library. No self-citations, ansatzes, or renamings are load-bearing for the proof. The derivation chain is therefore self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

The central claim rests on the existence of recoverable hints and a fixed solver library; no explicit free parameters are fitted in the reported results.

axioms (1)

domain assumption Statistically identifiable hints exist for the target task distributions and can be recovered from polynomially many samples.
Invoked to support both the recoverability statement and the generalization proof.

invented entities (1)

solver hint no independent evidence
purpose: Reusable structure inferred from samples and compiled into specialized solver code.
New abstraction introduced to connect sample data to executable, distribution-specific solvers.

pith-pipeline@v0.9.0 · 5584 in / 1295 out tokens · 43797 ms · 2026-05-15T04:59:58.226238+00:00 · methodology

discussion (0)

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Cost/FunctionalEquation.lean washburn_uniqueness_aczel unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Theorem 5.1 (Runtime-aware generalization for library selection) and Theorem 5.2 (Exact recovery under identifiable structure) formalize sample-to-solver selection and hint recovery.
IndisputableMonolith/Foundation/RealityFromDistinction.lean reality_from_one_distinction unclear

?

unclear
Relation between the paper passage and the cited Recognition theorem.

Hidden SAT backdoor model with salience statistic and exponential speedup on DB.

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

59 extracted references · 59 canonical work pages

[1]

D. L. Applegate, R. E. Bixby, V . Chvátal, and W. J. Cook.The Traveling Salesman Problem: A Computational Study. Princeton Series in Applied Mathematics. Princeton University Press, 2006

work page 2006
[2]

Avellaneda

F. Avellaneda. A short description of the solver EvalMaxSAT. In F. Bacchus, J. Berg, M. Järvisalo, and R. Martins, editors,MaxSAT Evaluation 2020: Solver and Benchmark Descriptions, pages 8–9, 2020

work page 2020
[3]

Balog, A

M. Balog, A. L. Gaunt, M. Brockschmidt, S. Nowozin, and D. Tarlow. Deepcoder: Learning to write programs. InInternational Conference on Learning Representations, 2017

work page 2017
[4]

Bestuzheva, M

K. Bestuzheva, M. Besançon, W.-K. Chen, A. Chmiela, T. Donkiewicz, J. van Doornmalen, L. Eifler, O. Gaul, G. Gamrath, A. Gleixner, et al. Enabling research through the SCIP optimization suite 8.0.ACM Transactions on Mathematical Software, 49(2):1–21, 2023

work page 2023
[5]

Bogdanov and L

A. Bogdanov and L. Trevisan. Average-case complexity.Foundations and Trends in Theoretical Computer Science, 2(1):1–106, 2006

work page 2006
[6]

D. Brélaz. New methods to color the vertices of a graph.Communications of the ACM, 22(4):251–256, 1979

work page 1979
[7]

V . Chvátal. A greedy heuristic for the set-covering problem.Mathematics of Operations Research, 4(3):233–235, 1979

work page 1979
[8]

J. Coll, C. M. Li, S. Li, D. Habet, and F. Manyà. Solving weighted maximum satisfiability with branch and bound and clause learning.Computers & Operations Research, 183:107195, 2025

work page 2025
[9]

G. D. da Fonseca, F. Feschet, and Y . Gerard. PACE Solver Description: Shadoks Approach to Minimum Hitting Set and Dominating Set. In A. Agrawal and E. J. van Leeuwen, editors, 20th International Symposium on Parameterized and Exact Computation (IPEC 2025), volume 358 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 34:1–34:5, Dagstuhl, ...

work page 2025
[10]

G. B. Dantzig. Discrete-variable extremum problems.Operations Research, 5(2):266–288, 1957

work page 1957
[11]

Davies and F

J. Davies and F. Bacchus. Exploiting the power of MIP solvers in MAXSAT. InTheory and Applications of Satisfiability Testing – SAT 2013, volume 7962 ofLecture Notes in Computer Science, pages 166–181. Springer, 2013

work page 2013
[12]

Davies and F

J. Davies and F. Bacchus. Postponing optimization to speed up MAXSAT solving. InPrinciples and Practice of Constraint Programming – CP 2013, volume 8124 ofLecture Notes in Computer Science, pages 247–262. Springer, 2013

work page 2013
[13]

Devlin, J

J. Devlin, J. Uesato, S. Bhupatiraju, R. Singh, A.-r. Mohamed, and P. Kohli. Robustfill: Neural program learning under noisy i/o. InProceedings of the 34th International Conference on Machine Learning, volume 70 ofProceedings of Machine Learning Research, pages 990–998. PMLR, 2017

work page 2017
[14]

R. G. Downey and M. R. Fellows.Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, 2013

work page 2013
[15]

Fontan and G

F. Fontan and G. Verger. PACE Solver Description: Reductions and Heuristic Search for the Dominating Set Problem and the Hitting Set Problem. In A. Agrawal and E. J. van Leeuwen, editors,20th International Symposium on Parameterized and Exact Computation (IPEC 2025), volume 358 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 36:1–36:3, ...

work page 2025
[16]

Forrest, S

J. Forrest, S. Vigerske, T. Ralphs, L. Hafer, J. Fasano, H. G. Santos, M. Saltzman, A. King, S. Brito, et al. coin-or/Clp: Release releases/1.17.7, 2022

work page 2022
[17]

J. J. Forrest and R. Lougee-Heimer. CBC user guide. InEmerging Theory, Methods, and Applications, pages 257–277. INFORMS, 2005. 10

work page 2005
[18]

Grobler and S

M. Grobler and S. Siebertz. The PACE 2025 Parameterized Algorithms and Computational Experiments Challenge: Dominating Set and Hitting Set. In A. Agrawal and E. J. van Leeuwen, editors,20th International Symposium on Parameterized and Exact Computation (IPEC 2025), volume 358 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 32:1–32:17, D...

work page 2025
[19]

S. Gulwani. Automating string processing in spreadsheets using input-output examples. InPro- ceedings of the 38th Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages, pages 317–330. ACM, 2011

work page 2011
[20]

Gurobi optimizer reference manual, 2026

Gurobi Optimization, LLC. Gurobi optimizer reference manual, 2026

work page 2026
[21]

Held and R

M. Held and R. M. Karp. A dynamic programming approach to sequencing problems.Journal of the Society for Industrial and Applied Mathematics, 10(1):196–210, 1962

work page 1962
[22]

Helsgaun

K. Helsgaun. An effective implementation of the lin–kernighan traveling salesman heuristic. European Journal of Operational Research, 126(1):106–130, 2000

work page 2000
[23]

Hendrycks, S

D. Hendrycks, S. Basart, S. Kadavath, M. Mazeika, A. Arora, E. Guo, C. Burns, S. Y . Puranik, H. He, D. Song, and J. Steinhardt. Measuring coding challenge competence with APPS. In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks, 2021

work page 2021
[24]

HiGHS: High-performance parallel linear optimization software, 2024

HiGHS Development Team. HiGHS: High-performance parallel linear optimization software, 2024

work page 2024
[25]

Huang, J

Q. Huang, J. V ora, P. Liang, and J. Leskovec. MLAgentBench: Evaluating language agents on machine learning experimentation. InProceedings of the 41st International Conference on Machine Learning, volume 235 ofProceedings of Machine Learning Research, pages 20271–20309. PMLR, 2024

work page 2024
[26]

Huangfu and J

Q. Huangfu and J. A. J. Hall. Parallelizing the dual revised simplex method.Mathematical Programming Computation, 10(1):119–142, 2018

work page 2018
[27]

Ignatiev, A

A. Ignatiev, A. Morgado, and J. Marques-Silva. RC2: An efficient MaxSAT solver.Journal on Satisfiability, Boolean Modeling and Computation, 11(1):53–64, 2019

work page 2019
[28]

Impagliazzo

R. Impagliazzo. A personal view of average-case complexity. InProceedings of the Tenth Annual Structure in Complexity Theory Conference, pages 134–147. IEEE Computer Society, 1995

work page 1995
[29]

C. E. Jimenez, J. Yang, A. Wettig, S. Yao, K. Pei, O. Press, and K. R. Narasimhan. SWE- bench: Can language models resolve real-world GitHub issues? InThe Twelfth International Conference on Learning Representations, 2024

work page 2024
[30]

Kerschke, H

P. Kerschke, H. H. Hoos, F. Neumann, and H. Trautmann. Automated algorithm selection: Survey and perspectives.Evolutionary Computation, 27(1):3–45, 2019

work page 2019
[31]

Kotthoff

L. Kotthoff. Algorithm selection for combinatorial search problems: A survey.AI Magazine, 35(3):48–60, 2014

work page 2014
[32]

S. Lamm, C. Schulz, D. Strash, R. Williger, and H. Zhang. Exactly solving the maximum weight independent set problem on large real-world graphs. InProceedings of the Twenty-First Workshop on Algorithm Engineering and Experiments, pages 144–158. SIAM, 2019

work page 2019
[33]

A. H. Land and A. G. Doig. An automatic method of solving discrete programming problems. Econometrica, 28(3):497–520, 1960

work page 1960
[34]

L. A. Levin. Average case complete problems.SIAM Journal on Computing, 15(1):285–286, 1986

work page 1986
[35]

Y . Li, D. Choi, J. Chung, N. Kushman, J. Schrittwieser, R. Leblond, T. Eccles, J. Keeling, F. Gimeno, A. Dal Lago, T. Hubert, P. Choy, C. de Masson d’Autume, I. Babuschkin, X. Chen, P.-S. Huang, J. Welbl, S. Gowal, A. Cherepanov, J. Molloy, D. J. Mankowitz, E. S. Robson, P. Kohli, N. de Freitas, K. Kavukcuoglu, and O. Vinyals. Competition-level code gene...

work page 2022
[36]

S. Lin. Computer solutions of the traveling salesman problem.Bell System Technical Journal, 44(10):2245–2269, 1965

work page 1965
[37]

Lindauer, H

M. Lindauer, H. H. Hoos, F. Hutter, and T. Schaub. Autofolio: An automatically configured algorithm selector.Journal of Artificial Intelligence Research, 53:745–778, 2015

work page 2015
[38]

C. Luo, Q. Zhang, Z. Su, and Z. Lü. PACE Solver Description: Weighting-Based Local Search Heuristic for the Hitting Set Problem. In A. Agrawal and E. J. van Leeuwen, editors,20th International Symposium on Parameterized and Exact Computation (IPEC 2025), volume 358 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 40:1–40:4, Dagstuhl, Ger...

work page 2025
[39]

Lykouris and S

T. Lykouris and S. Vassilvitskii. Competitive caching with machine learned advice. InProceed- ings of the 35th International Conference on Machine Learning, volume 80 ofProceedings of Machine Learning Research, pages 3296–3305. PMLR, 2018

work page 2018
[40]

Martins, V

R. Martins, V . Manquinho, and I. Lynce. Open-WBO: A modular MaxSAT solver. InTheory and Applications of Satisfiability Testing – SAT 2014, volume 8561 ofLecture Notes in Computer Science, pages 438–445. Springer, 2014

work page 2014
[41]

Mitzenmacher and S

M. Mitzenmacher and S. Vassilvitskii. Algorithms with predictions.Communications of the ACM, 65(7):33–35, 2022

work page 2022
[42]

M. Nye, L. Hewitt, J. B. Tenenbaum, and A. Solar-Lezama. Learning to infer program sketches. InProceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 4861–4870. PMLR, 2019

work page 2019
[43]

PACE 2025 Results

PACE Challenge. PACE 2025 Results. https://pacechallenge.org/2025/results/,

work page 2025
[44]

Perron, F

L. Perron, F. Didier, and S. Gay. The CP-SAT-LP solver. In29th International Conference on Principles and Practice of Constraint Programming, volume 280 ofLeibniz International Proceedings in Informatics, pages 3:1–3:2. Schloss Dagstuhl – Leibniz-Zentrum für Informatik, 2023

work page 2023
[45]

Perron and V

L. Perron and V . Furnon. OR-Tools, 2025

work page 2025
[46]

M. Piotrów. UWrMaxSat: Efficient solver for MaxSAT and pseudo-boolean problems. In2020 IEEE 32nd International Conference on Tools with Artificial Intelligence, pages 132–136, 2020

work page 2020
[47]

J. R. Rice. The algorithm selection problem.Advances in Computers, 15:65–118, 1976

work page 1976
[48]

D. J. Rosenkrantz, R. E. Stearns, and P. M. Lewis. An analysis of several heuristics for the traveling salesman problem.SIAM Journal on Computing, 6(3):563–581, 1977

work page 1977
[49]

Roughgarden

T. Roughgarden. Beyond worst-case analysis.Communications of the ACM, 62(3):88–96, 2019

work page 2019
[50]

Shalev-Shwartz and S

S. Shalev-Shwartz and S. Ben-David.Understanding Machine Learning: From Theory to Algorithms. Cambridge University Press, 2014

work page 2014
[51]

Singhal, P

S. Singhal, P. Mishra, E. Malach, and T. Galanti. LLM priors for ERM over programs, 2025

work page 2025
[52]

D. A. Spielman and S.-H. Teng. Smoothed analysis of algorithms: Why the simplex algorithm usually takes polynomial time.Journal of the ACM, 51(3):385–463, 2004

work page 2004
[53]

S. Swat. PACE Solver Description: HitS&DoSeS - Exact and Heuristic Solvers for the Dom- inating Set and Hitting Set Problems. In A. Agrawal and E. J. van Leeuwen, editors,20th International Symposium on Parameterized and Exact Computation (IPEC 2025), volume 358 ofLeibniz International Proceedings in Informatics (LIPIcs), pages 38:1–38:4, Dagstuhl, German...

work page 2025
[54]

L. G. Valiant. A theory of the learnable. InProceedings of the Sixteenth Annual ACM Symposium on Theory of Computing, pages 436–445. ACM, 1984

work page 1984
[55]

V . N. Vapnik.Statistical Learning Theory. Wiley, 1998. 12

work page 1998
[56]

V . N. Vapnik and A. Y . Chervonenkis. On the uniform convergence of relative frequencies of events to their probabilities.Theory of Probability and Its Applications, 16(2):264–280, 1971

work page 1971
[57]

Williams, C

R. Williams, C. P. Gomes, and B. Selman. Backdoors to typical case complexity. InProceedings of the Eighteenth International Joint Conference on Artificial Intelligence, pages 1173–1178. Morgan Kaufmann, 2003

work page 2003
[58]

L. Xu, H. H. Hoos, and K. Leyton-Brown. Hydra: Automatically configuring algorithms for portfolio-based selection. InProceedings of the Twenty-Fourth AAAI Conference on Artificial Intelligence, pages 210–216, 2010

work page 2010
[59]

best heuristic

L. Xu, F. Hutter, H. H. Hoos, and K. Leyton-Brown. SATzilla: Portfolio-based algorithm selection for SAT.Journal of Artificial Intelligence Research, 32:565–606, 2008. 13 A Broader Impacts This work studies methods for learning distribution-specific solvers from examples. Potential positive impacts include reducing the computational cost of repeated optim...

work page arXiv 2008