Unsupervised Diffusion Solver for Combinatorial Optimization via Combinatorial Adjoint Matching

Shengyu Feng; Tarun Suresh; Yiming Yang

arxiv: 2605.30920 · v2 · pith:NGZKRJTVnew · submitted 2026-05-29 · 💻 cs.LG

Unsupervised Diffusion Solver for Combinatorial Optimization via Combinatorial Adjoint Matching

Shengyu Feng , Tarun Suresh , Yiming Yang This is my paper

Pith reviewed 2026-06-28 23:43 UTC · model grok-4.3

classification 💻 cs.LG

keywords diffusion modelscombinatorial optimizationunsupervised learningadjoint methodsdiscrete diffusionstochastic controlcontinuous-time Markov chains

0 comments

The pith

Diffusion solvers for combinatorial optimization can be trained without optimal solution labels by matching discrete adjoint trajectories on continuous-time Markov chains.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to remove the need for supervised collections of near-optimal solutions when training diffusion models on combinatorial problems. It recasts the diffusion process as stochastic control over continuous-time Markov chains and derives discrete adjoint dynamics that carry optimization signals backward along sampled trajectories. These signals power Combinatorial Adjoint Matching, an unsupervised objective that supplies low-variance, trajectory-level gradients. A sympathetic reader would care because many combinatorial tasks lack ready-made optimal examples, so an unsupervised route that still reaches competitive accuracy would widen the practical reach of neural solvers.

Core claim

By formulating diffusion-based combinatorial optimization as a stochastic control problem over Continuous-Time Markov Chains and introducing discrete adjoint dynamics that propagate optimization signals through discrete generative trajectories, Combinatorial Adjoint Matching supplies a structured unsupervised training framework whose performance exceeds prior unsupervised diffusion baselines and matches strong supervised diffusion solvers as well as traditional solvers on diverse combinatorial problems.

What carries the argument

Discrete adjoint dynamics that propagate optimization signals through discrete generative trajectories inside the stochastic control formulation of diffusion-based combinatorial optimization.

If this is right

Unsupervised diffusion solvers become competitive with supervised diffusion solvers and traditional solvers on multiple combinatorial optimization tasks.
Training no longer requires large collections of near-optimal solutions.
The same adjoint-matching procedure works across diverse combinatorial problems without task-specific supervision.
Structured low-variance trajectory-level signals replace the high-variance estimators used in prior unsupervised diffusion baselines.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same discrete adjoint construction could be applied to other discrete generative models whose trajectories are defined on finite state spaces.
If the adjoint computation scales linearly with trajectory length, the method may enable training on larger combinatorial instances than label-dependent approaches allow.
Removing the label requirement could let the same architecture be reused across related combinatorial problems whose optimal solutions are not jointly available.

Load-bearing premise

The discrete adjoint dynamics on continuous-time Markov chains produce optimization signals that remain well-defined and do not introduce prohibitive bias or variance.

What would settle it

An experiment in which the variance or bias of the trajectory-level signals produced by the discrete adjoints grows large enough to make training diverge or produce solutions measurably worse than the supervised baseline on the same problem instances.

Figures

Figures reproduced from arXiv: 2605.30920 by Shengyu Feng, Tarun Suresh, Yiming Yang.

**Figure 1.** Figure 1: Comparison between CAM and baselines under different inference-time sampling budgets. Performance Across Different Sampling Steps. Figure 1 compares CAM with several baselines under different inference-time sampling budgets on ER-[9000–11000]. CAM consistently achieves the best performance across all sampling budgets and exhibits a substantially stronger inference-time scaling effect. In particular, even… view at source ↗

read the original abstract

Diffusion-based neural solvers have shown strong promise for combinatorial optimization (CO), but existing methods typically rely on supervised training with large collections of near-optimal solutions. In this work, we extend adjoint-based trajectory optimization methods to discrete combinatorial domains. We formulate diffusion-based CO as a stochastic control problem over Continuous-Time Markov Chains and introduce discrete adjoint dynamics for propagating optimization signals through discrete generative trajectories. Building on this formulation, we propose Combinatorial Adjoint Matching (CAM), an unsupervised training framework for discrete diffusion solvers with structured and low-variance trajectory-level optimization signals. Empirically, CAM consistently outperforms existing unsupervised diffusion baselines and achieves performance competitive with strong supervised diffusion solvers and even traditional solvers across diverse combinatorial optimization problems. Our code is available at https://github.com/Shengyu-Feng/CAM.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper defines discrete adjoint dynamics on CTMCs to create an unsupervised CAM objective for training diffusion solvers on combinatorial optimization.

read the letter

The core advance is extending adjoint trajectory optimization to discrete diffusion by modeling combinatorial optimization as stochastic control over continuous-time Markov chains, then deriving discrete adjoint dynamics that supply trajectory-level signals for unsupervised training. This produces the Combinatorial Adjoint Matching objective, which the abstract presents as a direct consequence of that formulation.

The work does two things cleanly. It gives a concrete technical route that removes the need for large sets of near-optimal solutions, and it reports that CAM beats prior unsupervised diffusion baselines while reaching performance levels close to supervised diffusion methods and some traditional solvers across multiple problem classes. The code release supports checking the implementation.

The central assumption—that the CTMC stochastic control setup admits discrete adjoints yielding low-variance, low-bias signals—holds together internally with no visible contradictions or circular fitting in the abstract. The main uncertainty is how much approximation or variance the discretization step actually introduces on harder instances; the provided material does not include the full error analysis or scaling experiments that would settle this.

This paper is for researchers already working on neural combinatorial optimization who care about lowering data requirements. It has enough new machinery and empirical breadth to justify sending it to peer review rather than desk rejection, though any referee will need to examine the derivations and experimental controls closely.

Referee Report

0 major / 0 minor

Summary. The manuscript introduces Combinatorial Adjoint Matching (CAM), an unsupervised training framework for diffusion-based neural solvers on combinatorial optimization (CO) tasks. It formulates diffusion-based CO as a stochastic control problem over Continuous-Time Markov Chains (CTMCs), derives discrete adjoint dynamics to propagate optimization signals through discrete generative trajectories, and uses these to obtain structured, low-variance trajectory-level signals for training without access to near-optimal solutions. The authors report that CAM outperforms existing unsupervised diffusion baselines while achieving performance competitive with strong supervised diffusion solvers and traditional CO methods across diverse problems, and they release code at https://github.com/Shengyu-Feng/CAM.

Significance. If the claimed well-defined discrete adjoint dynamics produce unbiased, low-variance signals as stated, the work would meaningfully advance unsupervised methods for CO by removing dependence on expensive supervised datasets of high-quality solutions. The explicit stochastic-control formulation, code release, and empirical breadth across problem classes constitute concrete strengths that support reproducibility and extension.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary of the manuscript, recognition of the stochastic-control formulation and code release, and recommendation to accept. We are pleased that the potential impact on unsupervised diffusion solvers for combinatorial optimization is acknowledged.

Circularity Check

0 steps flagged

No significant circularity detected in derivation

full rationale

The paper presents a new formulation of diffusion-based combinatorial optimization as a stochastic control problem over Continuous-Time Markov Chains, followed by the introduction of discrete adjoint dynamics to enable unsupervised training via Combinatorial Adjoint Matching (CAM). No load-bearing step reduces by construction to a fitted parameter renamed as a prediction, a self-definitional loop, or a self-citation chain that substitutes for independent justification. The central construction is stated explicitly in the abstract as an extension of adjoint methods to discrete domains, with empirical performance claims resting on external comparisons rather than internal re-derivations of the same quantities. The derivation chain is therefore self-contained on its own terms.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The central claim rests on the validity of modeling diffusion CO as stochastic control on CTMCs and on the existence of usable discrete adjoint dynamics; no free parameters or invented entities are mentioned in the abstract.

axioms (2)

domain assumption Diffusion-based combinatorial optimization can be formulated as a stochastic control problem over Continuous-Time Markov Chains
Explicitly stated as the starting point for the discrete adjoint extension.
ad hoc to paper Discrete adjoint dynamics exist and can propagate optimization signals through discrete generative trajectories with low variance
This is the key technical step introduced to enable unsupervised training.

pith-pipeline@v0.9.1-grok · 5662 in / 1345 out tokens · 25034 ms · 2026-06-28T23:43:54.925255+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

16 extracted references · 8 canonical work pages

[1]

Applegate, D., Bixby, R., Chv ´atal, V ., and Cook, W

doi: https://doi.org/10.1007/978-1-4612-3038-0. Applegate, D., Bixby, R., Chv ´atal, V ., and Cook, W. Concorde TSP solver. https://www.math. uwaterloo.ca/tsp/concorde/index.html,

work page doi:10.1007/978-1-4612-3038-0
[2]

Barab´asi, A.-L

Software package. Barab´asi, A.-L. and Albert, R. Emergence of scal- ing in random networks.Science, 286(5439): 509–512, 1999. doi: 10.1126/science.286.5439

work page doi:10.1126/science.286.5439 1999
[3]

Bellman, R

URL https://www.science.org/doi/ abs/10.1126/science.286.5439.509. Bellman, R. The theory of dynamic programming.Bul- letin of the American Mathematical Society, 60:503–515,

work page doi:10.1126/science.286.5439.509
[4]

org/CorpusID:120598356

URL https://api.semanticscholar. org/CorpusID:120598356. B¨other, M., Kißig, O., Taraz, M., Cohen, S., Seidel, K., and Friedrich, T. What’s wrong with deep learning in tree search for combinatorial optimization. InIn- ternational Conference on Learning Representations,
[5]

Cai, T., Luo, S., Xu, K., He, D., Liu, T.-Y ., and Wang, L

URL https://openreview.net/forum? id=mk0HzdqY7i1. Cai, T., Luo, S., Xu, K., He, D., Liu, T.-Y ., and Wang, L. Graphnorm: A principled approach to accelerating graph neural network training. In Meila, M. and Zhang, T. (eds.),Proceedings of the 38th International Conference on Machine Learning, volume 139 ofProceedings of Machine Learning Research, pp. 1204...

2021
[6]

Feng, S., Sun, W., Li, S., Talwalkar, A., and Yang, Y

URL https://openreview.net/forum? id=bbJ0QCujU4. Feng, S., Sun, W., Li, S., Talwalkar, A., and Yang, Y . A comprehensive evaluation of contemporary ml-based solvers for combinatorial optimization, 2025. URL https://arxiv.org/abs/2505.16952. Fleming, W. H. and Rishel, R. Deterministic and stochastic optimal control. 1975. URL https: //api.semanticscholar.o...

work page arXiv 2025
[7]

cc/paper_files/paper/2020/file/ 10 Combinatorial Adjoint Matching 4c5bcfec8584af0d967f1ab10179ca4b-Paper

URL https://proceedings.neurips. cc/paper_files/paper/2020/file/ 10 Combinatorial Adjoint Matching 4c5bcfec8584af0d967f1ab10179ca4b-Paper. pdf. Joshi, C. K., Laurent, T., and Bresson, X. An efficient graph convolutional network technique for the travelling sales- man problem, 2019. URL https://arxiv.org/ abs/1906.01227. Kappen, H. J. Path integrals and sy...

work page doi:10.1088/1742-5468/2005/11/p11011 2020
[8]

cc/paper_files/paper/2020/file/ f231f2107df69eab0a3862d50018a9b2-Paper

URL https://proceedings.neurips. cc/paper_files/paper/2020/file/ f231f2107df69eab0a3862d50018a9b2-Paper. pdf. Li, H., Yuan, H., Zhang, H., Lin, J., Ge, D., Wang, M., and Ye, Y . Fmip: Joint continuous-integer flow for mixed-integer linear programming, 2025. URL https: //arxiv.org/abs/2507.23390. Li, Y ., Guo, J., Wang, R., and Yan, J. From distribution le...

work page arXiv 2020
[9]

Li, Z., Chen, Q., and Koltun, V

URL https://openreview.net/forum? id=9JM03CQwzC. Li, Z., Chen, Q., and Koltun, V . Combinatorial op- timization with graph convolutional networks and guided tree search. In Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R. (eds.),Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc.,
[10]

cc/paper_files/paper/2018/file/ 8d3bba7425e7c98c50f52ca1b52d3735-Paper

URL https://proceedings.neurips. cc/paper_files/paper/2018/file/ 8d3bba7425e7c98c50f52ca1b52d3735-Paper. pdf. Lipman, Y ., Chen, R. T. Q., Ben-Hamu, H., Nickel, M., and Le, M. Flow matching for generative modeling. InThe Eleventh International Conference on Learning Represen- tations, 2023. URL https://openreview.net/ forum?id=PqvMRDCJT9t. Liu, L., Jiang,...

2018
[11]

Sanokowski, S., Hochreiter, S., and Lehner, S

URL https://openreview.net/forum? id=9u05zr0nhx. Sanokowski, S., Hochreiter, S., and Lehner, S. A diffu- sion model framework for unsupervised neural combi- natorial optimization. InICML, 2024. URL https: //openreview.net/forum?id=AFfXlKFHXJ. Sanokowski, S., Berghammer, W. F., Wang, H. P., En- nemoser, M., Hochreiter, S., and Lehner, S. Scalable dis- cret...

work page doi:10.1007/0-387-29903-3 2024
[12]

Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., and Ganguli, S

URL https://openreview.net/forum? id=VXB4xxAgOf. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., and Ganguli, S. Deep unsupervised learning using nonequi- librium thermodynamics. In Bach, F. and Blei, D. (eds.), Proceedings of the 32nd International Conference on Ma- chine Learning, volume 37 ofProceedings of Machine Learning Research, pp. 2256–2265, ...

2015
[13]

Sun, H., Dai, H., Xia, W., and Ramamurthy, A

URL https://openreview.net/forum? id=PxTIG12RRHS. Sun, H., Dai, H., Xia, W., and Ramamurthy, A. Path auxiliary proposal for MCMC in discrete space. In International Conference on Learning Representations,
[14]

URL https://openreview.net/forum? id=JSR-YDImK95. Sun, Z. and Yang, Y . DIFUSCO: Graph-based diffusion solvers for combinatorial optimization. InThirty-seventh Conference on Neural Information Processing Systems,
[15]

Vidal, T., Crainic, T

URL https://openreview.net/forum? id=JV8Ff0lgVV. Vidal, T., Crainic, T. G., Gendreau, M., Lahrichi, N., and Rei, W. A hybrid genetic algorithm for multidepot and periodic vehicle routing problems.Operations Research, 60(3):611–624, 2012. doi: 10.1287/opre.1120.1048. URL https://doi.org/10.1287/opre.1120. 1048. Welling, M. and Teh, Y . W. Bayesian learning...

work page doi:10.1287/opre.1120.1048 2012
[16]

org/CorpusID:2407346

URL https://api.semanticscholar. org/CorpusID:2407346. Ziebart, B. D., Maas, A., Bagnell, J. A., and Dey, A. K. Maximum entropy inverse reinforcement learning. In Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 3, AAAI’08, pp. 1433–1438. AAAI Press, 2008. ISBN 9781577353683. 12 Combinatorial Adjoint Matching A. Proofs of Th...

2008

[1] [1]

Applegate, D., Bixby, R., Chv ´atal, V ., and Cook, W

doi: https://doi.org/10.1007/978-1-4612-3038-0. Applegate, D., Bixby, R., Chv ´atal, V ., and Cook, W. Concorde TSP solver. https://www.math. uwaterloo.ca/tsp/concorde/index.html,

work page doi:10.1007/978-1-4612-3038-0

[2] [2]

Barab´asi, A.-L

Software package. Barab´asi, A.-L. and Albert, R. Emergence of scal- ing in random networks.Science, 286(5439): 509–512, 1999. doi: 10.1126/science.286.5439

work page doi:10.1126/science.286.5439 1999

[3] [3]

Bellman, R

URL https://www.science.org/doi/ abs/10.1126/science.286.5439.509. Bellman, R. The theory of dynamic programming.Bul- letin of the American Mathematical Society, 60:503–515,

work page doi:10.1126/science.286.5439.509

[4] [4]

org/CorpusID:120598356

URL https://api.semanticscholar. org/CorpusID:120598356. B¨other, M., Kißig, O., Taraz, M., Cohen, S., Seidel, K., and Friedrich, T. What’s wrong with deep learning in tree search for combinatorial optimization. InIn- ternational Conference on Learning Representations,

[5] [5]

Cai, T., Luo, S., Xu, K., He, D., Liu, T.-Y ., and Wang, L

URL https://openreview.net/forum? id=mk0HzdqY7i1. Cai, T., Luo, S., Xu, K., He, D., Liu, T.-Y ., and Wang, L. Graphnorm: A principled approach to accelerating graph neural network training. In Meila, M. and Zhang, T. (eds.),Proceedings of the 38th International Conference on Machine Learning, volume 139 ofProceedings of Machine Learning Research, pp. 1204...

2021

[6] [6]

Feng, S., Sun, W., Li, S., Talwalkar, A., and Yang, Y

URL https://openreview.net/forum? id=bbJ0QCujU4. Feng, S., Sun, W., Li, S., Talwalkar, A., and Yang, Y . A comprehensive evaluation of contemporary ml-based solvers for combinatorial optimization, 2025. URL https://arxiv.org/abs/2505.16952. Fleming, W. H. and Rishel, R. Deterministic and stochastic optimal control. 1975. URL https: //api.semanticscholar.o...

work page arXiv 2025

[7] [7]

cc/paper_files/paper/2020/file/ 10 Combinatorial Adjoint Matching 4c5bcfec8584af0d967f1ab10179ca4b-Paper

URL https://proceedings.neurips. cc/paper_files/paper/2020/file/ 10 Combinatorial Adjoint Matching 4c5bcfec8584af0d967f1ab10179ca4b-Paper. pdf. Joshi, C. K., Laurent, T., and Bresson, X. An efficient graph convolutional network technique for the travelling sales- man problem, 2019. URL https://arxiv.org/ abs/1906.01227. Kappen, H. J. Path integrals and sy...

work page doi:10.1088/1742-5468/2005/11/p11011 2020

[8] [8]

cc/paper_files/paper/2020/file/ f231f2107df69eab0a3862d50018a9b2-Paper

URL https://proceedings.neurips. cc/paper_files/paper/2020/file/ f231f2107df69eab0a3862d50018a9b2-Paper. pdf. Li, H., Yuan, H., Zhang, H., Lin, J., Ge, D., Wang, M., and Ye, Y . Fmip: Joint continuous-integer flow for mixed-integer linear programming, 2025. URL https: //arxiv.org/abs/2507.23390. Li, Y ., Guo, J., Wang, R., and Yan, J. From distribution le...

work page arXiv 2020

[9] [9]

Li, Z., Chen, Q., and Koltun, V

URL https://openreview.net/forum? id=9JM03CQwzC. Li, Z., Chen, Q., and Koltun, V . Combinatorial op- timization with graph convolutional networks and guided tree search. In Bengio, S., Wallach, H., Larochelle, H., Grauman, K., Cesa-Bianchi, N., and Garnett, R. (eds.),Advances in Neural Information Processing Systems, volume 31. Curran Associates, Inc.,

[10] [10]

cc/paper_files/paper/2018/file/ 8d3bba7425e7c98c50f52ca1b52d3735-Paper

URL https://proceedings.neurips. cc/paper_files/paper/2018/file/ 8d3bba7425e7c98c50f52ca1b52d3735-Paper. pdf. Lipman, Y ., Chen, R. T. Q., Ben-Hamu, H., Nickel, M., and Le, M. Flow matching for generative modeling. InThe Eleventh International Conference on Learning Represen- tations, 2023. URL https://openreview.net/ forum?id=PqvMRDCJT9t. Liu, L., Jiang,...

2018

[11] [11]

Sanokowski, S., Hochreiter, S., and Lehner, S

URL https://openreview.net/forum? id=9u05zr0nhx. Sanokowski, S., Hochreiter, S., and Lehner, S. A diffu- sion model framework for unsupervised neural combi- natorial optimization. InICML, 2024. URL https: //openreview.net/forum?id=AFfXlKFHXJ. Sanokowski, S., Berghammer, W. F., Wang, H. P., En- nemoser, M., Hochreiter, S., and Lehner, S. Scalable dis- cret...

work page doi:10.1007/0-387-29903-3 2024

[12] [12]

Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., and Ganguli, S

URL https://openreview.net/forum? id=VXB4xxAgOf. Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., and Ganguli, S. Deep unsupervised learning using nonequi- librium thermodynamics. In Bach, F. and Blei, D. (eds.), Proceedings of the 32nd International Conference on Ma- chine Learning, volume 37 ofProceedings of Machine Learning Research, pp. 2256–2265, ...

2015

[13] [13]

Sun, H., Dai, H., Xia, W., and Ramamurthy, A

URL https://openreview.net/forum? id=PxTIG12RRHS. Sun, H., Dai, H., Xia, W., and Ramamurthy, A. Path auxiliary proposal for MCMC in discrete space. In International Conference on Learning Representations,

[14] [14]

URL https://openreview.net/forum? id=JSR-YDImK95. Sun, Z. and Yang, Y . DIFUSCO: Graph-based diffusion solvers for combinatorial optimization. InThirty-seventh Conference on Neural Information Processing Systems,

[15] [15]

Vidal, T., Crainic, T

URL https://openreview.net/forum? id=JV8Ff0lgVV. Vidal, T., Crainic, T. G., Gendreau, M., Lahrichi, N., and Rei, W. A hybrid genetic algorithm for multidepot and periodic vehicle routing problems.Operations Research, 60(3):611–624, 2012. doi: 10.1287/opre.1120.1048. URL https://doi.org/10.1287/opre.1120. 1048. Welling, M. and Teh, Y . W. Bayesian learning...

work page doi:10.1287/opre.1120.1048 2012

[16] [16]

org/CorpusID:2407346

URL https://api.semanticscholar. org/CorpusID:2407346. Ziebart, B. D., Maas, A., Bagnell, J. A., and Dey, A. K. Maximum entropy inverse reinforcement learning. In Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 3, AAAI’08, pp. 1433–1438. AAAI Press, 2008. ISBN 9781577353683. 12 Combinatorial Adjoint Matching A. Proofs of Th...

2008