Recognition: unknown
QUBO-Based Calibration for Regression Trees
Pith reviewed 2026-05-08 03:08 UTC · model grok-4.3
The pith
Reformulating split selection as a QUBO problem lets solvers find higher-quality splits for regression trees than CART's greedy heuristics while keeping comparable accuracy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the fractional combinatorial optimization problem arising in CART split selection for categorical variables can be exactly transformed via Dinkelbach's algorithm into a QUBO formulation. When this QUBO is solved with current solvers, the constructed regression trees achieve out-of-sample performance comparable to standard CART while returning split solutions that are superior in quality to those produced by the usual greedy heuristics.
What carries the argument
Dinkelbach's algorithm, which iteratively replaces the fractional objective of split selection with a sequence of QUBO problems whose solutions converge to the optimum of the original problem.
Load-bearing premise
Dinkelbach's reformulation preserves the exact optimal solution of the original fractional split-selection objective and current QUBO solvers reliably return higher-quality splits than CART's greedy heuristics.
What would settle it
Run both methods on the same collection of datasets with high-cardinality categorical predictors, measure the objective value achieved at each split, and check whether the QUBO solutions consistently yield strictly better objective values while test-set performance stays within statistical noise of CART.
Figures
read the original abstract
Tree-based regression models are widely used in supervised learning, with the Classification and Regression Tree (CART) algorithm serving as a standard reference. CART construction involves solving a sequence of split-selection optimization problems. For categorical predictors, this problem can be formulated as a combinatorial fractional optimization problem. This structure makes the exact optimization computationally challenging and leads to standard implementations that rely on greedy heuristics, which may result in suboptimal splits. In this work, we reformulate this fractional problem and apply Dinkelbach (1967) algorithm to convert it into a Quadratic Unconstrained Binary Optimization (QUBO) problem. Using state-of-the-art QUBO solvers, we obtain QUBO-based regression trees with predictive performance comparable to standard CART while yielding higher-quality split solutions. These results highlight the potential of QUBO formulations for improving tree-based learning methods and open perspectives for future hybrid classical--quantum implementations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims to reformulate the split-selection optimization problem for categorical predictors in regression trees as a combinatorial fractional optimization problem, which is then converted into a QUBO using Dinkelbach's algorithm. Using state-of-the-art QUBO solvers, the resulting trees are claimed to have predictive performance comparable to standard CART while providing higher-quality splits.
Significance. If the empirical claims hold, this work would be significant as it introduces a novel optimization-based approach to improve upon the greedy heuristics in CART for handling categorical features, potentially leading to better tree structures. It also suggests future directions for hybrid classical-quantum implementations in machine learning.
major comments (2)
- Abstract: The assertions regarding comparable predictive performance and higher-quality split solutions lack any quantitative support, such as metrics on specific datasets, comparisons to baselines, or analysis of split quality. This is critical as the central contribution rests on these unverified claims.
- Dinkelbach reformulation: The manuscript does not discuss the requirement for exact global optima in each iteration of Dinkelbach's algorithm to preserve the original fractional objective. Since QUBO solvers are heuristic and may not guarantee optimality, especially for large category cardinalities, the claimed superiority of splits may not materialize.
Simulated Author's Rebuttal
We thank the referee for their constructive comments and the opportunity to improve the manuscript. We provide point-by-point responses to the major comments below, indicating revisions where appropriate.
read point-by-point responses
-
Referee: Abstract: The assertions regarding comparable predictive performance and higher-quality split solutions lack any quantitative support, such as metrics on specific datasets, comparisons to baselines, or analysis of split quality. This is critical as the central contribution rests on these unverified claims.
Authors: The abstract provides a high-level overview, while the full manuscript contains detailed experimental results on multiple datasets, including quantitative comparisons of predictive performance (e.g., MSE values) against standard CART and metrics for split quality based on the objective function. To address the concern, we will revise the abstract to incorporate brief quantitative highlights drawn from the experimental section, such as average performance metrics and improvements in split quality. revision: yes
-
Referee: Dinkelbach reformulation: The manuscript does not discuss the requirement for exact global optima in each iteration of Dinkelbach's algorithm to preserve the original fractional objective. Since QUBO solvers are heuristic and may not guarantee optimality, especially for large category cardinalities, the claimed superiority of splits may not materialize.
Authors: We appreciate this observation. Dinkelbach's algorithm converges to the global optimum of the fractional program when each subproblem is solved exactly. Our approach employs heuristic QUBO solvers, which may return approximate solutions. In the revised manuscript, we have added a dedicated paragraph in the methods section discussing this aspect, including notes on the theoretical implications of approximation and empirical evidence from our experiments showing that the obtained splits remain higher-quality than those from the greedy CART heuristic, as evaluated by the objective value, even for the tested category cardinalities. revision: yes
Circularity Check
No circularity: direct mathematical reformulation using external algorithm
full rationale
The paper's core step is reformulating the categorical split-selection problem (a combinatorial fractional program) via Dinkelbach's 1967 algorithm into an equivalent QUBO, then invoking external QUBO solvers. This transformation is a standard parametric reduction whose correctness is independent of any fitted parameters, self-citations, or redefinitions within the paper. No claimed result (split quality or tree performance) is shown to equal its own inputs by construction; the equivalence follows from the cited external theorem rather than from any internal loop or renaming. The derivation chain therefore remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption Dinkelbach's algorithm converts the fractional optimization problem for categorical splits into an equivalent QUBO
Reference graph
Works this paper leans on
-
[1]
Quantum computing based hybrid solution strategies for large-scale discrete-continuous optimization problems
Akshay Ajagekar, Travis Humble, and Fengqi You. Quantum computing based hybrid solution strategies for large-scale discrete-continuous optimization problems. Computers & Chemical Engineering, 132: 0 106630, 2020
2020
-
[2]
Aoki and M
T. Aoki and M. Ohzeki. A qubo formulation of the k-medoids problem. Scientific Reports, 2021
2021
-
[3]
Translating constraints into qubos for the quadratic knapsack problem
Tariq Bontekoe, Frank Phillipson, and Ward van der Schoot. Translating constraints into qubos for the quadratic knapsack problem. In International Conference on Computational Science, pages 90--107. Springer, 2023
2023
-
[4]
Pruning decision trees with misclassification costs
Jeffrey P Bradford, Clayton Kunz, Ron Kohavi, Cliff Brunk, and Carla E Brodley. Pruning decision trees with misclassification costs. In European Conference on Machine Learning, pages 131--136. Springer, 1998
1998
-
[5]
Breiman, J
L. Breiman, J. Friedman, R. Olshen, and C. Stone. Classification and Regression Trees. Chapman & Hall, New York, 1984
1984
-
[6]
Regression tree credibility model
Liqun Diao and Chengguo Weng. Regression tree credibility model. North American Actuarial Journal, 23 0 (2): 0 169--196, 2019
2019
-
[7]
Dinkelbach
W. Dinkelbach. On nonlinear fractional programming. Management Science, 13 0 (7): 0 492--498, 1967
1967
-
[8]
Quantum bridge analytics i: a tutorial on formulating and using qubo models
Fred Glover, Gary Kochenberger, Rick Hennig, and Yu Du. Quantum bridge analytics i: a tutorial on formulating and using qubo models. Annals of Operations Research, 314 0 (1): 0 141--183, 2022
2022
-
[9]
The elements of statistical learning: data mining, inference, and prediction, volume 2
Trevor Hastie, Robert Tibshirani, Jerome H Friedman, and Jerome H Friedman. The elements of statistical learning: data mining, inference, and prediction, volume 2. Springer, 2009
2009
-
[10]
An introduction to statistical learning with applications in r, 2013
Gareth James. An introduction to statistical learning with applications in r, 2013
2013
-
[11]
Bagging and regression trees in individual claims reserving: J
Jan Janou s ek and Michal Pe s ta. Bagging and regression trees in individual claims reserving: J. janou s ek and m. pe s ta. Statistical Papers, 66 0 (4): 0 89, 2025
2025
-
[12]
Quantum annealing in the transverse Ising model
Tadashi Kadowaki and Hidetoshi Nishimori. Quantum annealing in the transverse Ising model. Physical Review E, 58 0 (5): 0 5355, 1998
1998
-
[13]
An exploratory technique for investigating large quantities of categorical data
Gordon V Kass. An exploratory technique for investigating large quantities of categorical data. Journal of the royal statistical society: series c (Applied statistics), 29 0 (2): 0 119--127, 1980
1980
-
[14]
The unconstrained binary quadratic programming problem: a survey
Gary Kochenberger, Jin-Kao Hao, Fred Glover, Mark Lewis, Zhipeng L \"u , Haibo Wang, and Yang Wang. The unconstrained binary quadratic programming problem: a survey. Journal of Combinatorial Optimization, 28 0 (1): 0 58--81, 2014
2014
-
[15]
S. B. Kotsiantis. Supervised machine learning: A review of classification techniques. Informatica, 31: 0 249--268, 2007
2007
-
[16]
Fifty years of classification and regression trees
Wei-Yin Loh. Fifty years of classification and regression trees. International Statistical Review, 82 0 (3): 0 329--348, 2014
2014
-
[17]
Tree-based censored regression with applications in insurance
Olivier Lopez, Xavier Milhaud, and Pierre-E Th \'e rond. Tree-based censored regression with applications in insurance. Electronic Journal of Statistics, 10: 0 2685–2716, 2016
2016
-
[18]
Ising formulations of many NP problems
Andrew Lucas. Ising formulations of many NP problems. Frontiers in Physics, 2: 0 5, 2014
2014
-
[19]
Distance-based clustering using qubo formulations
Nasa Matsumoto, Yohei Hamakawa, Kosuke Tatsumura, and Kazue Kudo. Distance-based clustering using qubo formulations. Scientific reports, 12 0 (1): 0 2669, 2022
2022
-
[20]
Otsuka and S
H. Otsuka and S. Tanaka. A qubo model for gaussian process variance reduction. Quantum Information Processing, 2020
2020
-
[21]
Portfolio optimisation using the d-wave quantum annealer
Frank Phillipson and Harshil Singh Bhatia. Portfolio optimisation using the d-wave quantum annealer. In International Conference on Computational Science, pages 45--59. Springer, 2021
2021
-
[22]
J. R. Quinlan. Induction of decision trees. Machine Learning, 1 0 (1): 0 81--106, 1986
1986
-
[23]
Ross Quinlan
J. Ross Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993
1993
-
[24]
Fractional programming
Siegfried Schaible. Fractional programming. i, duality. Management science, 22 0 (8): 0 858--867, 1976
1976
-
[25]
Performance comparison of typical binary-integer encodings in an ising machine
Kensuke Tamura, Tatsuhiko Shirai, Hosho Katsura, Shu Tanaka, and Nozomu Togawa. Performance comparison of typical binary-integer encodings in an ising machine. IEEE Access, 9: 0 81032--81039, 2021
2021
-
[26]
User written splitting functions for rpart
Terry Therneau et al. User written splitting functions for rpart. Technical report, Technical report, Mayo Clinic, 2019
2019
-
[27]
An introduction to recursive partitioning using the rpart routines
Terry M Therneau, Elizabeth J Atkinson, et al. An introduction to recursive partitioning using the rpart routines. Technical report, Technical report Mayo Foundation, 1997
1997
-
[28]
P. M. Xavier, Y. Park, D. Noble, I. Santana, M. Paredes, and D. E. B. Neira. A performance comparison of variable encoding techniques for QUIO and QUBO problems. In Frontiers in Industrial Engineering. Purdue Quantum AI, Communications in Computer and Information Science. Springer, 2026. ISBN 978-981-95782-8-3. In press
2026
-
[29]
Qubo decision tree: Annealing machine extends decision tree splitting
Koichiro Yawata, Yoshihiro Osakabe, Takuya Okuyama, and Akinori Asahara. Qubo decision tree: Annealing machine extends decision tree splitting. In 2022 IEEE international conference on knowledge graph (ICKG), pages 355--364. IEEE, 2022
2022
-
[30]
Dinkelbach's algorithm as an efficient method to solve a class of minlp models for large-scale cyclic scheduling problems
Fengqi You, Pedro M Castro, and Ignacio E Grossmann. Dinkelbach's algorithm as an efficient method to solve a class of minlp models for large-scale cyclic scheduling problems. Computers & Chemical Engineering, 33 0 (11): 0 1879--1889, 2009
2009
-
[31]
Globally convergent exact and inexact parametric algorithms for solving large-scale mixed-integer fractional programs and applications in process systems engineering
Zhixia Zhong and Fengqi You. Globally convergent exact and inexact parametric algorithms for solving large-scale mixed-integer fractional programs and applications in process systems engineering. Computers & Chemical Engineering, 61: 0 90--101, 2014
2014
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.