CART impurity criteria are reframed as Bregman divergences from convex functions, unifying node representatives, split rules, and consistency results for exponential-family models.
Re-examination of Bregman functions and new properties of their divergences
1 Pith paper cite this work. Polarity classification is still indexing.
abstract
The Bregman divergence (Bregman distance, Bregman measure of distance) is a certain useful substitute for a distance, obtained from a well-chosen function (the "Bregman function"). Bregman functions and divergences have been extensively investigated during the last decades and have found applications in optimization, operations research, information theory, nonlinear analysis, machine learning and more. This paper re-examines various aspects related to the theory of Bregman functions and divergences. In particular, it presents many sufficient conditions which allow the construction of Bregman functions in a general setting and introduces new Bregman functions (such as a negative iterated log entropy). Moreover, it sheds new light on several known Bregman functions such as quadratic entropies, the negative Havrda-Charv\'at-Tsallis entropy, and the negative Boltzmann-Gibbs-Shannon entropy, and it shows that the negative Burg entropy, which is not a Bregman function according to the classical theory but nevertheless is known to have "Bregmanian properties", can, by our re-examination of the theory, be considered as a Bregman function. Our analysis yields several by-products of independent interest such as the introduction of the concept of relative uniform convexity (a certain generalization of uniform convexity), new properties of uniformly and strongly convex functions, and results in Banach space theory.
fields
stat.ML 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
A Bregman Perspective on Classification and Regression Trees
CART impurity criteria are reframed as Bregman divergences from convex functions, unifying node representatives, split rules, and consistency results for exponential-family models.