Optimal Explanations of Linear Models

Arthur Delarue; Dimitris Bertsimas; Patrick Jaillet; Sebastien Martin

arxiv: 1907.04669 · v1 · pith:3OBIAI2Xnew · submitted 2019-07-08 · 💻 cs.LG · stat.ML

Optimal Explanations of Linear Models

Dimitris Bertsimas , Arthur Delarue , Patrick Jaillet , Sebastien Martin This is my paper

Pith reviewed 2026-05-25 01:20 UTC · model grok-4.3

classification 💻 cs.LG stat.ML

keywords linear modelsmodel interpretabilityexplanationsoptimizationcoordinate updatesaccuracy-interpretability tradeoffmachine learning

0 comments

The pith

Linear models can be explained by decomposing them into a sequence of simpler models through successive coordinate updates on their coefficients.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper sets out a general optimization framework that turns any linear model into an ordered chain of models, each one more complex than the last. The chain is built by repeatedly updating a single coefficient at a time until the original model is recovered. A reader would care because the method converts the loose idea of interpretability into a concrete, solvable problem that comes with exact algorithms and fast heuristics. It also supplies a tunable family of metrics for how easy a model is to explain while keeping its predictions intact.

Core claim

The central claim is that explanations for linear models are obtained by solving an optimization problem whose solution is a sequence of coordinate updates; this sequence yields a parametrized family of interpretability metrics and makes the accuracy-interpretability tradeoff explicit and computable.

What carries the argument

The decomposition of a linear model into a sequence of models of increasing complexity obtained by coordinate updates on the coefficients, computed by exact algorithms or scalable heuristics.

If this is right

The framework produces a family of interpretability metrics that generalize typical proxies used in practice.
Exact and heuristic algorithms exist to compute the optimal decomposition for any given linear model.
The approach quantifies the tradeoff between how interpretable a model is and how much predictive accuracy it retains.
The resulting explanations can be used to support complex decisions by revealing the model's reasoning step by step.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same decomposition idea could be tested on generalized linear models to see whether the coordinate-update sequence still yields useful explanations.
Organizations could adopt the derived metrics as a standardized way to audit the explainability of deployed linear models.
If the sequences align with human judgments on benchmark tasks, they might reduce vulnerability to adversarial attacks by making the model's logic transparent at each step.

Load-bearing premise

That the sequence of models produced by coordinate updates on the coefficients forms a meaningful explanation of the original model's reasoning rather than an arbitrary reordering of terms.

What would settle it

Apply the method to a small linear regression on a dataset whose intuitive feature ordering is already known and check whether the generated sequence deviates from that ordering on more than a small fraction of runs.

Figures

Figures reproduced from arXiv: 1907.04669 by Arthur Delarue, Dimitris Bertsimas, Patrick Jaillet, Sebastien Martin.

**Figure 2.** Figure 2: Pareto front between interpretability loss [PITH_FULL_IMAGE:figures/full_fig_p006_2.png] view at source ↗

**Figure 3.** Figure 3: Example of a Pareto-efficient coordinate path. On the left we see the benefits of each coefficient modification. [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗

**Figure 4.** Figure 4: Comparison of interpretable path and sequence of models of increasing sparsity selected by LASSO. Ftemp is [PITH_FULL_IMAGE:figures/full_fig_p008_4.png] view at source ↗

read the original abstract

When predictive models are used to support complex and important decisions, the ability to explain a model's reasoning can increase trust, expose hidden biases, and reduce vulnerability to adversarial attacks. However, attempts at interpreting models are often ad hoc and application-specific, and the concept of interpretability itself is not well-defined. We propose a general optimization framework to create explanations for linear models. Our methodology decomposes a linear model into a sequence of models of increasing complexity using coordinate updates on the coefficients. Computing this decomposition optimally is a difficult optimization problem for which we propose exact algorithms and scalable heuristics. By solving this problem, we can derive a parametrized family of interpretability metrics for linear models that generalizes typical proxies, and study the tradeoff between interpretability and predictive accuracy.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper turns linear-model explanations into an optimization problem that decomposes the model via coordinate updates on coefficients, but the step from that decomposition to genuine explanatory power is not strongly justified.

read the letter

The core contribution is a framing that treats explanation as the task of recovering a sequence of partial linear models, built one coordinate at a time, that approximate the original predictor while trading off some accuracy measure. They then give exact and heuristic algorithms for that decomposition and show it induces a parametrized family of interpretability scores that include several existing proxies as special cases.

Referee Report

1 major / 2 minor

Summary. The manuscript proposes a general optimization framework for explaining linear models. It decomposes a given linear model into a sequence of models of increasing complexity by performing coordinate updates on the coefficients. Exact algorithms and scalable heuristics are developed to solve the resulting optimization problem. From the optimal decomposition the authors derive a parametrized family of interpretability metrics that generalize common proxies and examine the resulting interpretability-accuracy tradeoff.

Significance. If the coordinate-update decomposition can be shown to produce sequences that capture model reasoning in a non-arbitrary way, the framework would supply a principled, optimization-based route to interpretability metrics for linear models together with concrete algorithms. The algorithmic component and the explicit parametrization of metrics are concrete strengths.

major comments (1)

[Abstract and §3] Abstract and §3 (methodology): the central claim that the coordinate-update sequence constitutes an 'explanation' of the original model's reasoning rests on the unstated assumption that the ordering induced by successive coordinate updates aligns with semantic reasoning steps. No linking criterion, invariance property, or validation is supplied showing why this ordering is explanatory rather than an arbitrary reordering of coefficients; this assumption is load-bearing for the claim that the framework produces explanations rather than merely parametrized orderings.

minor comments (2)

Notation for the parametrized interpretability metrics is introduced without an explicit list of the free parameters or a table showing how common proxies arise as special cases.
The abstract states that the decomposition is computed 'optimally,' yet the distinction between the exact algorithm and the scalable heuristic is not quantified in terms of approximation guarantees or empirical gap on the objective.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive review. We respond to the major comment below.

read point-by-point responses

Referee: [Abstract and §3] Abstract and §3 (methodology): the central claim that the coordinate-update sequence constitutes an 'explanation' of the original model's reasoning rests on the unstated assumption that the ordering induced by successive coordinate updates aligns with semantic reasoning steps. No linking criterion, invariance property, or validation is supplied showing why this ordering is explanatory rather than an arbitrary reordering of coefficients; this assumption is load-bearing for the claim that the framework produces explanations rather than merely parametrized orderings.

Authors: The framework defines an explanation as the sequence obtained by solving the stated optimization problem, which yields a decomposition that is optimal with respect to an objective trading off fidelity to the original model against a complexity penalty. This optimality supplies a non-arbitrary criterion for the induced ordering. The parametrized interpretability metrics are then derived directly from properties of any such optimal sequence and are shown to generalize standard proxies. The manuscript does not claim or require that the ordering matches semantic reasoning steps of a human interpreter; its contribution is the optimization-based construction itself and the resulting metrics for the interpretability-accuracy tradeoff. We will revise the abstract and §3 to state this grounding explicitly and to remove any phrasing that could suggest semantic alignment. revision: yes

Circularity Check

0 steps flagged

No circularity: forward proposal of optimization framework with independent derivation of metrics

full rationale

The paper introduces a new optimization-based decomposition of linear models via coordinate updates on coefficients and then derives a parametrized family of interpretability metrics from that construction. No step reduces a claimed prediction, uniqueness result, or metric to a fitted quantity or self-citation defined inside the paper; the central claim is a methodological proposal whose validity rests on external interpretability criteria rather than internal redefinition. The derivation chain is self-contained against the stated inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; free parameters, axioms, and invented entities cannot be enumerated from the given text.

pith-pipeline@v0.9.0 · 5655 in / 867 out tokens · 16913 ms · 2026-05-25T01:20:56.000979+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

30 extracted references · 30 canonical work pages · 4 internal anchors

[1]

Does machine learning automate moral hazard and error? American Economic Review, 107(5):476–480, 2017

Sendhil Mullainathan and Ziad Obermeyer. Does machine learning automate moral hazard and error? American Economic Review, 107(5):476–480, 2017

work page 2017
[2]

Human decisions and machine predictions

Jon Kleinberg, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan. Human decisions and machine predictions. The quarterly journal of economics, 133(1):237–293, 2017

work page 2017
[3]

An impact assessment of machine learning risk forecasts on parole board decisions and recidivism

Richard Berk. An impact assessment of machine learning risk forecasts on parole board decisions and recidivism. Journal of Experimental Criminology, 13(2):193–216, 2017

work page 2017
[4]

Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them

Berkeley J Dietvorst, Joseph P Simmons, and Cade Massey. Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Science, 64(3):1155–1170, nov 2016

work page 2016
[5]

Alex A. Freitas. Comprehensible classiﬁcation models. ACM SIGKDD Explorations Newsletter, 15(1):1–10, 2014

work page 2014
[6]

Zachary C. Lipton. The Mythos of Model Interpretability. arXiv preprint arXiv:1606.03490, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016
[7]

Selected techniques for data mining in medicine

Nada Lavra ˇc. Selected techniques for data mining in medicine. Artiﬁcial intelligence in medicine, 16(1):3–23, 1999

work page 1999
[8]

right to explanation

Bryce Goodman and Seth Flaxman. European Union regulations on algorithmic decision-making and a "right to explanation". pages 1–9, 2016

work page 2016
[9]

Statistical learning with sparsity: the lasso and generalizations

Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Statistical learning with sparsity: the lasso and generalizations. CRC press, 2015

work page 2015
[10]

Classiﬁcation and regression trees

Leo Breiman. Classiﬁcation and regression trees. New York: Routledge, 1984

work page 1984
[11]

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classiﬁcation

Been Kim, Cynthia Rudin, and Julie Shah. The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classiﬁcation. In Neural Information Processing Systems (NIPS) 2014, 2014

work page 2014
[12]

McCormick, and David Madigan

Benjamin Letham, Cynthia Rudin, Tyler H. McCormick, and David Madigan. Interpretable classiﬁers using rules and bayesian analysis: Building a better stroke prediction model. Annals of Applied Statistics, 9(3):1350–1371, 2015

work page 2015
[13]

Scalable Bayesian Rule Lists

Hongyu Yang, Cynthia Rudin, and Margo Seltzer. Scalable Bayesian Rule Lists. In Proceedings of the 34th International Conference on Machine Learning, 2017

work page 2017
[14]

Why Should I Trust You?

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why Should I Trust You?” Explaining the Predictions of Any Classiﬁer. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1135–1144, 2016

work page 2016
[15]

The elements of statistical learning

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Springer series in statistics New York, NY , USA:, 2001

work page 2001
[16]

Algorithmic Transparency via Quantitative Input Inﬂuence :

Anupam Datta, Shayak Sen, and Yair Zick. Algorithmic Transparency via Quantitative Input Inﬂuence :. In 2016 IEEE Symposium on Security and Privacy, 2016

work page 2016
[17]

Interpreting Blackbox Models via Model Extraction

Hamsa Bastani, Osbert Bastani, and Carolyn Kim. Interpreting Predictive Models for Human-in-the-Loop Analytics. arXiv preprint arXiv:1705.08504, pages 1–45, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[18]

Interpretable & Explorable Approximations of Black Box Models

Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec. Interpretable & Explorable Approximations of Black Box Models. FAT/ML, jul 2017

work page 2017
[19]

Model compression

Cristian Bucilˇa, Rich Caruana, and Alexandru Niculescu-Mizil. Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’06, page 535, New York, New York, USA, 2006. ACM, ACM Press

work page 2006
[20]

Tibshirani

Robert J. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996

work page 1996
[21]

Tibshirani

Jonathan Taylor and Robert J. Tibshirani. Statistical learning and selective inference. Proceedings of the National Academy of Sciences, 112(25):7629–7634, jun 2015

work page 2015
[22]

Least Angle Regression

Bradley Efron, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. Least Angle Regression. Annals of Statistics, 32(2):407– 499, apr 2004

work page 2004
[23]

Best subset selection via a modern optimization lens

Dimitris Bertsimas, Angela King, and Rahul Mazumder. Best subset selection via a modern optimization lens. Annals of Statistics, 44(2):813–852, 2016

work page 2016
[24]

Simple rules for complex decisions

Jongbin Jung, Connor Concannon, Ravi Shroff, Sharad Goel, and Daniel G Goldstein. Simple rules for complex decisions. feb 2017

work page 2017
[25]

Supersparse linear integer models for optimized medical scoring systems

Berk Ustun and Cynthia Rudin. Supersparse linear integer models for optimized medical scoring systems. Machine Learning, 102(3):349–391, 2016. 9 A PREPRINT - JULY 11, 2019

work page 2016
[26]

Statistical modeling: The two cultures

Leo Breiman. Statistical modeling: The two cultures. Statistical science, 16(3):199–231, 2001

work page 2001
[27]

Explaining Explanations: An Overview of Interpretability of Machine Learning

Leilani H Gilpin, David Bau, Ben Z Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. Explaining Explanations : An Approach to Evaluating Interpretability of Machine Learning. arXiv preprint arXiv:1806.00069, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018
[28]

Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez and Been Kim. Towards A Rigorous Science of Interpretable Machine Learning. arXiv preprint arXiv:1702.08608, (Ml):1–13, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[29]

I. Y . Kim and O. L. De Weck. Adaptive weighted-sum method for bi-objective optimization: Pareto front generation.Structural and Multidisciplinary Optimization, 29(2):149–158, 2005

work page 2005
[30]

Event labeling combining ensemble detectors and background knowledge.Progress in Artiﬁcial Intelligence, 2(2-3):113–127, 2014

Hadi Fanaee-T and Joao Gama. Event labeling combining ensemble detectors and background knowledge.Progress in Artiﬁcial Intelligence, 2(2-3):113–127, 2014. A Proof of Theorem 1 Proof of part (a). Asc(·) is bounded, we havecmax∈ R such that 0<c (·)≤cmax. Let m+∈P (m+) be a path of optimal length to the modelm+, i.e.,|m+| =Lc(m+). . Let m−∈P (m−) be any pat...

work page 2014

[1] [1]

Does machine learning automate moral hazard and error? American Economic Review, 107(5):476–480, 2017

Sendhil Mullainathan and Ziad Obermeyer. Does machine learning automate moral hazard and error? American Economic Review, 107(5):476–480, 2017

work page 2017

[2] [2]

Human decisions and machine predictions

Jon Kleinberg, Himabindu Lakkaraju, Jure Leskovec, Jens Ludwig, and Sendhil Mullainathan. Human decisions and machine predictions. The quarterly journal of economics, 133(1):237–293, 2017

work page 2017

[3] [3]

An impact assessment of machine learning risk forecasts on parole board decisions and recidivism

Richard Berk. An impact assessment of machine learning risk forecasts on parole board decisions and recidivism. Journal of Experimental Criminology, 13(2):193–216, 2017

work page 2017

[4] [4]

Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them

Berkeley J Dietvorst, Joseph P Simmons, and Cade Massey. Overcoming algorithm aversion: People will use imperfect algorithms if they can (even slightly) modify them. Management Science, 64(3):1155–1170, nov 2016

work page 2016

[5] [5]

Alex A. Freitas. Comprehensible classiﬁcation models. ACM SIGKDD Explorations Newsletter, 15(1):1–10, 2014

work page 2014

[6] [6]

Zachary C. Lipton. The Mythos of Model Interpretability. arXiv preprint arXiv:1606.03490, 2016

work page internal anchor Pith review Pith/arXiv arXiv 2016

[7] [7]

Selected techniques for data mining in medicine

Nada Lavra ˇc. Selected techniques for data mining in medicine. Artiﬁcial intelligence in medicine, 16(1):3–23, 1999

work page 1999

[8] [8]

right to explanation

Bryce Goodman and Seth Flaxman. European Union regulations on algorithmic decision-making and a "right to explanation". pages 1–9, 2016

work page 2016

[9] [9]

Statistical learning with sparsity: the lasso and generalizations

Trevor Hastie, Robert Tibshirani, and Martin Wainwright. Statistical learning with sparsity: the lasso and generalizations. CRC press, 2015

work page 2015

[10] [10]

Classiﬁcation and regression trees

Leo Breiman. Classiﬁcation and regression trees. New York: Routledge, 1984

work page 1984

[11] [11]

The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classiﬁcation

Been Kim, Cynthia Rudin, and Julie Shah. The Bayesian Case Model: A Generative Approach for Case-Based Reasoning and Prototype Classiﬁcation. In Neural Information Processing Systems (NIPS) 2014, 2014

work page 2014

[12] [12]

McCormick, and David Madigan

Benjamin Letham, Cynthia Rudin, Tyler H. McCormick, and David Madigan. Interpretable classiﬁers using rules and bayesian analysis: Building a better stroke prediction model. Annals of Applied Statistics, 9(3):1350–1371, 2015

work page 2015

[13] [13]

Scalable Bayesian Rule Lists

Hongyu Yang, Cynthia Rudin, and Margo Seltzer. Scalable Bayesian Rule Lists. In Proceedings of the 34th International Conference on Machine Learning, 2017

work page 2017

[14] [14]

Why Should I Trust You?

Marco Tulio Ribeiro, Sameer Singh, and Carlos Guestrin. “Why Should I Trust You?” Explaining the Predictions of Any Classiﬁer. In Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, pages 1135–1144, 2016

work page 2016

[15] [15]

The elements of statistical learning

Jerome Friedman, Trevor Hastie, and Robert Tibshirani. The elements of statistical learning. Springer series in statistics New York, NY , USA:, 2001

work page 2001

[16] [16]

Algorithmic Transparency via Quantitative Input Inﬂuence :

Anupam Datta, Shayak Sen, and Yair Zick. Algorithmic Transparency via Quantitative Input Inﬂuence :. In 2016 IEEE Symposium on Security and Privacy, 2016

work page 2016

[17] [17]

Interpreting Blackbox Models via Model Extraction

Hamsa Bastani, Osbert Bastani, and Carolyn Kim. Interpreting Predictive Models for Human-in-the-Loop Analytics. arXiv preprint arXiv:1705.08504, pages 1–45, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[18] [18]

Interpretable & Explorable Approximations of Black Box Models

Himabindu Lakkaraju, Ece Kamar, Rich Caruana, and Jure Leskovec. Interpretable & Explorable Approximations of Black Box Models. FAT/ML, jul 2017

work page 2017

[19] [19]

Model compression

Cristian Bucilˇa, Rich Caruana, and Alexandru Niculescu-Mizil. Model compression. In Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD ’06, page 535, New York, New York, USA, 2006. ACM, ACM Press

work page 2006

[20] [20]

Tibshirani

Robert J. Tibshirani. Regression shrinkage and selection via the lasso. Journal of the Royal Statistical Society. Series B (Methodological), pages 267–288, 1996

work page 1996

[21] [21]

Tibshirani

Jonathan Taylor and Robert J. Tibshirani. Statistical learning and selective inference. Proceedings of the National Academy of Sciences, 112(25):7629–7634, jun 2015

work page 2015

[22] [22]

Least Angle Regression

Bradley Efron, Trevor Hastie, Iain Johnstone, and Robert Tibshirani. Least Angle Regression. Annals of Statistics, 32(2):407– 499, apr 2004

work page 2004

[23] [23]

Best subset selection via a modern optimization lens

Dimitris Bertsimas, Angela King, and Rahul Mazumder. Best subset selection via a modern optimization lens. Annals of Statistics, 44(2):813–852, 2016

work page 2016

[24] [24]

Simple rules for complex decisions

Jongbin Jung, Connor Concannon, Ravi Shroff, Sharad Goel, and Daniel G Goldstein. Simple rules for complex decisions. feb 2017

work page 2017

[25] [25]

Supersparse linear integer models for optimized medical scoring systems

Berk Ustun and Cynthia Rudin. Supersparse linear integer models for optimized medical scoring systems. Machine Learning, 102(3):349–391, 2016. 9 A PREPRINT - JULY 11, 2019

work page 2016

[26] [26]

Statistical modeling: The two cultures

Leo Breiman. Statistical modeling: The two cultures. Statistical science, 16(3):199–231, 2001

work page 2001

[27] [27]

Explaining Explanations: An Overview of Interpretability of Machine Learning

Leilani H Gilpin, David Bau, Ben Z Yuan, Ayesha Bajwa, Michael Specter, and Lalana Kagal. Explaining Explanations : An Approach to Evaluating Interpretability of Machine Learning. arXiv preprint arXiv:1806.00069, 2018

work page internal anchor Pith review Pith/arXiv arXiv 2018

[28] [28]

Towards A Rigorous Science of Interpretable Machine Learning

Finale Doshi-Velez and Been Kim. Towards A Rigorous Science of Interpretable Machine Learning. arXiv preprint arXiv:1702.08608, (Ml):1–13, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[29] [29]

I. Y . Kim and O. L. De Weck. Adaptive weighted-sum method for bi-objective optimization: Pareto front generation.Structural and Multidisciplinary Optimization, 29(2):149–158, 2005

work page 2005

[30] [30]

Event labeling combining ensemble detectors and background knowledge.Progress in Artiﬁcial Intelligence, 2(2-3):113–127, 2014

Hadi Fanaee-T and Joao Gama. Event labeling combining ensemble detectors and background knowledge.Progress in Artiﬁcial Intelligence, 2(2-3):113–127, 2014. A Proof of Theorem 1 Proof of part (a). Asc(·) is bounded, we havecmax∈ R such that 0<c (·)≤cmax. Let m+∈P (m+) be a path of optimal length to the modelm+, i.e.,|m+| =Lc(m+). . Let m−∈P (m−) be any pat...

work page 2014