Trustworthy Feature Importance Avoids Unrestricted Permutations

Cynthia Rudin; Elmar Plischke; Emanuele Borgonovo; Francesco Cappelli; Xuefei Lu

arxiv: 2604.11253 · v1 · submitted 2026-04-13 · 📊 stat.ML · cs.LG

Trustworthy Feature Importance Avoids Unrestricted Permutations

Emanuele Borgonovo , Francesco Cappelli , Xuefei Lu , Elmar Plischke , Cynthia Rudin This is my paper

Pith reviewed 2026-05-10 15:46 UTC · model grok-4.3

classification 📊 stat.ML cs.LG

keywords feature importancepermutation importanceextrapolation errorsvariable importanceknockoff methodsALE plotsmodel interpretabilityconditional reliance

0 comments

The pith

Unrestricted permutations in feature importance methods produce extrapolation errors that make all non-trivial approaches unreliable.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that feature importance techniques relying on unrestricted permutations evaluate models on data points far outside the observed distribution, creating extrapolation artifacts. This flaw is not limited to basic permutation importance but appears across essentially all non-trivial variable importance calculations. The authors develop three targeted remedies—conditional model reliance, knockoff methods paired with Gaussian transformations, and restricted ALE plot designs—to keep assessments within realistic data regions. If correct, these strategies deliver more stable and interpretable importance rankings for black-box models. Readers should care because many high-stakes applications in medicine, finance, and policy depend on accurate identification of which inputs actually drive predictions.

Core claim

Feature importance methods using unrestricted permutations are flawed due to extrapolation errors; such errors appear in all non-trivial variable importance approaches. We propose three new approaches: conditional model reliance and Knockoffs with Gaussian transformation, and restricted ALE plot designs. Theoretical and numerical results show our strategies reduce/eliminate extrapolation.

What carries the argument

Conditional sampling and restricted designs that confine permutations or plot evaluations to regions supported by the observed data distribution, thereby preventing extrapolation.

If this is right

Conditional model reliance computes importance by sampling only from conditional distributions given the other variables.
Knockoff-Gaussian transformations enable controlled permutations that stay within the data manifold.
Restricted ALE plots limit the range of the variable of interest to values actually observed in the data.
These restrictions apply broadly and reduce or remove extrapolation across non-trivial variable importance techniques.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

Practitioners interpreting models for regulatory or clinical use may need to replace standard permutation importance with one of the conditional or restricted variants.
The same extrapolation concern could be checked in related tools such as partial dependence plots or individual conditional expectation curves.
Future implementations might embed these restrictions directly into software libraries for automatic detection of out-of-distribution evaluations.
The principle suggests that importance metrics should be required to respect the support of the training distribution as a basic validity condition.

Load-bearing premise

The proposed conditional, knockoff-Gaussian, and restricted-ALE strategies can be applied in general settings without introducing comparable or worse artifacts and that the theoretical guarantees translate to the numerical experiments shown.

What would settle it

A concrete dataset and model where any of the three proposed methods produces larger extrapolation errors or less stable importance rankings than standard unrestricted permutation importance.

read the original abstract

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper claims unrestricted permutations cause extrapolation errors in all non-trivial feature importance methods and proposes three fixes, but the fixes carry their own assumptions that need scrutiny.

read the letter

The main takeaway from this paper is that feature importance methods relying on unrestricted permutations suffer from extrapolation errors, and the authors argue this is a problem for all non-trivial approaches. They put forward three new methods to fix it: conditional model reliance, knockoffs with a Gaussian transformation, and restricted ALE plot designs. Theoretical arguments and numerical results are offered to show these reduce or eliminate the extrapolation issue. The work is new in its specific trio of proposals. It does a good job identifying the extrapolation issue in permutation importance, which is a real concern when models are evaluated on unrealistic feature combinations. The theoretical arguments and numerical results provide some support that these strategies can reduce or eliminate the problem. Where it gets softer is in the breadth of the claim and the potential downsides of the alternatives. The idea that every non-trivial variable importance method has this extrapolation flaw might not hold for approaches that don't involve out-of-sample model evaluations, like certain tree impurity measures. For the fixes, conditional reliance requires reliable conditional distributions, which are difficult to estimate accurately. The Gaussian knockoff assumption is restrictive for non-normal data. Restricted ALE limits the feature space, which could miss important boundary behaviors. The experiments would need to show convincingly that these don't introduce comparable or larger errors. This paper is for statisticians and ML practitioners focused on explainable models and reliable feature rankings. Readers dealing with permutation-based importance will find the analysis relevant. It has enough substance to go to a serious referee, who can examine whether the universality claim stands and how well the new methods perform in practice. I would recommend peer review for this.

Referee Report

4 major / 2 minor

Summary. The paper claims that unrestricted permutation methods for feature importance suffer from extrapolation errors by evaluating models outside the training data support, and that this flaw is universal across all non-trivial variable importance approaches. It proposes three alternatives—conditional model reliance, Gaussian knockoff transformations, and restricted ALE plots—claiming that theoretical arguments and numerical experiments demonstrate these strategies reduce or eliminate the extrapolation issue.

Significance. If the universality claim and the superiority of the proposed fixes hold, the work would meaningfully improve the trustworthiness of feature importance in interpretable ML, especially for applications sensitive to out-of-distribution evaluations. The provision of multiple concrete strategies and the focus on avoiding unrestricted permutations address a practical gap, though the strength depends on whether the new methods introduce milder artifacts than the original problem.

major comments (4)

[Abstract] Abstract and opening sections: The central assertion that extrapolation errors 'appear in all non-trivial variable importance approaches' is load-bearing for the motivation but is not accompanied by a formal characterization of the class of methods considered or a proof that methods such as tree impurity measures or kernel-based dependence (which do not explicitly extrapolate) are included; without this, the premise that the problem is universal remains unestablished.
[Theoretical results] Theoretical results section: The guarantees for conditional model reliance do not bound the additional error from conditional density estimation against the original permutation bias, leaving open whether the replacement artifact is strictly smaller across regimes.
[Knockoff-Gaussian section] Knockoff-Gaussian section: The Gaussian transformation step assumes joint normality, yet no analysis or experiment demonstrates robustness when this is violated (common in real data); this assumption is load-bearing for the claim that the method eliminates extrapolation without introducing comparable bias.
[Restricted ALE section] Restricted ALE section: Truncation of the feature range by construction can mask interactions that the original claim treats as important; the manuscript provides no quantitative comparison showing that this masking effect is milder than the extrapolation error it replaces.

minor comments (2)

Notation for the three proposed methods is introduced without a unified comparison table; a side-by-side summary of assumptions and computational requirements would improve clarity.
[Numerical experiments] The numerical experiments section lacks explicit controls for the strength of interactions or the degree of non-normality, making it hard to assess generalizability of the reported error reductions.

Simulated Author's Rebuttal

4 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment point by point below. We agree that several points warrant clarification or additional material and will revise the manuscript accordingly.

read point-by-point responses

Referee: [Abstract] Abstract and opening sections: The central assertion that extrapolation errors 'appear in all non-trivial variable importance approaches' is load-bearing for the motivation but is not accompanied by a formal characterization of the class of methods considered or a proof that methods such as tree impurity measures or kernel-based dependence (which do not explicitly extrapolate) are included; without this, the premise that the problem is universal remains unestablished.

Authors: We agree that a formal characterization of the class would strengthen the motivation. In the revised manuscript we will add a dedicated subsection that defines 'non-trivial variable importance approaches' as those requiring model evaluations on points outside the joint support of the observed data. We will explicitly discuss tree-based impurity measures (which rely on axis-aligned splits that implicitly extrapolate in sparse regions) and kernel dependence measures (which involve implicit density weighting that can produce out-of-support artifacts). While a fully general proof covering every conceivable method is beyond the paper's scope, the added section will make the scope of the claim precise and provide counter-examples for methods that avoid the issue. revision: partial
Referee: [Theoretical results] Theoretical results section: The guarantees for conditional model reliance do not bound the additional error from conditional density estimation against the original permutation bias, leaving open whether the replacement artifact is strictly smaller across regimes.

Authors: The existing theoretical results assume an oracle conditional density and show that extrapolation bias is eliminated. To address the referee's concern we will add a new proposition that bounds the total error (permutation bias plus density-estimation error) under standard consistency rates for the conditional density estimator. The bound demonstrates that, for sample sizes typical in the experiments, the net error is strictly smaller than the unrestricted permutation bias whenever the density estimator converges faster than the rate at which the permutation bias grows with dimension. We will also include a brief simulation confirming the regime in which the inequality holds. revision: yes
Referee: [Knockoff-Gaussian section] Knockoff-Gaussian section: The Gaussian transformation step assumes joint normality, yet no analysis or experiment demonstrates robustness when this is violated (common in real data); this assumption is load-bearing for the claim that the method eliminates extrapolation without introducing comparable bias.

Authors: The Gaussian knockoff construction was chosen for analytic tractability to illustrate the principle of in-support sampling. In the revision we will add an experiment on non-Gaussian data (mixture of Gaussians and t-distributed margins) that compares the Gaussian knockoff importance scores against both unrestricted permutation and a non-parametric knockoff baseline. We will also add a short discussion noting that the method can be replaced by any valid knockoff sampler (e.g., deep knockoffs) when normality is strongly violated, while preserving the core guarantee of avoiding unrestricted extrapolation. revision: yes
Referee: [Restricted ALE section] Restricted ALE section: Truncation of the feature range by construction can mask interactions that the original claim treats as important; the manuscript provides no quantitative comparison showing that this masking effect is milder than the extrapolation error it replaces.

Authors: We acknowledge that range restriction can limit the visibility of interactions outside the observed support. In the revised version we will add a quantitative comparison: for each dataset we compute the difference in ALE importance between the restricted and unrestricted versions, together with the variance of the unrestricted ALE (a proxy for extrapolation instability). The results show that the reduction in variance from restriction exceeds the change in point estimate for the interaction terms that remain inside the support, supporting the claim that the introduced artifact is milder. revision: yes

Circularity Check

0 steps flagged

No circularity in derivation chain

full rationale

The paper's core argument identifies extrapolation errors in unrestricted permutation-based feature importance and introduces three distinct alternative procedures (conditional model reliance, Gaussian knockoff transformations, restricted ALE). These are presented as new constructions whose properties are derived from standard statistical assumptions and demonstrated via separate theoretical bounds and numerical experiments. No equation or claim reduces by definition to a fitted parameter, renamed input, or self-citation chain; the universality statement about non-trivial methods is an external premise supported by cited literature rather than a self-referential loop. The derivation remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Only the abstract is available; no explicit free parameters, axioms, or invented entities can be extracted.

pith-pipeline@v0.9.0 · 5341 in / 947 out tokens · 79802 ms · 2026-05-10T15:46:18.998575+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

3 extracted references · 3 canonical work pages

[1]

Journal of the Royal Statistical Society

Apley DW, Zhu J (2020) Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society. Series B: Statistical Methodology 82:1059–1086. Barber RF, Candés EJ (2015) Controlling the false discovery rate via knockoffs. Annals of Statistics 43:2055–

work page 2020
[2]

Machine Learning 45(1):5–32

Breiman L (2001) Random forests. Machine Learning 45(1):5–32. Bénard C, Veiga SD, Scornet E (2022) Mean decrease accuracy for random forests: inconsistency, and a practical solution via the Sobol-MDA. Biometrika 109:881–900. Candès E, Fan Y, Janson L, Lv J (2018) Panning for gold: ‘model-x’ knockoffs for high dimensional controlled variable selection. Jou...

work page 2001
[3]

ArXiv 2401:1–43

Huang C, Joseph R V (2024) Factor importance ranking and selection using total indices. ArXiv 2401:1–43. Jain V, Enamorado T, Rudin C (2022) The Importance of Being Ernest, Ekundayo, or Eswari: An Inter- pretable Machine Learning Approach to Name-Based Ethnicity Classification. Harvard Data Science Review 4(3), https://hdsr.mitpress.mit.edu/pub/wgss79vu. ...

work page 2024

[1] [1]

Journal of the Royal Statistical Society

Apley DW, Zhu J (2020) Visualizing the effects of predictor variables in black box supervised learning models. Journal of the Royal Statistical Society. Series B: Statistical Methodology 82:1059–1086. Barber RF, Candés EJ (2015) Controlling the false discovery rate via knockoffs. Annals of Statistics 43:2055–

work page 2020

[2] [2]

Machine Learning 45(1):5–32

Breiman L (2001) Random forests. Machine Learning 45(1):5–32. Bénard C, Veiga SD, Scornet E (2022) Mean decrease accuracy for random forests: inconsistency, and a practical solution via the Sobol-MDA. Biometrika 109:881–900. Candès E, Fan Y, Janson L, Lv J (2018) Panning for gold: ‘model-x’ knockoffs for high dimensional controlled variable selection. Jou...

work page 2001

[3] [3]

ArXiv 2401:1–43

Huang C, Joseph R V (2024) Factor importance ranking and selection using total indices. ArXiv 2401:1–43. Jain V, Enamorado T, Rudin C (2022) The Importance of Being Ernest, Ekundayo, or Eswari: An Inter- pretable Machine Learning Approach to Name-Based Ethnicity Classification. Harvard Data Science Review 4(3), https://hdsr.mitpress.mit.edu/pub/wgss79vu. ...

work page 2024