Trustworthy Feature Importance Avoids Unrestricted Permutations
Pith reviewed 2026-05-10 15:46 UTC · model grok-4.3
The pith
Unrestricted permutations in feature importance methods produce extrapolation errors that make all non-trivial approaches unreliable.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Feature importance methods that use unrestricted permutations are flawed because of extrapolation errors, and such errors appear in all non-trivial variable importance approaches. We propose three new approaches: conditional model reliance, knockoffs with Gaussian transformation, and restricted ALE plot designs. Theoretical and numerical results show that our strategies reduce or eliminate extrapolation.
What carries the argument
Conditional sampling and restricted designs that confine permutations or plot evaluations to regions supported by the observed data distribution, thereby preventing extrapolation.
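To make the extrapolation concern concrete, here is a minimal sketch, not taken from the paper, that counts how many rows leave the data support after an unrestricted permutation versus a conditional draw. It assumes two standard-normal features with a known correlation `rho`, and it uses a Mahalanobis-distance cutoff as a stand-in for "supported by the observed data"; the function names `mahalanobis_sq` and `off_support_fractions` are hypothetical.

```python
import numpy as np

def mahalanobis_sq(X, mean, cov):
    """Squared Mahalanobis distance of each row of X from `mean`."""
    diff = X - mean
    return np.einsum("ij,jk,ik->i", diff, np.linalg.inv(cov), diff)

def off_support_fractions(n=5000, rho=0.95, seed=0):
    """Fraction of rows beyond the 99% distance quantile of the original data.

    Assumption (illustrative only): bivariate Gaussian features with known
    correlation rho, so the conditional law of x0 given x1 is available
    in closed form.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(2)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    X = rng.multivariate_normal(mean, cov, size=n)
    threshold = np.quantile(mahalanobis_sq(X, mean, cov), 0.99)

    # Unrestricted permutation: shuffle x0 independently of x1.
    X_perm = X.copy()
    X_perm[:, 0] = rng.permutation(X[:, 0])

    # Conditional draw: x0 | x1 ~ N(rho * x1, 1 - rho^2).
    X_cond = X.copy()
    X_cond[:, 0] = rho * X[:, 1] + np.sqrt(1 - rho**2) * rng.standard_normal(n)

    frac_perm = np.mean(mahalanobis_sq(X_perm, mean, cov) > threshold)
    frac_cond = np.mean(mahalanobis_sq(X_cond, mean, cov) > threshold)
    return frac_perm, frac_cond

frac_perm, frac_cond = off_support_fractions()
print(f"off-support after permutation: {frac_perm:.2f}")
print(f"off-support after conditional draw: {frac_cond:.2f}")
```

With strongly correlated features, the permuted rows should land outside the cutoff far more often than the conditionally drawn rows, which is the effect the restricted designs aim to avoid.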
If this is right
- Conditional model reliance computes importance by sampling only from conditional distributions given the other variables.
- Knockoff-Gaussian transformations enable controlled permutations that stay within the data manifold.
- Restricted ALE plots limit the range of the variable of interest to values actually observed in the data.
- These restrictions apply broadly and reduce or remove extrapolation across non-trivial variable importance techniques.
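The contrast between unrestricted and conditional importance in the list above can be sketched as follows. This is one common reading of permutation-style model reliance (loss after replacing a feature minus baseline loss), not the paper's estimator. The sampler interface, the toy predictor, and the Gaussian conditional with known `rho` are all assumptions made for illustration.

```python
import numpy as np

def model_reliance(predict, X, y, j, sampler, rng):
    """Increase in mean squared error when column j is replaced by sampler output."""
    base_loss = np.mean((y - predict(X)) ** 2)
    X_new = X.copy()
    X_new[:, j] = sampler(X, j, rng)
    return np.mean((y - predict(X_new)) ** 2) - base_loss

def permute_sampler(X, j, rng):
    """Unrestricted permutation: ignores the other features."""
    return rng.permutation(X[:, j])

def make_gaussian_conditional_sampler(rho):
    """Hypothetical sampler for two standard-normal features with correlation rho."""
    def sampler(X, j, rng):
        other = X[:, 1 - j]
        noise = rng.standard_normal(len(X))
        return rho * other + np.sqrt(1 - rho**2) * noise
    return sampler

def run_example(n=20000, rho=0.95, seed=1):
    rng = np.random.default_rng(seed)
    X = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)
    y = X[:, 1]  # the response depends on feature 1 only

    def predict(Z):
        # Toy model that splits weight across the two correlated features.
        return 0.5 * Z[:, 0] + 0.5 * Z[:, 1]

    unrestricted = model_reliance(predict, X, y, 0, permute_sampler, rng)
    conditional = model_reliance(
        predict, X, y, 0, make_gaussian_conditional_sampler(rho), rng
    )
    return unrestricted, conditional

unrestricted, conditional = run_example()
print(f"unrestricted reliance on feature 0: {unrestricted:.3f}")
print(f"conditional reliance on feature 0:  {conditional:.3f}")
```

In this toy setting the unrestricted score is inflated by evaluations at feature combinations that never occur in the data, while the conditional score stays near zero because feature 0 adds little once feature 1 is known.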
Where Pith is reading between the lines
- Practitioners interpreting models for regulatory or clinical use may need to replace standard permutation importance with one of the conditional or restricted variants.
- The same extrapolation concern could be checked in related tools such as partial dependence plots or individual conditional expectation curves.
- Future implementations might embed these restrictions directly into software libraries for automatic detection of out-of-distribution evaluations.
- The principle suggests that importance metrics should be required to respect the support of the training distribution as a basic validity condition.
Load-bearing premise
The premise is that the proposed conditional, knockoff-Gaussian, and restricted-ALE strategies can be applied in general settings without introducing comparable or worse artifacts, and that the theoretical guarantees carry over to the numerical experiments shown.
What would settle it
A concrete dataset and model where any of the three proposed methods produces larger extrapolation errors or less stable importance rankings than standard unrestricted permutation importance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper claims that unrestricted permutation methods for feature importance suffer from extrapolation errors by evaluating models outside the training data support, and that this flaw is universal across all non-trivial variable importance approaches. It proposes three alternatives—conditional model reliance, Gaussian knockoff transformations, and restricted ALE plots—claiming that theoretical arguments and numerical experiments demonstrate these strategies reduce or eliminate the extrapolation issue.
Significance. If the universality claim and the superiority of the proposed fixes hold, the work would meaningfully improve the trustworthiness of feature importance in interpretable ML, especially for applications sensitive to out-of-distribution evaluations. The provision of multiple concrete strategies and the focus on avoiding unrestricted permutations address a practical gap, though the strength depends on whether the new methods introduce milder artifacts than the original problem.
major comments (4)
- [Abstract] Abstract and opening sections: The central assertion that extrapolation errors 'appear in all non-trivial variable importance approaches' is load-bearing for the motivation but is not accompanied by a formal characterization of the class of methods considered or a proof that methods such as tree impurity measures or kernel-based dependence (which do not explicitly extrapolate) are included; without this, the premise that the problem is universal remains unestablished.
- [Theoretical results] Theoretical results section: The guarantees for conditional model reliance do not bound the additional error from conditional density estimation against the original permutation bias, leaving open whether the replacement artifact is strictly smaller across regimes.
- [Knockoff-Gaussian section] Knockoff-Gaussian section: The Gaussian transformation step assumes joint normality, yet no analysis or experiment demonstrates robustness when this is violated (common in real data); this assumption is load-bearing for the claim that the method eliminates extrapolation without introducing comparable bias.
- [Restricted ALE section] Restricted ALE section: Truncation of the feature range by construction can mask interactions that the original claim treats as important; the manuscript provides no quantitative comparison showing that this masking effect is milder than the extrapolation error it replaces.
minor comments (2)
- Notation for the three proposed methods is introduced without a unified comparison table; a side-by-side summary of assumptions and computational requirements would improve clarity.
- [Numerical experiments] The numerical experiments section lacks explicit controls for the strength of interactions or the degree of non-normality, making it hard to assess generalizability of the reported error reductions.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed comments. We address each major comment point by point below. We agree that several points warrant clarification or additional material and will revise the manuscript accordingly.
- Referee: [Abstract] Abstract and opening sections: The central assertion that extrapolation errors 'appear in all non-trivial variable importance approaches' is load-bearing for the motivation but is not accompanied by a formal characterization of the class of methods considered or a proof that methods such as tree impurity measures or kernel-based dependence (which do not explicitly extrapolate) are included; without this, the premise that the problem is universal remains unestablished.
Authors: We agree that a formal characterization of the class would strengthen the motivation. In the revised manuscript we will add a dedicated subsection that defines 'non-trivial variable importance approaches' as those requiring model evaluations on points outside the joint support of the observed data. We will explicitly discuss tree-based impurity measures (which rely on axis-aligned splits that implicitly extrapolate in sparse regions) and kernel dependence measures (which involve implicit density weighting that can produce out-of-support artifacts). While a fully general proof covering every conceivable method is beyond the paper's scope, the added section will make the scope of the claim precise and provide counter-examples for methods that avoid the issue. revision: partial
- Referee: [Theoretical results] Theoretical results section: The guarantees for conditional model reliance do not bound the additional error from conditional density estimation against the original permutation bias, leaving open whether the replacement artifact is strictly smaller across regimes.
Authors: The existing theoretical results assume an oracle conditional density and show that extrapolation bias is eliminated. To address the referee's concern we will add a new proposition that bounds the total error (permutation bias plus density-estimation error) under standard consistency rates for the conditional density estimator. The bound demonstrates that, for sample sizes typical in the experiments, the net error is strictly smaller than the unrestricted permutation bias whenever the density estimator converges faster than the rate at which the permutation bias grows with dimension. We will also include a brief simulation confirming the regime in which the inequality holds. revision: yes
- Referee: [Knockoff-Gaussian section] Knockoff-Gaussian section: The Gaussian transformation step assumes joint normality, yet no analysis or experiment demonstrates robustness when this is violated (common in real data); this assumption is load-bearing for the claim that the method eliminates extrapolation without introducing comparable bias.
Authors: The Gaussian knockoff construction was chosen for analytic tractability to illustrate the principle of in-support sampling. In the revision we will add an experiment on non-Gaussian data (mixture of Gaussians and t-distributed margins) that compares the Gaussian knockoff importance scores against both unrestricted permutation and a non-parametric knockoff baseline. We will also add a short discussion noting that the method can be replaced by any valid knockoff sampler (e.g., deep knockoffs) when normality is strongly violated, while preserving the core guarantee of avoiding unrestricted extrapolation. revision: yes
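As a rough illustration of what a Gaussian transformation followed by in-support sampling could look like, the sketch below maps each feature to rank-based normal scores, draws the feature of interest from a linear-Gaussian conditional in that space, and maps the draw back through the empirical quantiles. This is an assumption about the general idea, not the paper's knockoff construction; `to_normal_scores`, `from_normal_scores`, and `conditional_resample` are hypothetical names.

```python
import numpy as np
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def to_normal_scores(x):
    """Rank-based Gaussian transform: ranks -> (0, 1) -> standard normal quantiles."""
    ranks = np.argsort(np.argsort(x)) + 1          # ranks 1..n (assumes no ties)
    u = ranks / (len(x) + 1)                       # strictly inside (0, 1)
    return np.array([_STD_NORMAL.inv_cdf(p) for p in u])

def from_normal_scores(z, x_ref):
    """Map normal scores back to the empirical quantiles of the observed feature."""
    u = np.array([_STD_NORMAL.cdf(v) for v in z])
    return np.quantile(x_ref, u)

def conditional_resample(X, j, rng):
    """Resample column j given the others, assuming joint normality of the scores."""
    Z = np.column_stack([to_normal_scores(X[:, k]) for k in range(X.shape[1])])
    others = np.delete(Z, j, axis=1)
    # Under joint normality the conditional mean is linear in the other scores.
    coef, *_ = np.linalg.lstsq(others, Z[:, j], rcond=None)
    fitted = others @ coef
    resid_sd = np.std(Z[:, j] - fitted)
    z_new = fitted + resid_sd * rng.standard_normal(len(Z))
    return from_normal_scores(z_new, X[:, j])

rng = np.random.default_rng(0)
G = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.95], [0.95, 1.0]], size=2000)
X = np.column_stack([np.exp(G[:, 0]), G[:, 1] ** 3])   # non-Gaussian margins
x0_new = conditional_resample(X, 0, rng)
print(x0_new.min() >= X[:, 0].min(), x0_new.max() <= X[:, 0].max())
```

Because the back-transform interpolates between observed values, the resampled feature cannot leave the observed range. The joint-normality step is exactly the assumption the referee questions, so margins alone being non-Gaussian (as in this example) is the easy case.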
- Referee: [Restricted ALE section] Restricted ALE section: Truncation of the feature range by construction can mask interactions that the original claim treats as important; the manuscript provides no quantitative comparison showing that this masking effect is milder than the extrapolation error it replaces.
Authors: We acknowledge that range restriction can limit the visibility of interactions outside the observed support. In the revised version we will add a quantitative comparison: for each dataset we compute the difference in ALE importance between the restricted and unrestricted versions, together with the variance of the unrestricted ALE (a proxy for extrapolation instability). The results show that the reduction in variance from restriction exceeds the change in point estimate for the interaction terms that remain inside the support, supporting the claim that the introduced artifact is milder. revision: yes
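For readers unfamiliar with ALE, the following is a minimal sketch of a standard first-order ALE curve, in which each row is only moved to the edges of the quantile bin it already falls in. It is not the paper's restricted design, whose details are not given here; the function name `ale_first_order` and the binning choices are assumptions for illustration.

```python
import numpy as np

def ale_first_order(predict, X, j, n_bins=10):
    """First-order accumulated local effects of feature j on quantile bins."""
    x = X[:, j]
    edges = np.unique(np.quantile(x, np.linspace(0, 1, n_bins + 1)))
    n_intervals = len(edges) - 1
    # Bin index of each row, with the maximum value assigned to the last bin.
    idx = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, n_intervals - 1)

    local = np.zeros(n_intervals)
    counts = np.zeros(n_intervals)
    for k in range(n_intervals):
        rows = X[idx == k]
        if len(rows) == 0:
            continue
        lower, upper = rows.copy(), rows.copy()
        lower[:, j] = edges[k]        # move only within the row's own bin
        upper[:, j] = edges[k + 1]
        local[k] = np.mean(predict(upper) - predict(lower))
        counts[k] = len(rows)

    ale = np.concatenate([[0.0], np.cumsum(local)])
    # Center the curve using a count-weighted average of bin-midpoint values.
    midpoints = 0.5 * (ale[:-1] + ale[1:])
    ale -= np.sum(midpoints * counts) / np.sum(counts)
    return edges, ale

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 2))
edges, ale = ale_first_order(lambda Z: 2.0 * Z[:, 0] + Z[:, 1], X, j=0)
print((ale[-1] - ale[0]) / (edges[-1] - edges[0]))  # slope of the ALE curve
```

For a purely additive linear model the curve recovers the coefficient of the feature. A restricted-versus-unrestricted comparison of the kind the authors describe would need a second variant with a different evaluation range, which is left out here because the source does not specify it.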
Circularity Check
No circularity in derivation chain
The paper's core argument identifies extrapolation errors in unrestricted permutation-based feature importance and introduces three distinct alternative procedures (conditional model reliance, Gaussian knockoff transformations, restricted ALE). These are presented as new constructions whose properties are derived from standard statistical assumptions and demonstrated via separate theoretical bounds and numerical experiments. No equation or claim reduces by definition to a fitted parameter, renamed input, or self-citation chain; the universality statement about non-trivial methods is an external premise supported by cited literature rather than a self-referential loop. The derivation remains self-contained against external benchmarks.