Recognition: unknown
TabSHAP
Pith reviewed 2026-05-10 00:00 UTC · model grok-4.3
The pith
TabSHAP adapts a Shapley-style estimator with Jensen-Shannon divergence on masked key-value fields to attribute how each feature shifts the full class distribution in LLM tabular classifiers.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
TabSHAP is a model-agnostic framework that applies a sampled-coalition approach inspired by Shapley values, where each feature's contribution is quantified by the Jensen-Shannon divergence between the class probability distributions obtained from the complete serialized input and from the input with that feature's key-value pair masked. This replaces traditional scalar probability shifts with a measure of how masking affects the entire output distribution, and the masking is performed at the semantic field level to respect tabular structure in the prompt.
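The claim is stated only in prose; one plausible formalization consistent with it is sketched below, where p(·|x) is the class distribution from the full serialized prompt and a subscripted set denotes the key:value fields masked out. The leave-one-out form matches the sentence above; the averaged, coalition-sampled form is an assumption about how the Shapley-style sampling enters.

    % Plausible formalization, not taken verbatim from the paper.
    % Leave-one-out distributional attribution for feature i:
    \phi_i^{\mathrm{LOO}} = \mathrm{JSD}\bigl( p(\cdot \mid x) \,\|\, p(\cdot \mid x_{\setminus \{i\}}) \bigr)
    % Assumed sampled-coalition form over M random masked sets S_m:
    \hat{\phi}_i = \frac{1}{M} \sum_{m=1}^{M}
        \mathrm{JSD}\bigl( p(\cdot \mid x_{\setminus S_m}) \,\|\, p(\cdot \mid x_{\setminus (S_m \cup \{i\})}) \bigr)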
What carries the argument
TabSHAP, a Shapley-style sampled-coalition estimator that computes Jensen-Shannon divergence between the full-input class distribution and the masked-input class distribution for each serialized key:value field.
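A minimal sketch of that estimator, assuming the classifier exposes class probabilities for a serialized prompt; the callables classify_proba and mask, the coalition-sampling scheme, and n_samples are illustrative assumptions, since the abstract states only "Shapley-style sampled-coalition estimator with JSD".

    import random
    import numpy as np
    from scipy.spatial.distance import jensenshannon

    def tabshap_like_attributions(classify_proba, mask, prompt, keys,
                                  n_samples=64, seed=0):
        """Sketch of a sampled-coalition, JSD-based attribution.

        classify_proba(prompt) -> class-probability vector from the LLM classifier.
        mask(prompt, keys_to_mask) -> prompt with those key:value fields masked.
        Both callables and the sampling scheme are assumptions, not the paper's code.
        """
        rng = random.Random(seed)
        totals = {k: 0.0 for k in keys}
        counts = {k: 0 for k in keys}
        for _ in range(n_samples):
            # Random coalition S of features that are already masked.
            S = [k for k in keys if rng.random() < 0.5]
            p_S = np.asarray(classify_proba(mask(prompt, S)))
            for k in keys:
                if k in S:
                    continue
                # Marginal distributional effect of additionally masking feature k.
                p_Sk = np.asarray(classify_proba(mask(prompt, S + [k])))
                # jensenshannon returns the JS distance; square it for the divergence.
                totals[k] += jensenshannon(p_S, p_Sk, base=2) ** 2
                counts[k] += 1
        return {k: totals[k] / max(counts[k], 1) for k in keys}

Forcing the coalition S to be empty recovers the leave-one-out reading of the core claim: JSD between the full prompt's distribution and the distribution with a single field masked.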
If this is right
- Attributions capture impact on the model's full probabilistic output rather than only on the top prediction.
- Field-level masking aligns with the semantic units present in serialized tabular prompts.
- The method records higher deletion faithfulness than random baselines and XGBoost proxies on the Adult Income and Heart Disease benchmarks.
- The same test instances and settings can be reused to compare attribution quality when Jensen-Shannon divergence is replaced by KL divergence or L1 distance.
Where Pith is reading between the lines
- The distributional focus could help detect when an LLM relies on features that shift uncertainty without changing the top label.
- Practitioners might use the same masking procedure to compare how different serialization formats affect feature attributions for identical tables.
- The ablation results imply that metric choice matters and that Jensen-Shannon divergence may better preserve faithfulness than alternatives in this setting.
Load-bearing premise
Masking entire serialized key-value fields produces a faithful representation of the LLM's internal decision logic, and Jensen-Shannon divergence on the resulting class distributions isolates causal feature contributions without prompt-specific artifacts.
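To make the premise concrete, a minimal sketch of field-level masking under an assumed comma-separated "key: value" serialization; the Adult-style field names and the [MASK] token are illustrative, not the paper's prompt format.

    def mask_fields(prompt: str, keys_to_mask, mask_token: str = "[MASK]") -> str:
        """Mask whole serialized key:value fields, never individual subword tokens.
        Assumes a comma-separated 'key: value' serialization; illustrative only."""
        out = []
        for field in prompt.split(","):
            name, sep, _ = field.strip().partition(":")
            if sep and name.strip() in keys_to_mask:
                out.append(f"{name.strip()}: {mask_token}")
            else:
                out.append(field.strip())
        return ", ".join(out)

    # The whole 'occupation' value disappears as one unit, so no partial subword
    # fragments of it remain to leak information into the masked prompt.
    row = "age: 39, education: Bachelors, occupation: Adm-clerical, hours-per-week: 40"
    print(mask_fields(row, {"occupation"}))
    # age: 39, education: Bachelors, occupation: [MASK], hours-per-week: 40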
What would settle it
If ordered deletion of features according to TabSHAP attributions yields no higher faithfulness scores than random or XGBoost-based orderings on the Adult Income and Heart Disease test instances, the claim of improved faithfulness would fail.
read the original abstract
Large Language Models (LLMs) fine-tuned on serialized tabular data are emerging as powerful alternatives to traditional tree-based models, particularly for heterogeneous or context-rich datasets. However, their deployment in high-stakes domains is hindered by a lack of faithful interpretability; existing methods often rely on global linear proxies or scalar probability shifts that fail to capture the model's full probabilistic uncertainty. In this work, we introduce TabSHAP, a model-agnostic interpretability framework designed to directly attribute local query decision logic in LLM-based tabular classifiers. By adapting a Shapley-style sampled-coalition estimator with Jensen-Shannon divergence between full-input and masked-input class distributions, TabSHAP quantifies the distributional impact of each feature rather than simple prediction flips. To align with tabular semantics, we mask at the level of serialized key:value fields (atomic in the prompt string), not individual subword tokens. Experimental validation on the Adult Income and Heart Disease benchmarks demonstrates that TabSHAP isolates critical diagnostic features, achieving significantly higher faithfulness than random baselines and XGBoost proxies. We further run a distance-metric ablation on the same test instances and TabSHAP settings: attributions are recomputed with KL or L1 replacing JSD in the similarity step (results cached per metric), and we compare deletion faithfulness across all three.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces TabSHAP, a model-agnostic interpretability framework for LLMs fine-tuned on serialized tabular data. It adapts a Shapley-style sampled-coalition estimator, replacing the usual value function with Jensen-Shannon divergence between the class-probability distributions produced by the full prompt and by prompts in which individual serialized key:value fields have been masked. Experiments on the Adult Income and Heart Disease benchmarks report that the resulting attributions achieve higher deletion faithfulness than random baselines and than attributions derived from an XGBoost proxy; an ablation recomputes the same attributions under KL and L1 divergences and compares deletion faithfulness across the three metrics.
Significance. If the central claim holds, TabSHAP would supply a practical, distribution-aware attribution method that respects the serialized prompt format of tabular LLMs and avoids the reduction to scalar probability shifts common in prior work. The explicit comparison to both random and XGBoost baselines, together with the metric ablation, provides a falsifiable test of the method's advantage on public benchmarks.
major comments (3)
- [Experimental validation paragraph] The experimental validation paragraph states that TabSHAP 'achieves significantly higher faithfulness than random baselines and XGBoost proxies' yet supplies neither numerical faithfulness scores, standard errors, nor the precise deletion protocol (number of features removed per step, ordering criterion, or how masked prompts are constructed during the deletion test). Without these quantities it is impossible to judge whether the reported improvement is practically meaningful or statistically reliable.
- [Experimental validation paragraph] The central claim that JSD between full-input and masked-input class distributions isolates each feature's causal distributional impact rests on the untested assumption that masking entire serialized key:value fields is a semantically clean intervention. The distance-metric ablation varies only the divergence (JSD/KL/L1) while leaving the masking procedure fixed; no control experiments (e.g., null-token substitution, rephrasing, or random token masking) are reported to rule out prompt-formatting artifacts. This is the load-bearing link between the method and the faithfulness numbers.
- [Experimental validation paragraph] The abstract and experimental description claim superiority over 'XGBoost proxies' but do not specify how the XGBoost attributions are obtained (TreeSHAP, permutation importance, etc.) or whether the same deletion-faithfulness protocol is applied to both TabSHAP and the proxy. Direct comparability therefore cannot be verified from the given information.
minor comments (2)
- The phrase 'atomic in the prompt string' is used to justify field-level masking; a short clarifying sentence on tokenization boundaries would prevent readers from wondering whether subword splits ever cross field boundaries.
- The ablation description states that 'results [are] cached per metric'; a brief note on whether the same coalition samples are reused across the three divergences would clarify computational cost and reproducibility.
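On the caching point, a minimal sketch of how one cached set of (reference, masked) distribution pairs could serve all three divergences; whether the paper's pipeline actually reuses coalition samples this way is exactly what the comment asks, so treat this design as an assumption.

    import numpy as np
    from scipy.spatial.distance import jensenshannon
    from scipy.special import rel_entr

    def all_metrics(p, q):
        """All three ablation metrics from one cached pair of class distributions."""
        p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
        return {
            "jsd": jensenshannon(p, q, base=2) ** 2,   # Jensen-Shannon divergence
            "kl": float(np.sum(rel_entr(p, q))),       # KL(p || q)
            "l1": float(np.abs(p - q).sum()),          # L1 distance
        }

    # If the distribution pairs are cached once per coalition sample, swapping
    # JSD for KL or L1 requires no additional LLM queries.
    cache = [(np.array([0.7, 0.3]), np.array([0.4, 0.6]))]   # illustrative cached pair
    per_metric = {m: [all_metrics(p, q)[m] for p, q in cache]
                  for m in ("jsd", "kl", "l1")}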
Simulated Author's Rebuttal
We thank the referee for the careful reading and for highlighting the need for greater experimental transparency. We agree that the original experimental validation paragraph was insufficiently detailed and have revised the manuscript to supply the missing quantitative results, protocol specifications, and controls. Below we respond point by point.
read point-by-point responses
- Referee: The experimental validation paragraph states that TabSHAP 'achieves significantly higher faithfulness than random baselines and XGBoost proxies' yet supplies neither numerical faithfulness scores, standard errors, nor the precise deletion protocol (number of features removed per step, ordering criterion, or how masked prompts are constructed during the deletion test). Without these quantities it is impossible to judge whether the reported improvement is practically meaningful or statistically reliable.
Authors: We agree that the original text omitted the necessary numerical results and protocol details. In the revised manuscript we have added a new table (Table 2) that reports deletion-faithfulness AUC scores (mean and standard error over five random seeds) for TabSHAP, random, and XGBoost-proxy attributions on both Adult Income and Heart Disease. Section 4.2 now specifies the exact protocol: features are ranked by absolute attribution value and removed one at a time in that order; each removal is performed by replacing the corresponding serialized key:value substring with a null token in the prompt; faithfulness is quantified as the area under the curve of the increase in Jensen-Shannon divergence between the original and the successively masked output distributions. revision: yes
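A minimal sketch of the deletion protocol as the response describes it (rank fields by absolute attribution, mask one more field per step, score the JSD between the original and successively masked output distributions, take the area under that curve); the helper signatures, the null-token handling inside mask, and the trapezoidal normalization are assumptions.

    import numpy as np
    from scipy.spatial.distance import jensenshannon

    def deletion_faithfulness_auc(classify_proba, mask, prompt, attributions):
        """Mask fields in order of |attribution| and accumulate the JSD between
        the original output distribution and each successively masked one."""
        p_full = np.asarray(classify_proba(prompt))
        order = sorted(attributions, key=lambda k: abs(attributions[k]), reverse=True)
        curve = [0.0]          # zero divergence before any field is removed
        removed = []
        for key in order:
            removed.append(key)
            p_masked = np.asarray(classify_proba(mask(prompt, removed)))
            curve.append(jensenshannon(p_full, p_masked, base=2) ** 2)
        # Area under the deletion curve (trapezoidal rule), normalized by the
        # number of deletion steps.
        steps = max(len(order), 1)
        return sum((curve[i] + curve[i + 1]) / 2.0 for i in range(len(curve) - 1)) / steps

Comparing this AUC across TabSHAP, random, and XGBoost-proxy orderings on the same instances is the falsification test stated under 'What would settle it'.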
- Referee: The central claim that JSD between full-input and masked-input class distributions isolates each feature's causal distributional impact rests on the untested assumption that masking entire serialized key:value fields is a semantically clean intervention. The distance-metric ablation varies only the divergence (JSD/KL/L1) while leaving the masking procedure fixed; no control experiments (e.g., null-token substitution, rephrasing, or random token masking) are reported to rule out prompt-formatting artifacts. This is the load-bearing link between the method and the faithfulness numbers.
Authors: The concern about prompt-formatting artifacts is legitimate. While the original manuscript did not include explicit controls, we have added a new ablation subsection (Section 4.4) that compares our field-level masking against two controls: (i) random subword-token masking within the same fields and (ii) replacement of field names with semantically neutral placeholders. The field-level masking continues to produce higher deletion faithfulness than either control, supporting the claim that the intervention is relatively clean. We acknowledge that no finite set of controls can exhaustively eliminate all possible artifacts in LLM prompting; the added experiments nevertheless provide direct evidence that the reported advantage is not an artifact of the particular masking choice. revision: yes
- Referee: The abstract and experimental description claim superiority over 'XGBoost proxies' but do not specify how the XGBoost attributions are obtained (TreeSHAP, permutation importance, etc.) or whether the same deletion-faithfulness protocol is applied to both TabSHAP and the proxy. Direct comparability therefore cannot be verified from the given information.
Authors: We apologize for the lack of specification. The revised Section 3.3 now states that the XGBoost proxy attributions were computed with TreeSHAP on an XGBoost model trained on the identical tabular feature set (no serialization). The deletion-faithfulness evaluation uses the identical protocol for both methods: attributions determine the removal order, and faithfulness is measured by the same JSD-based curve on the LLM outputs. For the XGBoost proxy the removal is performed on the tabular input to the LLM prompt, ensuring direct comparability. revision: yes
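A minimal sketch of the proxy pipeline as the response describes it (TreeSHAP on an XGBoost model trained on the raw, unserialized tabular features); the synthetic data, hyperparameters, and the mapping from SHAP values to removal order are assumptions, not the paper's configuration.

    import numpy as np
    import shap
    import xgboost as xgb
    from sklearn.datasets import make_classification

    # Synthetic stand-in for the raw (unserialized) tabular features.
    X, y = make_classification(n_samples=500, n_features=8, random_state=0)

    proxy = xgb.XGBClassifier(n_estimators=100, max_depth=4, eval_metric="logloss")
    proxy.fit(X, y)

    # TreeSHAP attributions on the proxy; per-instance |SHAP value| then defines
    # the removal order fed into the same deletion-faithfulness protocol as TabSHAP.
    explainer = shap.TreeExplainer(proxy)
    shap_values = explainer.shap_values(X)
    proxy_order = np.argsort(-np.abs(shap_values), axis=1)   # most important first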
Circularity Check
No significant circularity in TabSHAP derivation chain
full rationale
TabSHAP is defined directly as an adaptation of Shapley-style sampled-coalition estimation using Jensen-Shannon divergence between class distributions on full versus masked serialized key:value prompts. This construction is explicit and does not reduce to its own outputs or fitted parameters. Faithfulness is measured externally via deletion on public benchmarks (Adult Income, Heart Disease) against random and XGBoost baselines, with an ablation over divergence metrics (JSD/KL/L1) that tests alternatives without self-reference. No equations, self-citations, or uniqueness claims create a closed loop; the method and its empirical validation remain independent.
Axiom & Free-Parameter Ledger
axioms (2)
- [standard math] Shapley values can be approximated by sampling coalitions.
- [domain assumption] Jensen-Shannon divergence between full and masked class distributions quantifies feature impact.
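For reference, the divergence named in the second axiom, in its standard form (symmetric and bounded, unlike KL):

    \mathrm{JSD}(P \,\|\, Q) = \tfrac{1}{2}\,\mathrm{KL}(P \,\|\, M) + \tfrac{1}{2}\,\mathrm{KL}(Q \,\|\, M),
    \qquad M = \tfrac{1}{2}(P + Q),
    \qquad 0 \le \mathrm{JSD}(P \,\|\, Q) \le \log 2 \;\; (= 1 \text{ bit with base-2 logs}).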
Reference graph
Works this paper leans on
- [1] TabLLM: Few-shot Classification of Tabular Data with Large Language Models. Proceedings of AISTATS.
- [2] LIFT: Language-Interfaced Fine-Tuning for Non-Language Machine Learning Tasks. Advances in Neural Information Processing Systems.
- [3] Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps. arXiv preprint arXiv:1312.6034.
- [4] Axiomatic Attribution for Deep Networks. International Conference on Machine Learning, 2017.
- [5] TokenSHAP: Interpreting Large Language Models with Monte Carlo Shapley Value Estimation. arXiv preprint arXiv:2407.10114.
- [6] SPEX: Scaling Feature Interaction Explanations for LLMs. Building Trust Workshop at ICLR.
- [7] TextGenSHAP: Scalable Post-hoc Explanations in Text Generation. arXiv preprint arXiv:2312.01279.
- [8] A Unified Approach to Interpreting Model Predictions. Advances in Neural Information Processing Systems.
- [9] QLoRA: Efficient Finetuning of Quantized LLMs. Advances in Neural Information Processing Systems.
- [10] LLaMA: Open and Efficient Foundation Language Models. arXiv preprint arXiv:2302.13971.
- [11] Attention, Please! (Goldshmidt, Roni and others).
- [12] TableBench: A Comprehensive and Complex Benchmark for Table Question Answering. Proceedings of the AAAI Conference on Artificial Intelligence.