pith. machine review for the scientific record.

arxiv: 2604.16689 · v1 · submitted 2026-04-17 · 💻 cs.AI

Recognition: unknown

The Query Channel: Information-Theoretic Limits of Masking-Based Explanations

Erciyes Karakaya, Ozgur Ercetin

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 08:06 UTC · model grok-4.3

classification 💻 cs.AI
keywords query channel · masking-based explanations · information theory · identification capacity · LIME · KernelSHAP · strong converse · black-box explanations

The pith

Masking-based explanations can be recovered reliably only when their rate stays below the query channel's identification capacity.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper recasts the masking procedure used by LIME and KernelSHAP as transmission of a message (the explanation) over a query channel whose uses are randomized perturbations of the black-box input. It measures explanation complexity by the entropy of the hypothesis class and shows that the channel supplies information at a fixed identification capacity per query. A strong converse proves that any explainer whose rate exceeds this capacity must fail to recover the explanation exactly with probability approaching one. An achievability result shows that a sparse maximum-likelihood decoder succeeds below capacity. The framework therefore supplies a fundamental limit on how complex an explanation can be for a given query budget.
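
To pin down the shape of the claim, the following is a minimal formalization in notation of our choosing; the paper's exact definitions of the rate and of identification capacity may differ. Here W is the latent explanation, M a randomized mask, and Y the black-box response to the masked input.

```latex
% Minimal formalization of the claimed limit (our notation, not
% necessarily the paper's). W: latent explanation drawn from the
% hypothesis class; M: randomized mask; Y: black-box response.
\documentclass{article}
\usepackage{amsmath}
\begin{document}
\[
  R_n = \frac{H(W)}{n} \quad \text{(explanation rate, bits per query)},
  \qquad
  C = \max_{P_M} I(W; Y \mid M) \quad \text{(per-query capacity)}.
\]
\[
  R_n > C \;\Rightarrow\; \Pr[\hat{W} \neq W] \to 1
  \quad \text{(strong converse)};
  \qquad
  R_n < C \;\Rightarrow\; \Pr[\hat{W}_{\mathrm{ML}} \neq W] \to 0
  \quad \text{(achievability)}.
\]
\end{document}
```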

Core claim

By treating each masked evaluation as a channel use, the paper proves that reliable recovery of the latent explanation is possible if and only if the rate (entropy per query) lies below the identification capacity of the query channel. Above capacity a strong converse shows that the error probability converges to one for every sequence of explainers and decoders; below capacity a sparse maximum-likelihood decoder attains vanishing error. A Monte Carlo mutual-information estimator supplies a practical benchmark that reveals operating regimes where information theory permits exact recovery yet Lasso- and OLS-based surrogates still fail.
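
The Monte Carlo estimator is described only at a high level; the sketch below is one plausible plug-in construction for a toy query channel, not the paper's implementation. The channel model (Bernoulli masks over d segments, a k-sparse signed explanation, a one-bit quantized noisy linear response) and all parameter values are our assumptions.

```python
# Plug-in Monte Carlo estimate of the per-query information I(W; Y | M)
# for a toy masking channel (our construction, not the paper's setup).
# W: k-sparse +/-1 explanation over d segments; M: Bernoulli(1/2) mask;
# Y = 1{<W, M> + N(0, sigma^2) > 0} is the quantized black-box response.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(0)
d, k, sigma = 12, 3, 0.5          # segments, sparsity, noise level (assumed)

def sample_w(n):
    """Sample n independent k-sparse +/-1 explanation vectors."""
    w = np.zeros((n, d))
    for i in range(n):
        idx = rng.choice(d, size=k, replace=False)
        w[i, idx] = rng.choice([-1.0, 1.0], size=k)
    return w

def p_y1(w, m):
    """P(Y = 1 | w, m) under the Gaussian-noise threshold model."""
    return norm.cdf((w * m).sum(axis=-1) / sigma)

def mc_mutual_information(n_outer=2000, n_inner=2000):
    """Average KL between P(Y | w, m) and P(Y | m), in bits per query."""
    w = sample_w(n_outer)
    m = rng.integers(0, 2, size=(n_outer, d)).astype(float)
    w_ref = sample_w(n_inner)              # fresh draws for the marginal
    total = 0.0
    for i in range(n_outer):
        p1 = p_y1(w[i], m[i])              # conditional on the true w
        q1 = p_y1(w_ref, m[i]).mean()      # marginal, averaged over W
        for p, q in ((p1, q1), (1.0 - p1, 1.0 - q1)):
            if p > 0.0:
                total += p * np.log2(p / q)
    return total / n_outer

print(f"estimated per-query information: {mc_mutual_information():.3f} bits")
```

Comparing such an estimate against the entropy of the hypothesis class divided by the query budget is the kind of non-asymptotic benchmark the core claim refers to.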

What carries the argument

The query channel: the latent explanation is the message, each randomized masking evaluation of the black-box model is one channel use, and the channel's identification capacity sets the maximum rate at which the explanation can be reliably recovered.

If this is right

  • Super-pixel or token resolutions function as source-coding choices that set the entropy of the explanation and must be matched to the query capacity.
  • Gaussian noise and nonlinear curvature in the black-box reduce the effective capacity and produce waterfall and error-floor behavior.
  • There exist finite query budgets at which information theory permits reliable recovery, yet current convex surrogates still produce large errors.
  • High-resolution explanations become unattainable once the entropy induced by fine-grained masking exceeds what the query budget can carry at the per-query capacity; the sketch after this list illustrates how the required budget grows with resolution.
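
A back-of-the-envelope version of the resolution point in the last two items, under assumptions of our own: if the explanation is k-sparse and signed over d mask segments, its entropy is log2 C(d, k) + k bits, and a per-query capacity C implies a minimum budget of H/C queries. The capacity value below is a hypothetical placeholder (for instance, a Monte Carlo estimate like the one sketched earlier).

```python
# Entropy of a k-sparse signed explanation at masking resolution d, and
# the minimum query budget n >= H / C it implies. The per-query capacity
# C is a hypothetical placeholder, not a value from the paper.
from math import comb, log2

k = 3          # nonzero segments in the explanation (assumed)
C = 0.25       # assumed per-query capacity in bits

for d in (16, 64, 256, 1024, 4096):          # masking resolution
    H = log2(comb(d, k)) + k                 # support choice + signs, bits
    print(f"d={d:5d}  H={H:6.1f} bits  min queries ~ {H / C:7.0f}")
```

Since the entropy grows roughly like k·log2(d), each doubling of resolution adds a fixed number of bits that a fixed-capacity channel can only absorb with extra queries.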

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • New explanation algorithms could be designed to approach the capacity bound by replacing convex surrogates with decoders closer to the maximum-likelihood rule.
  • The same channel model could be applied to non-masking explanation techniques to derive comparable rate limits.
  • Measuring the effective capacity on large language models would quantify how much query budget is wasted by tokenization choices.

Load-bearing premise

Randomized masking perturbations supply information at a well-defined identification capacity per query that does not depend on the particular black-box model in a way that would invalidate the uniform converse.

What would settle it

A concrete experiment in which an explainer and decoder recover explanations whose entropy exceeds the Monte Carlo-estimated capacity times the query budget, while the error probability remains bounded away from one; such an outcome would contradict the strong converse.

Figures

Figures reproduced from arXiv: 2604.16689 by Erciyes Karakaya, Ozgur Ercetin.

Figure 1: The Query Channel: The latent explanation …
Figure 2: Unified information-theoretic and algorithmic behavior for the sparse …
Figure 3: Sample complexity at fixed query budget and varying resolution. Top: conceptual illustration comparing pixel-level SHAP at very high dimension …
Figure 4: Block error probability for dense (Ridge / OLS proxy) and sparse …
Figure 5: Empirical relationship between information content …
Figure 6: Effect of segmentation resolution for texts on prediction performance. Left: Next-token prediction accuracy on an in-context learning (ICL) task as a function of resolution, where resolution denotes the maximum allowed subword length under greedy encoding. Both oversegmentation (small resolution) and undersegmentation (large resolution) degrade performance, with an optimal region emerging at intermediate r…
Original abstract

Masking-based post-hoc explanation methods, such as KernelSHAP and LIME, estimate local feature importance by querying a black-box model under randomized perturbations. This paper formulates this procedure as communication over a query channel, where the latent explanation acts as a message and each masked evaluation is a channel use. Within this framework, the complexity of the explanation is captured by the entropy of the hypothesis class, while the query interface supplies information at a rate determined by an identification capacity per query. We derive a strong converse showing that, if the explanation rate exceeds this capacity, the probability of exact recovery necessarily converges to zero for any sequence of explainers and decoders. We also prove an achievability result establishing that a sparse maximum-likelihood decoder attains reliable recovery when the rate lies below capacity. A Monte Carlo estimator of mutual information yields a non-asymptotic query benchmark that we use to compare optimal decoding with Lasso- and OLS-based procedures that mirror LIME and KernelSHAP. Experiments reveal a range of query budgets where information theory permits reliable explanations but standard convex surrogates still fail. Finally, we interpret super-pixel resolution and tokenization for neural language models as a source-coding choice that sets the entropy of the explanation and show how Gaussian noise and nonlinear curvature degrade the query channel, induce waterfall and error-floor behavior, and render high-resolution explanations unattainable.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper models masking-based post-hoc explanation methods such as LIME and KernelSHAP as communication over a query channel, where the latent explanation is the message and each randomized masked evaluation of the black-box is a channel use. Explanation complexity is measured by the entropy of the hypothesis class, and the query interface provides information at an identification capacity per query. The central results are a strong converse showing that if the explanation rate (entropy divided by number of queries) exceeds this capacity then the probability of exact recovery converges to zero (equivalently, the error probability converges to one) for any sequence of explainers and decoders, plus an achievability result that a sparse maximum-likelihood decoder succeeds reliably below capacity. A Monte Carlo mutual-information estimator supplies a non-asymptotic benchmark used to compare optimal decoding against Lasso- and OLS-based procedures; experiments identify query-budget regimes where information theory permits reliable recovery but the convex surrogates fail. The work also interprets super-pixel resolution and tokenization as source-coding choices and analyzes how Gaussian noise and nonlinear curvature degrade the channel.

Significance. If the results hold, the paper supplies the first information-theoretic characterization of fundamental query limits for masking-based explanations, which could guide the design of query-efficient explainers and clarify when high-resolution explanations are information-theoretically impossible. The Monte Carlo benchmark and direct comparisons to LIME/KernelSHAP-style procedures are practical strengths, as is the explicit treatment of resolution choices as source coding. The derivation of both a strong converse and an achievability result, together with the reproducible Monte Carlo estimator, constitutes a solid technical contribution.

major comments (2)
  1. [Abstract] Abstract and model definition: the strong converse is stated to apply uniformly 'for any sequence of explainers and decoders' and to constrain all practical masking explainers, yet the channel law P(black-box output | mask) is induced by the unknown black-box function itself. It is therefore unclear whether the resulting identification capacity remains bounded independently of the black-box or can be made arbitrarily large by suitable choice of f, which would prevent the rate threshold from serving as a universal limit. This assumption is load-bearing for the universality claim.
  2. [Abstract] Abstract: the manuscript asserts derivation of a 'strong converse' and an 'achievability result' together with Monte Carlo experiments, but the provided text does not include the full channel definitions, the precise statement of the capacity, or the proof sketches. Without these, the central claims cannot be verified at the level required for a journal.
minor comments (2)
  1. Notation for the query channel, identification capacity, and explanation rate should be introduced with explicit equations in the main text rather than relying on the abstract.
  2. The Monte Carlo estimator of mutual information is described only at a high level; adding pseudocode or implementation details would improve reproducibility.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive review. We address each major comment below and indicate the revisions planned for the next version of the manuscript.

point-by-point responses
  1. Referee: [Abstract] Abstract and model definition: the strong converse is stated to apply uniformly 'for any sequence of explainers and decoders' and to constrain all practical masking explainers, yet the channel law P(black-box output | mask) is induced by the unknown black-box function itself. It is therefore unclear whether the resulting identification capacity remains bounded independently of the black-box or can be made arbitrarily large by suitable choice of f, which would prevent the rate threshold from serving as a universal limit. This assumption is load-bearing for the universality claim.

    Authors: The identification capacity is defined for the specific channel law induced by the fixed but arbitrary black-box f. The strong converse establishes that, for any such f, if the explanation rate exceeds the capacity of the induced query channel, then the probability of exact recovery converges to zero for every sequence of explainers and decoders. The result is therefore universal with respect to the choice of explainer and decoder, while the numerical value of the capacity naturally depends on f. This dependence is expected, as different black-box functions convey different amounts of information per masked query. We do not assert a single numerical bound independent of f. To remove any ambiguity, we will revise the abstract and the model section to state explicitly that the channel law and capacity are induced by f and to emphasize that the converse applies uniformly over explainers for any given f. revision: yes

  2. Referee: [Abstract] Abstract: the manuscript asserts derivation of a 'strong converse' and an 'achievability result' together with Monte Carlo experiments, but the provided text does not include the full channel definitions, the precise statement of the capacity, or the proof sketches. Without these, the central claims cannot be verified at the level required for a journal.

    Authors: The full manuscript defines the query channel and the induced channel law in Section 2, states the identification capacity, presents the strong converse and achievability theorems in Section 3 with proof sketches, and supplies complete proofs in the appendix. The Monte Carlo mutual-information estimator and its use as a benchmark are detailed in Section 4. Nevertheless, we agree that the abstract is highly condensed. We will expand the abstract to include a concise statement of the channel model and capacity, and we will ensure that the introduction highlights the theorem statements and proof outlines for easier verification. revision: yes

Circularity Check

0 steps flagged

No circularity: standard channel coding theorems applied to query channel model

full rationale

The paper models masking-based explanations as transmission over a query channel, with explanation rate defined as hypothesis-class entropy divided by number of queries and capacity defined via mutual information between the latent explanation and black-box outputs under masks. The strong converse (error probability tending to one above capacity) and achievability (reliable recovery below capacity via a sparse ML decoder) are obtained by direct application of classical information-theoretic results on identification capacity; these theorems are external and do not reduce to any fitted parameter, self-definition, or ansatz within the paper. The Monte Carlo MI estimator is used solely for non-asymptotic benchmarking against LIME/KernelSHAP surrogates and plays no role in the asymptotic derivation. No load-bearing self-citations, uniqueness theorems imported from the authors, or renaming of known results are present. The derivation chain is therefore self-contained and independent of its own inputs.
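
The contrast drawn here between the sparse ML decoder and convex surrogates is easy to reproduce in miniature. The sketch below is our own toy construction (a linear query channel with Gaussian noise, exhaustive ML over the finite set of k-sparse signed candidates, and scikit-learn's Lasso with top-k thresholding as the surrogate); it is illustrative only and does not reproduce the paper's experiments.

```python
# Toy decoder comparison (our construction): exhaustive ML over k-sparse
# signed explanations vs a Lasso surrogate, on queries y = <w, m> + noise.
from itertools import combinations, product
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
d, k, sigma, n_queries, n_trials = 10, 2, 0.8, 25, 200   # all assumed

def candidates():
    """Enumerate every k-sparse +/-1 explanation vector over d segments."""
    for idx in combinations(range(d), k):
        for signs in product((-1.0, 1.0), repeat=k):
            w = np.zeros(d)
            w[list(idx)] = signs
            yield w

cand = list(candidates())
ml_err = lasso_err = 0
for _ in range(n_trials):
    w_true = cand[rng.integers(len(cand))]
    M = rng.integers(0, 2, size=(n_queries, d)).astype(float)   # masks
    y = M @ w_true + sigma * rng.standard_normal(n_queries)

    # Sparse ML decoder: least residual over the finite candidate set
    # (maximum likelihood under Gaussian noise).
    w_ml = min(cand, key=lambda w: np.sum((y - M @ w) ** 2))
    ml_err += int(not np.array_equal(w_ml, w_true))

    # Convex surrogate: Lasso fit, keep top-k coefficients, snap to signs.
    coef = Lasso(alpha=0.1, fit_intercept=False).fit(M, y).coef_
    top = np.argsort(-np.abs(coef))[:k]
    w_hat = np.zeros(d)
    w_hat[top] = np.sign(coef[top])
    lasso_err += int(not np.array_equal(w_hat, w_true))

print(f"block error: ML {ml_err / n_trials:.2f}, Lasso {lasso_err / n_trials:.2f}")
```

In regimes where the query budget sits between what ML needs and what the surrogate needs, any gap of the kind the rationale describes would show up as a difference in block error rate.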

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 1 invented entity

The central claim rests on modeling masking explanations as a memoryless query channel whose capacity is given by identification rate, plus standard information-theoretic theorems; no explicit free parameters are fitted to data in the theoretical claims.

axioms (2)
  • standard math Strong converse theorem for discrete memoryless channels
    Invoked to prove that rate above capacity forces error probability to 1.
  • domain assumption Achievability of reliable recovery below capacity with sparse ML decoder
    Assumes the hypothesis class and query responses allow the decoder to attain the rate.
invented entities (1)
  • Query channel no independent evidence
    purpose: Abstract model of information transfer from randomized masked evaluations to explanation recovery
    New modeling abstraction introduced to apply channel coding tools to post-hoc explanations.

pith-pipeline@v0.9.0 · 5545 in / 1617 out tokens · 40126 ms · 2026-05-10T08:06:36.153392+00:00 · methodology


Reference graph

Works this paper leans on

37 extracted references · 3 canonical work pages · 1 internal anchor

  1. D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
  2. E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Transactions on Information Theory, vol. 52, no. 2, pp. 489–509, 2006.
  3. D. L. Donoho, A. Maleki, and A. Montanari, “Message-passing algorithms for compressed sensing,” Proceedings of the National Academy of Sciences, vol. 106, no. 45, pp. 18914–18919, 2009.
  4. M. J. Wainwright, “Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting,” IEEE Transactions on Information Theory, vol. 55, no. 12, pp. 5728–5741, 2009.
  5. G. Reeves and M. Gastpar, “Compressed sensing phase transitions: Rigorous bounds versus replica predictions,” in 2012 46th Annual Conference on Information Sciences and Systems (CISS). IEEE, 2012, pp. 1–6.
  6. Y. Wu and S. Verdú, “Optimal phase transitions in compressed sensing,” IEEE Transactions on Information Theory, vol. 56, no. 8, pp. 3721–3748, 2010.
  7. W. Wang, M. J. Wainwright, and K. Ramchandran, “Information-theoretic limits on sparse signal recovery: Dense versus sparse measurement matrices,” IEEE Transactions on Information Theory, vol. 56, no. 6, pp. 2967–2979, 2010.
  8. A. Xu and M. Raginsky, “Information-theoretic analysis of generalization capability of learning algorithms,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 30, 2017.
  9. A. Achille and S. Soatto, “Information dropout: Learning optimal representations through noisy computation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 12, pp. 2897–2905, 2018.
  10. N. Tishby, F. C. Pereira, and W. Bialek, “The information bottleneck method,” arXiv preprint physics/0004057, 2000.
  11. S. M. Lundberg and S.-I. Lee, “A unified approach to interpreting model predictions,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 30, 2017, pp. 4765–4774.
  12. E. Strumbelj and I. Kononenko, “Explaining prediction models and individual predictions with feature contributions,” Knowledge and Information Systems, vol. 41, no. 3, pp. 647–665, 2014.
  13. M. T. Ribeiro, S. Singh, and C. Guestrin, “‘Why should I trust you?’: Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 1135–1144.
  14. D. Alvarez-Melis and T. S. Jaakkola, “On the robustness of interpretability methods,” in Proceedings of the ICML Workshop on Human Interpretability in Machine Learning (WHI), 2018.
  15. D. Slack, S. Hilgard, E. Jia, S. Singh, and H. Lakkaraju, “Fooling LIME and SHAP: Adversarial attacks on post-hoc explanation methods,” in Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society, 2020, pp. 180–186.
  16. A. Ghorbani, A. Abid, and J. Zou, “Interpretation of neural networks is fragile,” in Proceedings of the AAAI Conference on Artificial Intelligence, vol. 33, 2019, pp. 3681–3688.
  17. D. Garreau and U. von Luxburg, “Explaining the explainer: A first theoretical analysis of LIME,” Journal of Machine Learning Research, vol. 22, no. 128, pp. 1–31, 2021.
  18. J. Adebayo, J. Gilmer, M. Muelly, I. Goodfellow, M. Hardt, and B. Kim, “Sanity checks for saliency maps,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 31, 2018.
  19. C.-K. Yeh, C.-Y. Hsieh, A. Suggala, D. I. Inouye, and P. Ravikumar, “On the (in)fidelity and sensitivity of explanations,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, 2019.
  20. S. Hooker, D. Erhan, P.-J. Kindermans, and B. Kim, “A benchmark for interpretability methods in deep neural networks,” in Advances in Neural Information Processing Systems (NeurIPS), vol. 32, 2019.
  21. I. Covert and S.-I. Lee, “Improving KernelSHAP: Practical Shapley value estimation using linear regression,” in International Conference on Artificial Intelligence and Statistics (AISTATS). PMLR, 2021, pp. 3457–3465.
  22. C. E. Shannon, “A mathematical theory of communication,” The Bell System Technical Journal, vol. 27, no. 3, pp. 379–423, 1948.
  23. J. Wolfowitz, “The coding of messages subject to chance errors,” Illinois Journal of Mathematics, vol. 1, no. 4, pp. 591–606, 1957.
  24. T. M. Cover and J. A. Thomas, Elements of Information Theory. John Wiley & Sons, 2006.
  25. S. Rao, “The limits of AI explainability: An algorithmic information theory approach,” 2025. [Online]. Available: https://arxiv.org/abs/2504.20676
  26. R. Achanta, A. Shaji, K. Smith, A. Lucchi, P. Fua, and S. Süsstrunk, “SLIC superpixels compared to state-of-the-art superpixel methods,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 34, no. 11, pp. 2274–2282, 2012.
  27. A. E. Hoerl and R. W. Kennard, “Ridge regression: Biased estimation for nonorthogonal problems,” Technometrics, vol. 12, no. 1, pp. 55–67, 1970.
  28. N. Rajaraman, J. Jiao, and K. Ramchandran, “An analysis of tokenization: Transformers under Markov data,” Advances in Neural Information Processing Systems, vol. 37, pp. 62503–62556, 2024.
  29. J. Libovický, H. Schmid, and A. Fraser, “Why don’t people use character-level machine translation?” in Findings of the Association for Computational Linguistics: ACL 2022, 2022, pp. 2470–2485.
  30. T. Kudo and J. Richardson, “SentencePiece: A simple and language independent subword tokenizer and detokenizer for neural text processing,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, 2018, pp. 66–71.
  31. L. Xue, A. Barua, N. Constant, R. Al-Rfou, S. Narang, M. Kale, A. Roberts, and C. Raffel, “ByT5: Towards a token-free future with pre-trained byte-to-byte models,” Transactions of the Association for Computational Linguistics, vol. 10, pp. 291–306, 2022.
  32. R. Sennrich, B. Haddow, and A. Birch, “Neural machine translation of rare words with subword units,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 2016, pp. 1715–1725.
  33. M. Ali, M. Fromm, K. Thellmann, R. Rutmann, M. Lübbering, J. Leveling, K. Klug, J. Ebert, N. Doll, J. Buschhoff et al., “Tokenizer choice for LLM training: Negligible or crucial?” in Findings of the Association for Computational Linguistics: NAACL 2024, 2024, pp. 3907–3924.
  34. J. Kaplan, S. McCandlish, T. Henighan, T. B. Brown, B. Chess, R. Child, S. Gray, A. Radford, J. Wu, and D. Amodei, “Scaling laws for neural language models,” arXiv preprint arXiv:2001.08361, 2020.
  35. A. Karpathy, “microgpt,” https://karpathy.github.io/2026/02/12/microgpt/, 2026, accessed 2026-04-12.
  36. S. Takase, R. Ri, S. Kiyono, and T. Kato, “Large vocabulary size improves large language models,” in Findings of the Association for Computational Linguistics: ACL 2025, 2025, pp. 1015–1026.
  37. I. Csiszár and J. Körner, Information Theory: Coding Theorems for Discrete Memoryless Systems. Cambridge University Press, 2011.