Zero-shot Concept Bottleneck Models

Daiki Chijiwa; Kosuke Nishida; Shin'ya Yamaguchi; Yasutoshi Ida

arXiv: 2502.09018 · v2 · submitted 2025-02-13 · cs.LG · cs.AI · cs.CV

Pith reviewed 2026-05-23 03:35 UTC · model grok-4.3

classification cs.LG · cs.AI · cs.CV

keywords zero-shot learning · concept bottleneck models · interpretability · concept retrieval · cross-modal search · sparse linear regression · explainable AI · machine learning

The pith

Zero-shot concept bottleneck models predict concepts and labels without training by retrieving from a web-scale concept bank.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to remove the training step from concept bottleneck models entirely. Standard CBMs learn input-to-concept and concept-to-label mappings on a target dataset, which costs data collection and compute. Z-CBMs instead keep a fixed bank of millions of web-extracted concepts, retrieve the most relevant ones for any input through cross-modal search, and then use sparse linear regression to pick the few concepts that best predict the label. This yields both the final prediction and an explicit list of concepts that a user can inspect or edit, all without ever training on the target task.

Core claim

Z-CBMs predict concepts and labels in a fully zero-shot manner without training neural networks. They utilize a large-scale concept bank composed of millions of vocabulary items extracted from the web, map inputs to concepts via cross-modal concept retrieval, and infer labels via concept regression with sparse linear regression on the retrieved concepts.

What carries the argument

A large-scale web concept bank that supports dynamic retrieval of input-related concepts by cross-modal search, followed by sparse linear regression that selects the essential concepts for label prediction.
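
The retrieval step can be pictured as a nearest-neighbor search in a shared embedding space. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes inputs and concepts have already been embedded as vectors (random toy vectors stand in for them here), and the names `retrieve_concepts`, `bank_emb`, and `query_emb` are hypothetical. It scores every concept by cosine similarity and keeps the k best.

```python
import numpy as np

def l2_normalize(x, axis=-1, eps=1e-12):
    """Scale vectors to unit length so a dot product equals cosine similarity."""
    return x / np.maximum(np.linalg.norm(x, axis=axis, keepdims=True), eps)

def retrieve_concepts(query_emb, bank_emb, k=5):
    """Return indices and cosine similarities of the k bank entries closest to the query."""
    q = l2_normalize(np.asarray(query_emb, dtype=float))
    bank = l2_normalize(np.asarray(bank_emb, dtype=float))
    sims = bank @ q                           # cosine similarity to every concept
    k = min(k, sims.shape[0])
    top = np.argpartition(-sims, k - 1)[:k]   # unordered indices of the k largest
    top = top[np.argsort(-sims[top])]         # order those k by decreasing similarity
    return top, sims[top]

# Toy data (assumption): random vectors stand in for concept text embeddings.
rng = np.random.default_rng(0)
bank_emb = rng.normal(size=(1000, 64))
# The query is a noisy copy of concept 42, so that concept should rank first.
query_emb = bank_emb[42] + 0.05 * rng.normal(size=64)

idx, scores = retrieve_concepts(query_emb, bank_emb, k=5)
print("retrieved concept ids:", idx)
```

With millions of concepts, an exhaustive matrix product like this would likely give way to an approximate nearest-neighbor index. The Faiss library appears among the paper's extracted references, though this page does not say how it is used.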

If this is right

Any new classification task can be addressed immediately without collecting or labeling a target dataset.
The model outputs an explicit list of activated concepts that explain each prediction.
A user can intervene by forcing selected concepts on or off and observe the changed label prediction.
The same fixed concept bank and retrieval machinery works across multiple unrelated domains.
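
The intervention idea in the list above can be sketched with a toy linear concept-to-label map. Everything here is an assumption made for illustration: the concept names, the weight matrix `label_weights`, and the helpers `predict_label` and `intervene` are hypothetical and are not taken from the paper. The point is only that forcing a concept off or on changes the label scores in a way a user can trace.

```python
import numpy as np

def predict_label(activations, label_weights):
    """Score each label as a weighted sum of concept activations; return argmax and scores."""
    scores = label_weights @ activations
    return int(np.argmax(scores)), scores

def intervene(activations, overrides):
    """Return a copy of the activations with chosen concepts forced to given values."""
    edited = np.array(activations, dtype=float, copy=True)
    for concept_index, value in overrides.items():
        edited[concept_index] = value
    return edited

# Hypothetical concepts and labels, chosen only to make the example readable.
concepts = ["red crest", "long beak", "webbed feet"]
labels = ["woodpecker", "duck"]
label_weights = np.array([[1.0, 0.5, -1.0],    # woodpecker row
                          [-0.5, 0.2, 1.5]])   # duck row
activations = np.array([0.2, 0.6, 0.7])

before, _ = predict_label(activations, label_weights)

# Force "webbed feet" off and "red crest" on, then predict again.
edited = intervene(activations, {2: 0.0, 0: 1.0})
after, _ = predict_label(edited, label_weights)

print(labels[before], "->", labels[after])
```

Because the edit is applied to a copy, the original activations stay available for comparing the prediction before and after the intervention.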

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The approach could be combined with existing large vision-language models to improve the quality of the initial concept retrieval step.
Domains with very fine-grained or technical terminology may require an expanded or curated bank beyond the general web extraction.
Because no parameters are updated on the target task, the method sidesteps catastrophic forgetting when moving between tasks.

Load-bearing premise

A web-extracted bank of millions of vocabulary items is comprehensive and accurately searchable by cross-modal methods for arbitrary inputs in any domain.

What would settle it

A specialized domain on which the retrieved concepts produce label predictions no better than chance, or fail to match the human-interpretable features that actually drive the label.

Figures

Figures reproduced from arXiv: 2502.09018 by Daiki Chijiwa, Kosuke Nishida, Shin'ya Yamaguchi, Yasutoshi Ida.

**Figure 2.** Concept retrieval and concept regression. (a) Concept retrieval searches concept candidates close to an input image

**Figure 3.** Concept Deletion (Bird). Plot: top-1 accuracy (%) against the number of inserted concepts per sample, for intervened and not-intervened predictions.

**Figure 4.** Concept Insertion (Bird). The accompanying text states that Z-CBMs can correct the final output by modifying the concept of interest through intervention.

**Figure 5.** Qualitative evaluation of predicted concepts on the ImageNet validation set. While Label-free CBMs sometimes…

**Figure 7.** Effects of varying λ in Eq. 3 with quantitative and qualitative evaluations. For quantitative evaluation, we measured the L2 distance between image-label features and concept-label features as the modality gap by following (Liang et al., 2022). The L2 distances were 1.74 × 10⁻³ in image-to-label and 0.86 × 10⁻³ in concept-to-label, demonstrating that Z-CBMs largely reduce the modality gap by concept regressi…

**Figure 8.** Accuracy vs. inference time by varying retrieved concept number

Original abstract

Concept bottleneck models (CBMs) are inherently interpretable and intervenable neural network models, which explain their final label prediction by the intermediate prediction of high-level semantic concepts. However, they require target task training to learn input-to-concept and concept-to-label mappings, incurring target dataset collections and training resources. In this paper, we present zero-shot concept bottleneck models (Z-CBMs), which predict concepts and labels in a fully zero-shot manner without training neural networks. Z-CBMs utilize a large-scale concept bank, which is composed of millions of vocabulary extracted from the web, to describe arbitrary input in various domains. For the input-to-concept mapping, we introduce concept retrieval, which dynamically finds input-related concepts by the cross-modal search on the concept bank. In the concept-to-label inference, we apply concept regression to select essential concepts from the retrieved concepts by sparse linear regression. Through extensive experiments, we confirm that our Z-CBMs provide interpretable and intervenable concepts without any additional training. Code will be available at https://github.com/yshinya6/zcbm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

Z-CBMs swap neural training for web retrieval plus sparse regression on target labels, so the fully zero-shot claim only covers the neural part.

The paper's main move is to build concept bottleneck models without target-task neural training. It draws on a concept bank of millions of vocabulary items extracted from the web, retrieves input-related concepts via cross-modal search, and then uses sparse linear regression to pick which concepts drive the label prediction. That combination is new relative to earlier CBM work, which trained both the concept predictor and the label mapper on task data. The retrieval step is genuinely zero-shot, and the bank size gives reasonable coverage across multiple domains without new data collection. Experiments reportedly confirm that the resulting concepts remain interpretable and intervenable.

The regression step is the clear soft spot. Sparse linear regression fits coefficients to target labels, so labeled task examples are still required. The abstract stresses operation "in a fully zero-shot manner" and "without any additional training", yet this fitting is training, only linear rather than neural. That narrows the practical difference from standard CBMs more than the headline suggests. The paper also assumes the web bank is both comprehensive and accurately searchable for arbitrary inputs; no error analysis on retrieval failures appears in the abstract.

The citation pattern is standard and points to prior CBM literature without obvious gaps. This is worth a serious referee for the XAI crowd interested in reducing data needs, even if the zero-shot wording needs tightening. Send it to review.

Referee Report

1 major / 2 minor

Summary. The paper proposes zero-shot concept bottleneck models (Z-CBMs) that achieve interpretable and intervenable predictions by constructing a large-scale web-extracted concept bank, performing input-to-concept mapping via cross-modal retrieval, and performing concept-to-label mapping via sparse linear regression, all without training any neural networks on the target task.

Significance. If the zero-shot claim can be sustained, the approach would eliminate the need for target-task data collection and NN training that standard CBMs require, enabling rapid deployment of concept-based models across domains while preserving intervention capabilities.

major comments (1)

[Abstract] The central claim that Z-CBMs operate 'in a fully zero-shot manner without training neural networks' and 'without any additional training' is directly contradicted by the concept regression step, which applies sparse linear regression to fit coefficients that predict class labels from retrieved concept activations; this fitting necessarily uses labeled target-task examples and therefore constitutes task-specific training.

minor comments (2)

[Abstract] The description of the concept bank construction (millions of vocabulary items extracted from the web) lacks detail on filtering, deduplication, or domain coverage guarantees that would be needed to support the 'arbitrary input in various domains' claim.
[Abstract] No quantitative comparison is provided in the abstract against baselines that also avoid NN training (e.g., zero-shot CLIP with post-hoc linear probes), making it difficult to isolate the contribution of the concept bank and retrieval steps.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for highlighting an important point of clarification regarding our zero-shot claims. We address the major comment below.

Referee: [Abstract] The central claim that Z-CBMs operate 'in a fully zero-shot manner without training neural networks' and 'without any additional training' is directly contradicted by the concept regression step, which applies sparse linear regression to fit coefficients that predict class labels from retrieved concept activations; this fitting necessarily uses labeled target-task examples and therefore constitutes task-specific training.

Authors: We agree that the phrasing in the abstract is imprecise and could be misleading. The sparse linear regression step for concept-to-label mapping does require fitting coefficients on labeled target-task examples and therefore represents task-specific training (albeit a lightweight linear model rather than a neural network). The input-to-concept mapping via cross-modal retrieval on the web-scale concept bank requires no target-task training or neural network optimization, which is the primary distinction from standard CBMs. We will revise the abstract (and related claims in the introduction) to state that Z-CBMs require no neural network training on the target task while explicitly noting the use of sparse linear regression on target labels for the final mapping. This revision will be incorporated in the next version.

Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation uses external web data and standard off-the-shelf components without self-referential reduction

The paper's Z-CBM pipeline consists of (1) retrieving concepts from a web-scale external bank via cross-modal search and (2) applying sparse linear regression on retrieved activations to predict labels. Neither step is defined in terms of the other, nor does any claimed 'prediction' reduce by construction to a parameter fitted from the target result itself. No self-citation chain, uniqueness theorem, or ansatz smuggling is invoked to justify the core mappings. The method is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach depends on the quality and coverage of the externally sourced concept bank and on the reliability of existing cross-modal models for retrieval; no new entities are postulated.

free parameters (1)

sparsity regularization strength in concept regression
Sparse linear regression requires a hyperparameter that controls how many concepts are retained; its value is not derived from first principles.
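
To make the role of this hyperparameter concrete, the sketch below solves a Lasso problem, minimizing 0.5·||A w − y||² + λ·||w||₁, with proximal gradient descent (ISTA) and counts how many coefficients survive at different strengths. It is an illustration under stated assumptions rather than the paper's solver: the matrix `A` holds toy random features for 20 hypothetical retrieved concepts, the target `y` is a generic placeholder, and `lasso_ista` is a name introduced here.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry toward zero by t; entries smaller than t become exactly zero."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_ista(A, y, lam, n_iter=500):
    """Minimize 0.5*||A w - y||^2 + lam*||w||_1 by proximal gradient descent (ISTA)."""
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    w = np.zeros(A.shape[1])
    for _ in range(n_iter):
        grad = A.T @ (A @ w - y)               # gradient of the squared-error term
        w = soft_threshold(w - step * grad, step * lam)
    return w

# Toy problem (assumption): 20 candidate concepts, only 3 of which matter.
rng = np.random.default_rng(0)
A = rng.normal(size=(64, 20))
w_true = np.zeros(20)
w_true[[1, 5, 9]] = [1.5, -2.0, 1.0]
y = A @ w_true

# For lam >= max|A^T y| the all-zero vector is the Lasso solution.
lam_max = np.max(np.abs(A.T @ y))

nnz = {}
for fraction in (0.001, 0.1, 1.0):
    w = lasso_ista(A, y, fraction * lam_max)
    nnz[fraction] = int(np.count_nonzero(w))
    print(f"lam = {fraction} * lam_max -> {nnz[fraction]} nonzero coefficients")

w_weak = lasso_ista(A, y, 0.001 * lam_max)     # weak penalty: close to w_true
```

Expressing λ as a fraction of `lam_max` is a common convenience for scanning the useful range, since any value at or above `lam_max` removes every concept.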

axioms (1)

Domain assumption: cross-modal similarity search on the web concept bank reliably surfaces input-relevant concepts.
This premise is invoked directly in the input-to-concept mapping step described in the abstract.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Sparse Concept Anchoring for Interpretable and Controllable Neural Representations
cs.LG · 2025-12 · unverdicted · novelty 6.0

Sparse Concept Anchoring biases neural latent spaces toward targeted concepts using under 0.1% labels per concept, enabling reversible steering via projection and permanent removal via weight ablation with minimal sid...

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · cited by 1 Pith paper · 2 internal anchors

[1] Bird, S. NLTK: the natural language toolkit. In Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions, pp. 69–72, 2006.

[2] Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., and Jégou, H. The Faiss library. arXiv preprint arXiv:2401.08281.

[3] Hessel, J., Holtzman, A., Forbes, M., Le Bras, R., and Choi, Y. CLIPScore: A reference-free evaluation metric for image captioning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 7514–7528, 2021.

[4] Rao, S., Mahajan, S., Böhle, M., and Schiele, B. Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery. In Proceedings of the European Conference on Computer Vision, 2024a. Rao, S., Mahajan, S., Böhle, M., and Schiele, B. Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery. arXiv prepr...

[5] Soomro, K. UCF101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402.

[6] Srivastava, D., Yan, G., and Weng, T.-W. VLG-CBM: Training concept bottleneck models with vision-language guidance. arXiv preprint arXiv:2408.01432.

[7] Tan, A., Zhou, F., and Chen, H. Explain via any concept: Concept bottleneck model with open vocabulary concepts. arXiv preprint arXiv:2408.02265.

[8] Wang, Z., Zhu, J., Chen, P., Peng, H., Zhang, X., Wang, A., Zheng, Y., Zhu, J., and Wang, X. skscope: Fast sparsity-constrained optimization in Python. arXiv preprint arXiv:2403.18540.

[9] Xiao, J., Hays, J., Ehinger, K. A., Oliva, A., and Torralba, A. SUN database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 3485–3492. IEEE, 2010.

[10] a photo of [class name]

Table 7: CLIP-Score on 12 classification datasets. We compute the averaged CLIP-Scores between images and concepts with top-10 absolute coefficients.
Label-free CBM: Air 0.6824, Bird 0.7818, Cal 0.7023, Car 0.7106, DTD 0.6552, Euro 0.6179, Flo 0.6988, Food 0.6959, IN 0.7202, Pet 0.7119, SUN 0.7327, UCF 0.6688, Avg. 0.6982 ...

[11] 3 with quantitative and qualitative evaluations

To confirm this, we conduct a deeper analysis of the effects of Z-CBMs on the modality gap. Figure 6: PCA feature visualization of Z-CBMs (legend: concept features, image features, label features, reconstructed features). Plot over Lasso λ from 1e-3 to 1e-8 ...

[12] Reproducibility Statement

if the data source can be biased. Reproducibility Statement. As described in Sec. 4 and 5, the implementation of the proposed method uses a publicly available code base. For example, the VLM backbones are publicly available in the OpenAI CLIP and Open CLIP GitHub repositories. All datasets are also available on the web; see the references in Sec. 5....
