pith.

arxiv: 2502.09018 · v2 · submitted 2025-02-13 · 💻 cs.LG · cs.AI · cs.CV

Zero-shot Concept Bottleneck Models

Pith reviewed 2026-05-23 03:35 UTC · model grok-4.3

classification 💻 cs.LG · cs.AI · cs.CV
keywords zero-shot learning · concept bottleneck models · interpretability · concept retrieval · cross-modal search · sparse linear regression · explainable AI · machine learning

The pith

Zero-shot concept bottleneck models predict concepts and labels without training by retrieving from a web-scale concept bank.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows how to remove the training step from concept bottleneck models entirely. Standard CBMs learn input-to-concept and concept-to-label mappings on a target dataset, which costs data collection and compute. Z-CBMs instead keep a fixed bank of millions of web-extracted concepts, retrieve the most relevant ones for any input through cross-modal search, and then use sparse linear regression to pick the few concepts that best predict the label. This yields both the final prediction and an explicit list of concepts that a user can inspect or edit, all without ever training on the target task.

Core claim

Z-CBMs predict concepts and labels in a fully zero-shot manner without training neural networks. They utilize a large-scale concept bank composed of millions of vocabulary items extracted from the web, map inputs to concepts via cross-modal concept retrieval, and infer labels via concept regression with sparse linear regression on the retrieved concepts.

What carries the argument

A large-scale web concept bank that supports dynamic retrieval of input-related concepts by cross-modal search, followed by sparse linear regression that selects the essential concepts for label prediction.
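
The retrieval half of this machinery can be illustrated with a minimal sketch. It assumes inputs and concepts already live in a shared embedding space, as a CLIP-style encoder would provide, and it uses brute-force cosine similarity in NumPy where a real system would likely use an approximate index. The function name `retrieve_concepts`, the bank size, and the random embeddings are illustrative placeholders, not the authors' implementation.

```python
import numpy as np


def retrieve_concepts(query, bank, k):
    """Return indices and cosine similarities of the k bank rows closest to query."""
    q = query / np.linalg.norm(query)
    b = bank / np.linalg.norm(bank, axis=1, keepdims=True)
    sims = b @ q                      # cosine similarity to every concept
    top = np.argsort(-sims)[:k]       # indices of the k most similar concepts
    return top, sims[top]


rng = np.random.default_rng(0)
# Hypothetical concept bank: 1000 concepts embedded in 64 dimensions.
bank = rng.normal(size=(1000, 64))
# Hypothetical input embedding: a slightly perturbed copy of concept 42.
query = bank[42] + 0.1 * rng.normal(size=64)

idx, sims = retrieve_concepts(query, bank, k=5)
print(idx, sims)
```

Only the top-k concepts are passed on, so the later regression step works on a small candidate set rather than the whole bank.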

If this is right

  • Any new classification task can be addressed immediately without collecting or labeling a target dataset.
  • The model outputs an explicit list of activated concepts that explain each prediction.
  • A user can intervene by forcing selected concepts on or off and observe the changed label prediction.
  • The same fixed concept bank and retrieval machinery work across multiple unrelated domains.
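
The intervention point in the list above can be made concrete with a toy sketch. It assumes the label score is a weighted sum of per-concept association scores. That is a simplification chosen for illustration, and the concept names, labels, `assoc` matrix, and weights are all invented. Forcing a concept off sets its weight to zero; forcing it on sets a positive weight.

```python
import numpy as np

concepts = ["red beak", "webbed feet", "long tail"]
labels = ["duck", "parrot"]
# Hypothetical concept-to-label association scores (rows: concepts, columns: labels).
assoc = np.array([[0.1, 0.9],
                  [0.9, 0.1],
                  [0.2, 0.8]])
# Hypothetical concept weights for one input; "long tail" was not selected.
weights = np.array([0.5, 0.6, 0.0])


def predict(w):
    """Pick the label with the highest weighted concept score."""
    return labels[int(np.argmax(w @ assoc))]


def intervene(w, concept, value):
    """Return a copy of w with one concept's weight forced to value."""
    out = w.copy()
    out[concepts.index(concept)] = value
    return out


before = predict(weights)
after_delete = predict(intervene(weights, "webbed feet", 0.0))  # concept forced off
after_insert = predict(intervene(weights, "long tail", 1.0))    # concept forced on
print(before, after_delete, after_insert)
```

Because the prediction is a function of the concept weights alone, each edit is transparent: the user changes one number and can see which label wins afterwards.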

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The approach could be combined with existing large vision-language models to improve the quality of the initial concept retrieval step.
  • Domains with very fine-grained or technical terminology may require an expanded or curated bank beyond the general web extraction.
  • Because no parameters are updated on the target task, the method sidesteps catastrophic forgetting when moving between tasks.

Load-bearing premise

A web-extracted bank of millions of vocabulary items is comprehensive and accurately searchable by cross-modal methods for arbitrary inputs in any domain.

What would settle it

A specialized domain in which the retrieved concepts yield label predictions no better than chance, or fail to match the human-interpretable features that actually drive the label.

Figures

Figures reproduced from arXiv: 2502.09018 by Daiki Chijiwa, Kosuke Nishida, Shin'ya Yamaguchi, Yasutoshi Ida.

Figure 1: Zero-shot concept bottleneck models (Z-CBMs). Z-CBMs predict concepts for input by retrieving them from a …
Figure 2: Concept retrieval and concept regression. (a) Concept retrieval searches concept candidates close to an input image …
Figure 3: Concept Deletion (Bird). Plot labels captured with this caption: x-axis, Number of Inserted Concepts / Sample; y-axis, Top-1 Accuracy (%); legend, Intervened and Not Intervened.
Figure 4: Concept Insertion (Bird). Adjacent text from the paper: "This indicates that Z-CBMs can correct the final output by modifying the concept of interest through intervention."
Figure 5: Qualitative evaluation of predicted concepts on the ImageNet validation set. While Label-free CBMs sometimes …
Figure 7: Effects of varying λ in Eq. 3 with quantitative and qualitative evaluations. For quantitative evaluation, we measured the L2 distance between image-label features and concept-label features as the modality gap by following (Liang et al., 2022). The L2 distances were 1.74 × 10^-3 in image-to-label and 0.86 × 10^-3 in concept-to-label, demonstrating that Z-CBMs largely reduce the modality gap by concept regressi…
Figure 8: Accuracy vs. inference time by varying retrieved concept number
Abstract

Concept bottleneck models (CBMs) are inherently interpretable and intervenable neural network models, which explain their final label prediction by the intermediate prediction of high-level semantic concepts. However, they require target task training to learn input-to-concept and concept-to-label mappings, incurring target dataset collections and training resources. In this paper, we present zero-shot concept bottleneck models (Z-CBMs), which predict concepts and labels in a fully zero-shot manner without training neural networks. Z-CBMs utilize a large-scale concept bank, which is composed of millions of vocabulary extracted from the web, to describe arbitrary input in various domains. For the input-to-concept mapping, we introduce concept retrieval, which dynamically finds input-related concepts by the cross-modal search on the concept bank. In the concept-to-label inference, we apply concept regression to select essential concepts from the retrieved concepts by sparse linear regression. Through extensive experiments, we confirm that our Z-CBMs provide interpretable and intervenable concepts without any additional training. Code will be available at https://github.com/yshinya6/zcbm.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper proposes zero-shot concept bottleneck models (Z-CBMs) that achieve interpretable and intervenable predictions by constructing a large-scale web-extracted concept bank, performing input-to-concept mapping via cross-modal retrieval, and performing concept-to-label mapping via sparse linear regression, all without training any neural networks on the target task.

Significance. If the zero-shot claim can be sustained, the approach would eliminate the need for target-task data collection and NN training that standard CBMs require, enabling rapid deployment of concept-based models across domains while preserving intervention capabilities.

major comments (1)
  1. [Abstract] Abstract: the central claim that Z-CBMs operate 'in a fully zero-shot manner without training neural networks' and 'without any additional training' is directly contradicted by the concept regression step, which applies sparse linear regression to fit coefficients that predict class labels from retrieved concept activations; this fitting necessarily uses labeled target-task examples and therefore constitutes task-specific training.
minor comments (2)
  1. [Abstract] The description of the concept bank construction (millions of vocabulary items extracted from the web) lacks detail on filtering, deduplication, or domain coverage guarantees that would be needed to support the 'arbitrary input in various domains' claim.
  2. [Abstract] No quantitative comparison is provided in the abstract against baselines that also avoid NN training (e.g., zero-shot CLIP with post-hoc linear probes), making it difficult to isolate the contribution of the concept bank and retrieval steps.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their careful reading of the manuscript and for highlighting an important point of clarification regarding our zero-shot claims. We address the major comment below.

Point-by-point responses
  1. Referee: [Abstract] Abstract: the central claim that Z-CBMs operate 'in a fully zero-shot manner without training neural networks' and 'without any additional training' is directly contradicted by the concept regression step, which applies sparse linear regression to fit coefficients that predict class labels from retrieved concept activations; this fitting necessarily uses labeled target-task examples and therefore constitutes task-specific training.

    Authors: We agree that the phrasing in the abstract is imprecise and could be misleading. The sparse linear regression step for concept-to-label mapping does require fitting coefficients on labeled target-task examples and therefore represents task-specific training (albeit a lightweight linear model rather than a neural network). The input-to-concept mapping via cross-modal retrieval on the web-scale concept bank requires no target-task training or neural network optimization, which is the primary distinction from standard CBMs. We will revise the abstract (and related claims in the introduction) to state that Z-CBMs require no neural network training on the target task while explicitly noting the use of sparse linear regression on target labels for the final mapping. This revision will be incorporated in the next version.

    Revision: yes

Circularity Check

0 steps flagged

No circularity: derivation uses external web data and standard off-the-shelf components without self-referential reduction

Full rationale

The paper's Z-CBM pipeline consists of (1) retrieving concepts from a web-scale external bank via cross-modal search and (2) applying sparse linear regression on retrieved activations to predict labels. Neither step is defined in terms of the other, nor does any claimed 'prediction' reduce by construction to a parameter fitted from the target result itself. No self-citation chain, uniqueness theorem, or ansatz smuggling is invoked to justify the core mappings. The method is therefore self-contained against external benchmarks and receives the default non-circularity finding.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach depends on the quality and coverage of the externally sourced concept bank and on the reliability of existing cross-modal models for retrieval; no new entities are postulated.

free parameters (1)
  • sparsity regularization strength in concept regression
    Sparse linear regression requires a hyperparameter that controls how many concepts are retained; its value is not derived from first principles.
axioms (1)
  • domain assumption: Cross-modal similarity search on the web concept bank reliably surfaces input-relevant concepts
    This premise is invoked directly in the input-to-concept mapping step described in the abstract.
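
The free parameter in the ledger above, the sparsity regularization strength, can be illustrated with a small sketch of how it controls the number of retained concepts. The sketch solves a lasso problem with a hand-written proximal gradient (ISTA) loop in NumPy. The target vector `y`, the synthetic feature matrix `X`, and the chosen values of λ are assumptions for illustration; this page does not specify what Z-CBMs regress on or which solver the authors use.

```python
import numpy as np


def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/(2n))||y - Xw||^2 + lam*||w||_1 by proximal gradient (ISTA)."""
    n = X.shape[0]
    # Step size = 1 / Lipschitz constant of the smooth term's gradient.
    step = n / np.linalg.norm(X, 2) ** 2
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        z = w - step * grad
        # Soft thresholding sets small coefficients exactly to zero.
        w = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return w


rng = np.random.default_rng(0)
# Hypothetical features: 20 retrieved concepts (columns) in 64 dimensions (rows).
X = rng.normal(size=(64, 20))
true_w = np.zeros(20)
true_w[:3] = [1.0, -0.5, 0.8]          # only three concepts truly matter
y = X @ true_w + 0.01 * rng.normal(size=64)

# Above lam_max the all-zero solution is optimal, so no concept survives.
lam_max = np.max(np.abs(X.T @ y)) / X.shape[0]
lams = [1e-3, 1e-1, 1.01 * lam_max]
counts = [int(np.count_nonzero(lasso_ista(X, y, lam))) for lam in lams]
w_small = lasso_ista(X, y, 1e-3)
print(dict(zip(lams, counts)))
```

A small λ keeps many concepts, including spurious ones, while a large λ discards everything. The useful range lies in between and has to be chosen empirically, which is why the ledger counts it as a free parameter.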

pith-pipeline@v0.9.0 · 5727 in / 1280 out tokens · 29889 ms · 2026-05-23T03:35:48.727409+00:00 · methodology


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Sparse Concept Anchoring for Interpretable and Controllable Neural Representations

    cs.LG 2025-12 unverdicted novelty 6.0

    Sparse Concept Anchoring biases neural latent spaces toward targeted concepts using under 0.1% labels per concept, enabling reversible steering via projection and permanent removal via weight ablation with minimal sid...

Reference graph

Works this paper leans on

12 extracted references · 12 canonical work pages · cited by 1 Pith paper · 2 internal anchors

  1. [1]

    Nltk: the natural language toolkit

    Bird, S. Nltk: the natural language toolkit. In Proceedings of the COLING/ACL 2006 Interactive Presentation Sessions, pp. 69–72,

  2. [2]

    The Faiss library

    Douze, M., Guzhva, A., Deng, C., Johnson, J., Szilvasy, G., Mazaré, P.-E., Lomeli, M., Hosseini, L., and Jégou, H. The faiss library. arXiv preprint arXiv:2401.08281,

  3. [3]

    Clipscore: A reference-free evaluation metric for image captioning

    Hessel, J., Holtzman, A., Forbes, M., Le Bras, R., and Choi, Y. Clipscore: A reference-free evaluation metric for image captioning. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, pp. 7514–7528,

  4. [4]

    Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery

    Rao, S., Mahajan, S., Böhle, M., and Schiele, B. Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery. In Proceedings of the European Conference on Computer Vision, 2024a. Rao, S., Mahajan, S., Böhle, M., and Schiele, B. Discover-then-name: Task-agnostic concept bottlenecks via automated concept discovery. arXiv prepr...

  5. [5]

    UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild

    Soomro, K. Ucf101: A dataset of 101 human actions classes from videos in the wild. arXiv preprint arXiv:1212.0402,

  6. [6]

    Vlg-cbm: Training concept bottleneck models with vision-language guidance

    Srivastava, D., Yan, G., and Weng, T.-W. Vlg-cbm: Training concept bottleneck models with vision-language guidance. arXiv preprint arXiv:2408.01432,

  7. [7]

    Explain via any concept: Concept bottleneck model with open vocabulary concepts

    Tan, A., Zhou, F., and Chen, H. Explain via any concept: Concept bottleneck model with open vocabulary concepts. arXiv preprint arXiv:2408.02265,

  8. [8]

    skscope: Fast sparsity-constrained optimization in python

    Wang, Z., Zhu, J., Chen, P., Peng, H., Zhang, X., Wang, A., Zheng, Y., Zhu, J., and Wang, X. skscope: Fast sparsity-constrained optimization in python. arXiv preprint arXiv:2403.18540,

  9. [9]

    Sun database: Large-scale scene recognition from abbey to zoo

    Xiao, J., Hays, J., Ehinger, K. A., Oliva, A., and Torralba, A. Sun database: Large-scale scene recognition from abbey to zoo. In 2010 IEEE computer society conference on computer vision and pattern recognition, pp. 3485–3492. IEEE,

  10. [10]

    a photo of [class name]

    Table 7: CLIP-Score on 12 classification datasets. We compute the averaged CLIP-Scores between images and concepts with top-10 absolute coefficients.

    Method          Air     Bird    Cal     Car     DTD     Euro    Flo     Food    IN      Pet     SUN     UCF     Avg.
    Label-free CBM  0.6824  0.7818  0.7023  0.7106  0.6552  0.6179  0.6988  0.6959  0.7202  0.7119  0.7327  0.6688  0.6982
    ...

  11. [11]

    3 with quantitative and qualitative evaluations

    To confirm this, we conduct a deeper analysis of the effects of Z-CBMs on the modality gap. [Figure 6: PCA feature visualization of Z-CBMs; legend: Concept Features, Image Features, Label Features, Reconstructed Features. A further plot has Lasso λ on its x-axis, with ticks from 1e-3 to 1e-8.] ...

  12. [12]

    Reproducibility Statement

    if the data source can be biased. Reproducibility Statement. As described in Sec. 4 and 5, the implementation of the proposed method uses a publicly available code base. For example, the VLM backbones are publicly available in the OpenAI CLIP and Open CLIP GitHub repositories. All datasets are also available on the web; see the references in Sec. 5....