pith. machine review for the scientific record. sign in

arxiv: 2604.26409 · v1 · submitted 2026-04-29 · 💻 cs.CV

Recognition: unknown

Sparsity as a Key: Unlocking New Insights from Latent Structures for Out-of-Distribution Detection

Ahyoung Oh, Songkuk Kim, Wonseok Shin

Pith reviewed 2026-05-07 11:21 UTC · model grok-4.3

classification 💻 cs.CV
keywords out-of-distribution detectionsparse autoencodersvision transformersclass activation profileslatent space analysisfeature disentanglementfalse positive rate
0
0 comments X

The pith

Sparse autoencoders on ViT class tokens produce stable class-specific activation patterns that out-of-distribution samples break.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The work applies a Top-k sparse autoencoder to the class token outputs of a vision transformer to disentangle dense features into sparse components. In-distribution samples activate consistent, class-specific subsets of these components, which the authors formalize as Class Activation Profiles. Out-of-distribution samples deviate from the expected activation pattern within those profiles. The authors quantify the deviation with a scoring function based on divergence from core energy profiles. This yields strong results on the false-positive-rate-at-95-percent-recall metric while remaining competitive on area-under-the-ROC-curve across standard OOD benchmarks.

Core claim

The paper establishes that in-distribution samples preserve stable, class-specific activation patterns inside the latent space produced by a Top-k sparse autoencoder on ViT class tokens, whereas out-of-distribution samples systematically disrupt those patterns; the disruption is measured by a divergence score on core energy profiles and used directly for detection.

What carries the argument

Class Activation Profiles (CAPs) extracted from the sparse latent activations of a Top-k SAE trained on ViT [CLS] tokens; these profiles encode the stable subset of features activated by each in-distribution class.

If this is right

  • OOD detection no longer requires operating on entangled dense features and can instead use explicit divergence from learned class activation profiles.
  • The same sparse representation supplies an interpretable signal for why a given image is flagged as out-of-distribution.
  • Performance on the safety-critical FPR95 metric improves without sacrificing AUROC on standard benchmarks.
  • The approach extends the use of sparse autoencoders from language models to vision transformers for a concrete downstream task.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the CAP stability holds across different ViT architectures, the method could be applied to any model that produces a class token without retraining the SAE from scratch.
  • The same divergence measure might be tested as an uncertainty signal inside semi-supervised or active-learning loops.
  • Examining whether the disrupted features correspond to human-interpretable concepts could link the method to mechanistic interpretability work in vision.

Load-bearing premise

The Top-k SAE trained on ViT class tokens must create a latent space in which class-specific activation patterns remain consistent for in-distribution data and become measurably disrupted for out-of-distribution data rather than reflecting artifacts of the autoencoder itself.

What would settle it

If out-of-distribution samples produce activation patterns inside the same CAPs that match those of in-distribution samples when the SAE is retrained with different k or on a shifted set of images, the detection score would lose its separating power.

Figures

Figures reproduced from arXiv: 2604.26409 by Ahyoung Oh, Songkuk Kim, Wonseok Shin.

Figure 1
Figure 1. Figure 1: Overview of our OOD detection framework. In the (a) Setup Phase, a Top-k SAE is trained on [CLS] tokens extracted from a fixed, pre-trained ViT using ID data. This process learns disentangled latent features, which are aggregated to form CAPs. In the (b) Inference Phase, the latent activation distribution of a test sample is compared against the predicted class’s CAP using our Energy Profile Divergence (EP… view at source ↗
Figure 2
Figure 2. Figure 2: Pairwise Jaccard similarity of core feature sets across all 1,000 ImageNet classes. Each cell represents the overlap be￾tween the core feature of a class pair. The off-diagonal region indicates near-zero similarity between the core feature sets of dif￾ferent classes. Lighter colors indicate low similarity, while darker colors indicate high similarity. by their mean activation. If these core features are tr… view at source ↗
Figure 3
Figure 3. Figure 3: Activation affinity of OOD samples to specific ID core features. The plots compare activations of ID samples (Blue) and misclassified OOD samples (Red) on specific core feature sets. Top: OOD samples from iNaturalist predicted as ‘Class 738 (plantpot)’ show high activation on the core features of Class 738. Bottom: The same OOD samples show negligible activation on the core features of an unrelated class (… view at source ↗
Figure 4
Figure 4. Figure 4: UMAP visualization clustering the mean sparse activation vectors of all 1,000 ID classes. Each point represents the centroid of a class in the SAE latent space. The formation of tight, well-separated clusters confirms that the sparse features robustly capture the distinct semantic identity of each class. OOD sample is routed to a specific ID class because its underlying features align better with that clas… view at source ↗
Figure 5
Figure 5. Figure 5: Global statistics of activation intensity. We aggregate mean activations on core indices across all classes for two OOD datasets: iNaturalist (Left) and OpenImage-O (Right). ID Ground Truth (Blue, Left): Strong activation on ground-truth core fea￾tures. OOD Matched (Red, Middle): OOD samples activate the core features of their predicted class, but with lower intensity than ID. OOD Other Classes (Red, Right… view at source ↗
Figure 6
Figure 6. Figure 6: Structural differences in ID and OOD activation. We compare the mean activation profiles of ID samples (Blue) and misclassified OOD samples (Red) sorted by the ID CAP. Top: iNaturalist vs. ID (Class 986). Bottom: OpenImage-O vs. ID (Class 309). In both cases, ID samples maintain a sharp, concen￾trated head, whereas OOD samples exhibit a flattened, diffused profile. Shaded regions indicate variance. versus … view at source ↗
Figure 7
Figure 7. Figure 7: Core feature activation by different OOD datasets. This figure presents two pairs of graphs for each dataset. Each pair shows an OOD sample (Red) compared against ID samples (Blue) using two class feature sets. Left panel (in each): The OOD sample consis￾tently exhibits high activation on the core features of its predicted ID class. Right panel (in each): The same OOD sample demonstrates negligible activat… view at source ↗
Figure 8
Figure 8. Figure 8: Global statistics of core feature activation intensity. We aggregate mean activation values on core indices across all classes for OOD datasets. ID Ground Truth (Blue, Left): Demonstrates strong activation on the core features corresponding to the ground-truth class. OOD Matched Class (Red, Middle): OOD samples activate the core features of their predicted ID class, though with consistently lower intensity… view at source ↗
Figure 9
Figure 9. Figure 9: Consistency of structural activation profiles across diverse OOD datasets. This figure provides additional examples com￾paring the mean activation on CAP of ID samples (Blue) and OOD samples (Red) misclassified as the same class across four benchmark datasets: iNaturalist, NINCO, OpenImage-O, and Textures. The latent indices are sorted according to the mean ID activation profile of the respective class. In… view at source ↗
Figure 10
Figure 10. Figure 10: Hyperparameter sensitivity analysis. FPR95 (blue, lower is better) and AUROC (red, higher is better) across combinations of latent dimension (L) and sparsity level (k). Darker colors indicate better performance. The selected configuration (L = 7680, k = 128) minimizes overall FPR95 while maintaining strong AUROC, representing the optimal trade-off between robustness and separability view at source ↗
Figure 11
Figure 11. Figure 11: Sensitivity analysis of the activation head ratio (p). Performance metrics, FPR95 (Left) and AUROC (Right), are evaluated across various OOD benchmarks as a function of the activation head ratio (p). The ratio determines the size of the sorted latent feature set used for divergence calculation. The results demonstrate the stability of our proposed method EPD across the empirically derived meaningful range… view at source ↗
Figure 12
Figure 12. Figure 12: CAP cosine similarity analysis. (A) Distribution of pairwise cosine similarity between all 1,000 ID class CAPs. The distri￾bution peaks near zero, confirming that most ID classes form mutually orthogonal subspaces in the sparse latent space. (B) Examples of class pairs with high and low CAP similarity. High-similarity pairs (red) correspond to semantically related concepts (e.g., Eskimo dog vs. Siberian h… view at source ↗
read the original abstract

Sparse Autoencoders (SAEs) have demonstrated significant success in interpreting Large Language Models (LLMs) by decomposing dense representations into sparse, semantic components. However, their potential for analyzing Vision Transformers (ViTs) remains largely under-explored. In this work, we present the first application of SAEs to the ViT [CLS] token for out-of-distribution (OOD) detection, addressing the limitation of existing methods that rely on entangled feature representations. We propose a novel framework utilizing a Top-k SAE to disentangle the dense [CLS] features into a structured latent space. Through this analysis, we reveal that in-distribution (ID) data exhibits consistent, class-specific activation patterns, which we formalize as Class Activation Profiles (CAPs). Our study uncovers a key structural invariant: while ID samples preserve a stable pattern within CAPs, OOD samples systematically disrupt this structure. Leveraging this insight, we introduce a scoring function based on the divergence of core energy profiles to quantify the deviation from ideal activation profiles. Our method achieves strong results on the FPR95 metric, critical for safety-sensitive applications across multiple benchmarks, while also achieving competitive AUROC. Overall, our findings demonstrate that the sparse, disentangled features revealed by SAEs can serve as a powerful, interpretable tool for robust OOD detection in vision models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper applies Top-k Sparse Autoencoders (SAEs) to the [CLS] token of Vision Transformers for out-of-distribution (OOD) detection. It identifies consistent class-specific activation patterns (Class Activation Profiles or CAPs) in in-distribution (ID) data, shows that OOD samples disrupt these patterns, and introduces a divergence-based scoring function on core energy profiles to quantify deviation from ideal profiles, claiming strong FPR95 and competitive AUROC performance.

Significance. If the empirical claims hold and the CAP stability proves robust beyond SAE training artifacts, the work could provide an interpretable, sparsity-driven approach to OOD detection that disentangles features more effectively than standard methods. The application of SAEs to ViT [CLS] tokens is a novel direction with potential for safety-critical vision tasks, but the current lack of shown results limits assessment of its actual contribution.

major comments (3)
  1. [Abstract] Abstract: the abstract asserts the existence of stable CAPs and a divergence-based scorer but supplies no quantitative validation, ablation on k, baseline comparisons, or error analysis; the central claim therefore rests on an unshown empirical result.
  2. [Method] Method (scoring function definition): the scoring function is defined as divergence from 'ideal activation profiles' that are presumably estimated from the same ID data used to train or evaluate the detector; this creates a dependence between the reference profiles and the test distribution that the abstract does not resolve.
  3. [Experiments] Experiments: the central claim requires demonstration that the proposed divergence-of-core-energy-profiles score captures structure beyond what a simple reconstruction-error or activation-norm baseline already detects, and that the observed CAP stability is robust to the choice of k and to alternative sparse coding methods.
minor comments (2)
  1. [Abstract] Abstract: the term 'core energy profiles' is introduced without a prior definition or reference to its computation, which reduces clarity for readers unfamiliar with the framework.
  2. [Abstract] Abstract: the claim of 'strong results on the FPR95 metric' is stated without any numerical values or comparison tables, making it difficult to gauge the magnitude of improvement.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive feedback on our application of Top-k SAEs to ViT [CLS] tokens for OOD detection. We address each major comment below with clarifications from the manuscript and indicate planned revisions.

read point-by-point responses
  1. Referee: [Abstract] Abstract: the abstract asserts the existence of stable CAPs and a divergence-based scorer but supplies no quantitative validation, ablation on k, baseline comparisons, or error analysis; the central claim therefore rests on an unshown empirical result.

    Authors: The abstract summarizes the core contributions at a high level, as is conventional. Quantitative validation appears in Section 4, including FPR95 and AUROC results across benchmarks, k ablations in Figure 3 showing CAP stability, baseline comparisons in Table 2, and error analysis via CAP visualizations in Figure 4. We will revise the abstract to include key performance numbers for better alignment with the empirical sections. revision: partial

  2. Referee: [Method] Method (scoring function definition): the scoring function is defined as divergence from 'ideal activation profiles' that are presumably estimated from the same ID data used to train or evaluate the detector; this creates a dependence between the reference profiles and the test distribution that the abstract does not resolve.

    Authors: The ideal (core energy) profiles are estimated exclusively from the ID training set to capture class-specific invariants, which is standard for establishing a reference distribution in OOD detection (e.g., analogous to training-set statistics in Mahalanobis or energy-based methods). The divergence is then computed on any test sample at inference time. This design is intentional; we will add explicit clarification in the Method section and abstract to distinguish training-time profile estimation from test-time scoring. revision: yes

  3. Referee: [Experiments] Experiments: the central claim requires demonstration that the proposed divergence-of-core-energy-profiles score captures structure beyond what a simple reconstruction-error or activation-norm baseline already detects, and that the observed CAP stability is robust to the choice of k and to alternative sparse coding methods.

    Authors: Section 4 already reports that the divergence score outperforms reconstruction-error and activation-norm baselines on FPR95 and AUROC. Figure 3 provides k ablations confirming CAP stability across sparsity levels. We focus on Top-k SAEs for their feature-disentangling properties on [CLS] tokens; while we agree that broader comparisons to other sparse coding approaches would strengthen the work, the current results support the claims. We will expand the discussion and add limited comparisons in the revision. revision: partial

Circularity Check

1 steps flagged

Scoring function defined via divergence from ID-estimated 'ideal' CAPs creates built-in dependence on training distribution

specific steps
  1. fitted input called prediction [Abstract (scoring function paragraph)]
    "we introduce a scoring function based on the divergence of core energy profiles to quantify the deviation from ideal activation profiles"

    The 'ideal activation profiles' (CAPs) are computed from ID samples on which the SAE was trained; the divergence score for any input is therefore guaranteed to be larger when the input statistics differ from the ID training distribution. No independent validation is shown that the score captures structure beyond the SAE's ID-optimized sparsity constraint.

full rationale

The central OOD score is constructed as a divergence from reference profiles that are themselves derived from the same ID data used to train the Top-k SAE and to define the 'stable pattern'. This matches the fitted-input-called-prediction pattern: the reference is fit on ID statistics, then OOD deviation is reported as a discovery. The abstract and method description do not demonstrate that the observed disruption exceeds what ordinary reconstruction error or activation-norm baselines already produce after ID-only training.

Axiom & Free-Parameter Ledger

1 free parameters · 1 axioms · 2 invented entities

The central claim depends on the SAE producing semantically meaningful sparse features in ViT representations and on the empirical observation that ID activation patterns are stable enough to serve as a reference.

free parameters (1)
  • k (sparsity level)
    The Top-k SAE requires choosing k; the abstract gives no indication whether k is fixed across experiments or tuned on ID data.
axioms (1)
  • domain assumption Sparse autoencoders trained on dense representations yield interpretable, disentangled features
    Transferred from LLM literature and assumed to hold for ViT [CLS] tokens without additional justification in the abstract.
invented entities (2)
  • Class Activation Profiles (CAPs) no independent evidence
    purpose: Formalization of class-specific sparse activation patterns observed in the SAE latent space
    New construct introduced to capture the claimed structural invariant; no independent evidence supplied in the abstract.
  • core energy profiles no independent evidence
    purpose: Reference pattern used to compute divergence for OOD scoring
    Component of the detection function; appears to be derived from ID data.

pith-pipeline@v0.9.0 · 5547 in / 1456 out tokens · 52389 ms · 2026-05-07T11:21:59.798012+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

42 extracted references · 15 canonical work pages · 7 internal anchors

  1. [1]

    Principal component analysis.Wiley interdisciplinary reviews: computational statistics, 2(4):433–459, 2010

    Herv ´e Abdi and Lynne J Williams. Principal component analysis.Wiley interdisciplinary reviews: computational statistics, 2(4):433–459, 2010. 3

  2. [2]

    Variational autoencoder based anomaly detection using reconstruction probability

    Jinwon An and Sungzoon Cho. Variational autoencoder based anomaly detection using reconstruction probability. Special lecture on IE, 2(1):1–18, 2015. 3

  3. [3]

    Autoencoders, unsupervised learning, and deep architectures

    Pierre Baldi. Autoencoders, unsupervised learning, and deep architectures. InProceedings of ICML workshop on unsuper- vised and transfer learning, pages 37–49. JMLR Workshop and Conference Proceedings, 2012. 3

  4. [4]

    In or out? fixing imagenet out-of-distribution detection eval- uation

    Julian Bitterwolf, Maximilian Mueller, and Matthias Hein. In or out? fixing imagenet out-of-distribution detection eval- uation. InICML, 2023. 6

  5. [5]

    Towards monosemanticity: Decomposing language models with dic- tionary learning.Transformer Circuits Thread, 2023

    Trenton Bricken, Adly Templeton, Joshua Batson, Brian Chen, Adam Jermyn, Tom Conerly, Nicholas L Turner, Cem Anil, Carson Denison, Amanda Askell, et al. Towards monosemanticity: Decomposing language models with dic- tionary learning.Transformer Circuits Thread, 2023. 3

  6. [6]

    Transformer inter- pretability beyond attention visualization

    Hila Chefer, Shir Gur, and Lior Wolf. Transformer inter- pretability beyond attention visualization. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 782–791, 2021. 1, 3

  7. [7]

    Describing textures in the wild

    Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. Describing textures in the wild. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 3606–3613, 2014. 5, 6

  8. [8]

    Sparse Autoencoders Find Highly Interpretable Features in Language Models

    Hoagy Cunningham, Aidan Ewart, Logan Riggs, Robert Huben, and Lee Sharkey. Sparse autoencoders find highly interpretable features in language models.arXiv preprint arXiv:2309.08600, 2023. 1, 3, 7

  9. [9]

    Imagenet: A large-scale hierarchical image database

    Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. Imagenet: A large-scale hierarchical image database. In2009 IEEE conference on computer vision and pattern recognition, pages 248–255. Ieee, 2009. 3, 6

  10. [10]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Syl- vain Gelly, et al. An image is worth 16x16 words: Trans- formers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020. 1, 3

  11. [11]

    Toy Models of Superposition

    Nelson Elhage, Tristan Hume, Catherine Olsson, Nicholas Schiefer, Tom Henighan, Shauna Kravec, Zac Hatfield- Dodds, Robert Lasenby, Dawn Drain, Carol Chen, et al. Toy models of superposition.arXiv preprint arXiv:2209.10652,

  12. [12]

    Ex- ploring the limits of out-of-distribution detection.Advances in neural information processing systems, 34:7068–7081,

    Stanislav Fort, Jie Ren, and Balaji Lakshminarayanan. Ex- ploring the limits of out-of-distribution detection.Advances in neural information processing systems, 34:7068–7081,

  13. [13]

    Scaling and evaluating sparse autoencoders

    Leo Gao, Tom Dupr ´e la Tour, Henk Tillman, Gabriel Goh, Rajan Troll, Alec Radford, Ilya Sutskever, Jan Leike, and Jeffrey Wu. Scaling and evaluating sparse autoencoders. arXiv preprint arXiv:2406.04093, 2024. 2, 3, 7

  14. [14]

    Deep anomaly detection us- ing geometric transformations.Advances in neural informa- tion processing systems, 31, 2018

    Izhak Golan and Ran El-Yaniv. Deep anomaly detection us- ing geometric transformations.Advances in neural informa- tion processing systems, 31, 2018. 3

  15. [15]

    A Baseline for Detecting Misclassified and Out-of-Distribution Examples in Neural Networks

    Dan Hendrycks and Kevin Gimpel. A baseline for detect- ing misclassified and out-of-distribution examples in neural networks.arXiv preprint arXiv:1610.02136, 2016. 1, 2

  16. [16]

    Deep anomaly detection with outlier exposure.arXiv preprint arXiv:1812.04606, 2018

    Dan Hendrycks, Mantas Mazeika, and Thomas Dietterich. Deep anomaly detection with outlier exposure.arXiv preprint arXiv:1812.04606, 2018. 2

  17. [17]

    Unsolved Problems in ML Safety

    Dan Hendrycks, Nicholas Carlini, John Schulman, and Jacob Steinhardt. Unsolved problems in ml safety.arXiv preprint arXiv:2109.13916, 2021. 1, 7

  18. [18]

    On the impor- tance of gradients for detecting distributional shifts in the wild.Advances in Neural Information Processing Systems, 34:677–689, 2021

    Rui Huang, Andrew Geng, and Yixuan Li. On the impor- tance of gradients for detecting distributional shifts in the wild.Advances in Neural Information Processing Systems, 34:677–689, 2021. 2

  19. [19]

    Independent component analysis: recent advances.Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 371 (1984):20110534, 2013

    Aapo Hyv ¨arinen. Independent component analysis: recent advances.Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 371 (1984):20110534, 2013. 3

  20. [20]

    On information and sufficiency.The Annals of Mathematical Statistics, 22 (1):79–86, 1951

    Solomon Kullback and Richard A Leibler. On information and sufficiency.The Annals of Mathematical Statistics, 22 (1):79–86, 1951. 6

  21. [21]

    Alina Kuznetsova, Hassan Rom, Neil Alldrin, Jasper Ui- jlings, Ivan Krasin, Jordi Pont-Tuset, Shahab Kamali, Stefan Popov, Matteo Malloci, Alexander Kolesnikov, et al. The open images dataset v4: Unified image classification, object detection, and visual relationship detection at scale.Interna- tional journal of computer vision, 128(7):1956–1981, 2020. 5, 6

  22. [22]

    A simple unified framework for detecting out-of-distribution samples and adversarial attacks.Advances in neural infor- mation processing systems, 31, 2018

    Kimin Lee, Kibok Lee, Honglak Lee, and Jinwoo Shin. A simple unified framework for detecting out-of-distribution samples and adversarial attacks.Advances in neural infor- mation processing systems, 31, 2018. 1, 2

  23. [23]

    Enhancing the reliability of out-of-distribution image detection in neural networks.arXiv preprint arXiv:1706.02690,

    Shiyu Liang, Yixuan Li, and Rayadurgam Srikant. Enhanc- ing the reliability of out-of-distribution image detection in neural networks.arXiv preprint arXiv:1706.02690, 2017. 1, 2

  24. [24]

    Energy-based out-of-distribution detection.Advances in neural information processing systems, 33:21464–21475,

    Weitang Liu, Xiaoyun Wang, John Owens, and Yixuan Li. Energy-based out-of-distribution detection.Advances in neural information processing systems, 33:21464–21475,

  25. [25]

    Swin transformer: Hierarchical vision transformer using shifted windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021. 6

  26. [26]

    k-sparse autoencoders

    Alireza Makhzani and Brendan Frey. K-sparse autoencoders. arXiv preprint arXiv:1312.5663, 2013. 2, 3, 7

  27. [27]

    UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction

    Leland McInnes, John Healy, and James Melville. Umap: Uniform manifold approximation and projection for dimen- sion reduction.arXiv preprint arXiv:1802.03426, 2018. 4

  28. [28]

    Mahalanobis++: Improving ood detection via feature normalization.arXiv preprint arXiv:2505.18032, 2025

    Maximilian Mueller and Matthias Hein. Mahalanobis++: Improving ood detection via feature normalization.arXiv preprint arXiv:2505.18032, 2025. 8

  29. [29]

    Sparse autoencoder

    Andrew Ng. Sparse autoencoder. CS294A Lecture notes, Stanford University, 2011. 3, 7

  30. [30]

    Sparse coding with an overcomplete basis set: A strategy employed by v1?Vision research, 37(23):3311–3325, 1997

    Bruno A Olshausen and David J Field. Sparse coding with an overcomplete basis set: A strategy employed by v1?Vision research, 37(23):3311–3325, 1997. 1, 3 9

  31. [31]

    DINOv2: Learning Robust Visual Features without Supervision

    Maxime Oquab, Timoth ´ee Darcet, Th ´eo Moutakanni, Huy V o, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision. arXiv preprint arXiv:2304.07193, 2023. 6

  32. [32]

    Do vision trans- formers see like convolutional neural networks?Advances in neural information processing systems, 34:12116–12128,

    Maithra Raghu, Thomas Unterthiner, Simon Kornblith, Chiyuan Zhang, and Alexey Dosovitskiy. Do vision trans- formers see like convolutional neural networks?Advances in neural information processing systems, 34:12116–12128,

  33. [33]

    React: Out-of- distribution detection with rectified activations.Advances in neural information processing systems, 34:144–157, 2021

    Yiyou Sun, Chuan Guo, and Yixuan Li. React: Out-of- distribution detection with rectified activations.Advances in neural information processing systems, 34:144–157, 2021. 3

  34. [34]

    Out- of-distribution detection with deep nearest neighbors

    Yiyou Sun, Yifei Ming, Xiaojin Zhu, and Yixuan Li. Out- of-distribution detection with deep nearest neighbors. InIn- ternational conference on machine learning, pages 20827– 20840. PMLR, 2022. 1

  35. [35]

    The inaturalist species classification and de- tection dataset

    Grant Van Horn, Oisin Mac Aodha, Yang Song, Yin Cui, Chen Sun, Alex Shepard, Hartwig Adam, Pietro Perona, and Serge Belongie. The inaturalist species classification and de- tection dataset. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 8769–8778,

  36. [36]

    Open-set recognition: A good closed-set classifier is all you need

    Sagar Vaze, Kai Han, Andrea Vedaldi, and Andrew Zisser- man. Open-set recognition: A good closed-set classifier is all you need. InICLR, 2022. 6

  37. [37]

    Vim: Out-of-distribution with virtual-logit matching

    Haoqi Wang, Zhizhong Li, Litong Feng, and Wayne Zhang. Vim: Out-of-distribution with virtual-logit matching. InPro- ceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 4921–4930, 2022. 3

  38. [38]

    Generalized out-of-distribution detection: A survey.Inter- national Journal of Computer Vision, 132(12):5635–5662,

    Jingkang Yang, Kaiyang Zhou, Yixuan Li, and Ziwei Liu. Generalized out-of-distribution detection: A survey.Inter- national Journal of Computer Vision, 132(12):5635–5662,

  39. [39]

    Openood v1

    Jingyang Zhang, Jingkang Yang, Pengyun Wang, Haoqi Wang, Yueqian Lin, Haoran Zhang, Yiyou Sun, Xuefeng Du, Yixuan Li, Ziwei Liu, et al. Openood v1. 5: Enhanced benchmark for out-of-distribution detection.arXiv preprint arXiv:2306.09301, 2023. 6 10 Sparsity as a Key: Unlocking New Insights from Latent Structures for Out-of-Distribution Detection Supplement...

  40. [40]

    120. 130. 140. 150. 160. 170. 180. 190. 20 72.2072.2172.2172.2172.2172.2072.2072.1872.17 85.7085.7085.7285.7485.7685.7785.7785.7785.77 95.1395.1495.1595.1795.1695.1895.1795.1795.17 91.0991.0891.0691.0691.0591.0591.0491.0491.03 92.1592.1392.1392.1292.1192.1192.1092.0892.06 70.00 75.00 80.00 85.00 90.00 95.00

  41. [41]

    120. 130. 140. 150. 160. 170. 180. 190. 20 ActivationHeadRatio ActivationHeadRatio FPR95 AUROC SSH-Hard NINCO iNaturalistTe x t u r e sOpenImage-O Figure 11.Sensitivity analysis of the activation head ratio (p).Performance metrics, FPR95 (Left) and AUROC (Right), are evaluated across various OOD benchmarks as a function of the activation head ratio (p). T...

  42. [42]

    The results demonstrate that EPD maintains its effectiveness despite the significant shift in the training domain

    The OOD datasets utilized for this analysis were selected from the suggested list for CIFAR-100 in the OpenOOD v 1.5 framework. The results demonstrate that EPD maintains its effectiveness despite the significant shift in the training domain. While some methods, such as KNN and GEN, showing a notable surge in performance on the CIFAR-100 domain compared t...