Low-Frequency Shortcuts in Texture-Driven Visual Learning

Cathy Hou; David Alvarez-Melis; Stratos Idreos; Utku Şirin

arxiv: 2606.03493 · v1 · pith:P56VHMEMnew · submitted 2026-06-02 · 💻 cs.CV · cs.LG

Pith reviewed 2026-06-28 10:28 UTC · model grok-4.3

keywords: shortcut learning · texture-driven domains · low-frequency components · spectral analysis · visual classification · out-of-distribution robustness

The pith

Texture-driven visual models base most decisions on a few low-frequency components even though classification information lies in higher-frequency details.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper analyzes shortcut learning in neural networks but shifts focus from standard shape-driven benchmarks to texture-driven domains. It establishes that these models exhibit low-frequency shortcuts, relying on skewed spectral behavior from a small set of low-frequency components. Removing those components from both training and test data produces more balanced frequency use and raises in-distribution accuracy by as much as 8 percent. The same shortcuts cause large drops in accuracy under out-of-distribution corruptions, while their removal improves robustness to low-frequency corruptions at the expense of performance on high-frequency ones.

Core claim

Texture-driven domains suffer from low-frequency shortcuts. Models make the majority of their decisions based on a few low-frequency components with skewed spectral behavior, despite classification information residing in higher-frequency fine-grained details. Pruning the low-frequency components from training and test sets eliminates the shortcut, yields balanced spectral behavior, and improves in-distribution accuracy by up to 8 percent. The shortcuts also render models vulnerable to out-of-distribution corruptions, with accuracy drops reaching 70 percent, while pruning improves robustness to low-frequency corruptions by up to 40 percent and creates a trade-off on high-frequency corruptions.

What carries the argument

Low-frequency components (LFCs) identified by spectral analysis of model decisions; pruning them from images forces a shift from skewed to balanced spectral reliance.

If this is right

Pruning LFCs raises in-distribution accuracy by up to 8 percent.
Low-frequency shortcuts cause accuracy drops of up to 70 percent under out-of-distribution corruptions.
Pruning LFCs improves robustness to low-frequency corruptions by up to 40 percent.
The resulting balanced spectral behavior produces opposing effects on generalization to low-frequency versus high-frequency corruptions.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

Frequency-aware data filtering may be worth testing on other texture-heavy tasks such as material or medical-image classification.
Training procedures could incorporate explicit penalties against over-reliance on any single frequency band to reduce shortcut formation.
The observed low-versus-high frequency trade-off suggests that robustness benchmarks should separately report performance across spectral regimes rather than aggregate scores alone.
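
The last suggestion above, reporting robustness per spectral regime instead of as one aggregate, is easy to make concrete. The sketch below is illustrative only: the function name `accuracy_by_regime` is hypothetical, the labelling of fog as a low-frequency corruption is an assumption (the page only labels Gaussian blur as a high-frequency corruption), and the accuracy values are placeholders, not results from the paper.

```python
from collections import defaultdict
from statistics import mean


def accuracy_by_regime(results):
    """Average accuracy per spectral regime, plus the usual aggregate.

    `results` is an iterable of (corruption_name, regime, accuracy) tuples.
    """
    results = list(results)  # allow generators; we iterate twice
    grouped = defaultdict(list)
    for _name, regime, accuracy in results:
        grouped[regime].append(accuracy)
    report = {regime: mean(values) for regime, values in grouped.items()}
    # The single number that a regime-blind benchmark would report.
    report["aggregate"] = mean(acc for _name, _regime, acc in results)
    return report


# Placeholder values for illustration only; NOT results from the paper.
example = [
    ("fog", "low-frequency", 0.80),             # regime label is an assumption
    ("gaussian_blur", "high-frequency", 0.40),
]
for regime, accuracy in accuracy_by_regime(example).items():
    print(f"{regime}: {accuracy:.2f}")
```

With opposing effects in the two regimes, the aggregate alone can stay flat while both per-regime numbers move, which is the case the per-regime report is meant to expose.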

Load-bearing premise

The assumption that the classification signal truly resides in the higher-frequency components and that removing the identified low-frequency components does not discard task-relevant information or introduce new artifacts.

What would settle it

Observe whether accuracy gains from LFC pruning disappear when the same models are evaluated on versions of the data in which higher-frequency content has been deliberately degraded while low-frequency content remains intact.

Figures

Figures reproduced from arXiv: 2606.03493 by Cathy Hou, David Alvarez-Melis, Stratos Idreos, Utku Şirin.

**Figure 2.** Frequency analysis methodology. When pruning, we remove frequency components diagonally from the top-left to the bottom-right of the image (or vice versa), since oscillation rates and spatial complexity increase along this direction. Each such diagonal is referred to as a frequency component; the terms diagonal, frequency component, and component are used interchangeably. Pruning is used for sensitivity…
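
The diagonal pruning described in the Figure 2 caption can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: it assumes the frequency representation is a 2D discrete cosine transform (the reference list cites the DCT, but the caption does not name the transform), treats each anti-diagonal `u + v = d` of the coefficient matrix as one frequency component, and handles a single grayscale channel. The names `dct_matrix` and `prune_diagonals` are hypothetical.

```python
import numpy as np


def dct_matrix(n: int) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix (rows = frequencies)."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    basis = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    basis[0, :] = np.sqrt(1.0 / n)  # DC row gets its own scale
    return basis


def prune_diagonals(image: np.ndarray, num_pruned: int, from_low: bool = True) -> np.ndarray:
    """Zero `num_pruned` anti-diagonals of the 2D DCT and invert the transform.

    from_low=True removes the lowest-frequency diagonals (top-left corner);
    from_low=False removes the highest-frequency ones (bottom-right corner).
    """
    h, w = image.shape
    ch, cw = dct_matrix(h), dct_matrix(w)
    coeffs = ch @ image @ cw.T  # forward 2D DCT
    diag = np.add.outer(np.arange(h), np.arange(w))  # diagonal index u + v
    if from_low:
        keep = diag >= num_pruned
    else:
        keep = diag <= diag.max() - num_pruned
    return ch.T @ (coeffs * keep) @ cw  # inverse 2D DCT (basis is orthonormal)


rng = np.random.default_rng(0)
image = rng.random((32, 32))                              # stand-in for a grayscale image
without_lfc = prune_diagonals(image, 4)                   # drop the 4 lowest diagonals
without_hfc = prune_diagonals(image, 4, from_low=False)   # drop the 4 highest
```

An `h × w` image has `h + w - 1` such diagonals. A library transform such as `scipy.fft.dctn` with `norm="ortho"` should give the same coefficients; the explicit matrix form is used here only to keep the sketch dependent on NumPy alone. A color image would presumably be handled per channel, which is also an assumption.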

**Figure 3.** Sample images for the texture-driven domains we study. Ground Terrain Recognition. Ground terrain recognition supports applications such as autonomous driving [61; 89] and robot navigation [29; 96]. Terrains correspond to surface types (e.g., leaves, grass) and are classified using spatial cues that characterize surface material and texture (3rd image in [PITH_FULL_IMAGE:figures/full_fig_p004_3.png]

**Figure 4.** ID accuracy results for pruning LFCs (dark line), MFCs (light orange line), and HFCs (red [PITH_FULL_IMAGE:figures/full_fig_p005_4.png]

**Figure 5.** Accuracy contributions with unpruned (top) and pruned (bottom) training and test images. [PITH_FULL_IMAGE:figures/full_fig_p005_5.png]

**Figure 6.** Sample images from TextileNet (top) and CIFAR-10 (bottom). These results indicate that texture-driven domains suffer from low-frequency shortcuts. While texture-driven domains have their classification information primarily in higher frequencies, neural networks rely exponentially more on LFCs than they do on HFCs. Pruning LFCs mitigates the shortcut by shifting the spectral behavior towards higher frequen…

**Figure 7.** OOD results for fog (top) and Gaussian blur (bottom) corruptions for ResNet-50. [PITH_FULL_IMAGE:figures/full_fig_p006_7.png]

**Figure 8.** Spectral behavior of TextileNet. … significantly decreases its ID accuracy, by up to 10% at 10 LFCs. In [PITH_FULL_IMAGE:figures/full_fig_p006_8.png]

**Figure 9.** Low-frequency shortcuts persist across mixed-semantics tasks. 2. High-Frequency Corruption: Gaussian Blur. The left graph at bottom of [PITH_FULL_IMAGE:figures/full_fig_p007_9.png]

**Figure 10.** Pruning LFCs improves OOD accuracy. OOD Summary. Low-frequency shortcuts make models highly vulnerable to OOD corruptions, causing up to 70% accuracy drop compared to ID performance. Pruning LFCs significantly improves robustness to low-frequency corruptions, up to 40%, and introduces a trade-off for high-frequency corruptions; the improved spectral behavior provides a better generalization, whereas the …

**Figure 11.** Low-frequency shortcuts for a VFM (DinoV2) and VLM (CLIP) using GTOS dataset. [PITH_FULL_IMAGE:figures/full_fig_p008_11.png]

**Figure 12.** Test set ID accuracy results for pruning LFCs (dark line) and HFCs (red line). Test-set [PITH_FULL_IMAGE:figures/full_fig_p011_12.png]

**Figure 13.** Accuracy contributions based on test-set images. [PITH_FULL_IMAGE:figures/full_fig_p011_13.png]

**Figure 14.** OOD corruption results based on test-set images. [PITH_FULL_IMAGE:figures/full_fig_p012_14.png]

**Figure 15.** Spectral behavior for ResNet-50 across different seeds when using unpruned images. As [PITH_FULL_IMAGE:figures/full_fig_p013_15.png]

**Figure 16.** Spectral behavior for ResNet-50 across different seeds when using pruned images. As can [PITH_FULL_IMAGE:figures/full_fig_p013_16.png]

**Figure 17.** Sample images from CIFAR-10 (left) and texture-driven tasks (right). [PITH_FULL_IMAGE:figures/full_fig_p014_17.png]

**Figure 18.** ID Results for ResNet-50, MobileNet-V3, ViT-Small, and ViT-Tiny. [PITH_FULL_IMAGE:figures/full_fig_p015_18.png]

**Figure 19.** Spectral Behavior for ResNet-50, MobileNet-V3, ViT-Small, and ViT-Tiny when trained [PITH_FULL_IMAGE:figures/full_fig_p016_19.png]

**Figure 20.** Spectral Behavior for ResNet-50, MobileNet-V3, ViT-Small, and ViT-Tiny when trained [PITH_FULL_IMAGE:figures/full_fig_p016_20.png]

**Figure 21.** OOD performance under different severity levels. X-axis: number of pruned LFCs. Y-axis: [PITH_FULL_IMAGE:figures/full_fig_p017_21.png]

**Figure 22.** OOD Results for ResNet-50. [PITH_FULL_IMAGE:figures/full_fig_p018_22.png]

**Figure 23.** OOD Results for MobileNet-V3. H Impact of Model Size and Architecture on OOD Results [PITH_FULL_IMAGE:figures/full_fig_p018_23.png]

**Figure 24.** OOD Results for ViT-Small. (Plot text recovered from the page: x-axis "# Pruned LF Diagonals", y-axis "ID/OOD Accuracy", paired ID and OOD bars for SP-Col, TxNet, GTOS, and C10.)

**Figure 25.** OOD Results for ViT-Tiny. … the PCP pipeline largely recovers from the corruptions, closely approximating the ID accuracy. This is because applying corruption after pruning reduces the impact of corruption, as corruption uses some of the pruned components. As a result, the final OOD accuracy is higher for PCP than for CP. For elastic corruption, however, CP achieves higher accuracy than PCP. This is because…

**Figure 26.** OOD corruption pipelines. PCP: prune-corrupt-prune. CP: corrupt-prune. [PITH_FULL_IMAGE:figures/full_fig_p020_26.png]
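
Read literally, the two pipeline names in Figure 26 are function compositions. The sketch below spells that reading out; it is an interpretation of the acronyms, not the paper's implementation. The operators `remove_mean` and `brighten` are toy stand-ins chosen for this example: subtracting the mean removes only the lowest (DC) frequency component, and a uniform brightness shift is a corruption that lives entirely in that component.

```python
import numpy as np


def corrupt_prune(image, prune, corrupt):
    """CP: corrupt the image, then prune the result."""
    return prune(corrupt(image))


def prune_corrupt_prune(image, prune, corrupt):
    """PCP: prune first, corrupt the pruned image, then prune again."""
    return prune(corrupt(prune(image)))


# Toy stand-ins (assumptions for illustration, not the paper's operators).
def remove_mean(image):
    # Removes only the DC component, i.e. the single lowest frequency.
    return image - image.mean()


def brighten(image, offset=0.3):
    # A uniform shift changes nothing except the DC component.
    return image + offset


rng = np.random.default_rng(0)
image = rng.random((16, 16))
cp = corrupt_prune(image, remove_mean, brighten)
pcp = prune_corrupt_prune(image, remove_mean, brighten)
clean = remove_mean(image)
print(np.allclose(cp, clean), np.allclose(pcp, clean))
```

For this toy corruption both pipelines undo the shift completely, because the corruption falls wholly inside the pruned component. That is the simplest instance of the effect the Figure 25 text describes, where corruption "uses some of the pruned components". Corruptions that act nonlinearly on image content need not make the two pipelines agree.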

**Figure 27.** CIFAR-10’s OOD performance closely approximates its ID performance. It suffers an [PITH_FULL_IMAGE:figures/full_fig_p020_27.png]

**Figure 28.** Frequency characteristics of the corruptions we use, across the four tasks we analyze. [PITH_FULL_IMAGE:figures/full_fig_p021_28.png]

**Figure 29.** Impact of pruning HFCs on OOD performance (red line). We present pruning HFCs, along [PITH_FULL_IMAGE:figures/full_fig_p024_29.png]

Original abstract

Neural networks suffer from shortcut learning, where learned features generalize well to the training set but not to in-distribution (ID) or out-of-distribution (OOD) test sets. Existing studies are all based on a few standard benchmarks, which are shape-driven. Numerous application domains, however, are texture-driven. In this work, we present shortcut learning analysis for texture-driven domains, and compare it with that of a standard benchmark. We show that texture-driven domains suffer from low-frequency shortcuts. They make the majority of their decisions based on a few low-frequency components (LFCs) with a skewed spectral behavior, despite that their classification information is in higher-frequency, fine-grained details. Pruning LFCs from training and test sets eliminates the shortcut and provides a more balanced spectral behavior, improving the ID accuracy by up to 8%. We show that low-frequency shortcuts make the models highly vulnerable to OOD corruptions, leading up to 70% accuracy drop compared to the ID accuracy. Pruning LFCs significantly improves robustness to low-frequency corruptions, by up to 40%, and introduces a trade-off for high-frequency corruptions; the balanced spectral behavior provides a better generalization performance, whereas the increased dependence on high-frequency features reduces it. OOD accuracy depends on the interaction between these two factors.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

Texture domains show low-frequency shortcuts that pruning can mitigate for ID gains and some robustness, but the claim that higher frequencies alone carry the signal rests on thin controls.

The key takeaway is that this work identifies low-frequency shortcuts as a bigger issue in texture-driven tasks than in standard shape benchmarks, with LFC pruning yielding up to 8% ID accuracy lift and 40% better robustness to low-frequency corruptions.

The paper does a clean job extending shortcut analysis beyond the usual ImageNet-style datasets. It documents the skewed spectral reliance in texture cases, shows that pruning produces more balanced frequency use, and tracks the resulting trade-off where high-frequency corruptions become harder. Those quantitative comparisons are the useful part.

The soft spot is the assumption that higher-frequency components hold the real classification signal. The accuracy improvements after pruning are offered as proof, yet the abstract gives no details on whether the filtering preserves class identity, introduces artifacts, or simply reduces noise. Without something like human labeling on the pruned images or reconstruction checks, the gains could come from data alteration rather than shortcut removal. The OOD trade-off discussion is noted but would need tighter validation in the full text.

This is aimed at people studying robustness in texture-heavy applications such as medical imaging or material inspection. It fills a gap in the shortcut literature with measurable effects.

I would send it to peer review. The core observation is worth referee time even if the causal story around pruning needs more evidence.

Referee Report

2 major / 2 minor

Summary. The manuscript analyzes shortcut learning in texture-driven visual classification tasks, contrasting it with shape-driven benchmarks. It argues that texture-driven models rely on a few low-frequency components (LFCs) as shortcuts, even though discriminative information is in higher frequencies. By pruning LFCs from both training and test sets, the authors report improved in-distribution (ID) accuracy (up to 8%) and robustness to low-frequency corruptions (up to 40%), while noting a trade-off with high-frequency corruptions due to increased high-frequency dependence.

Significance. If the pruning genuinely isolates shortcuts without removing task-relevant signal, this would extend shortcut analysis beyond standard benchmarks to texture-driven domains common in applications, providing a concrete spectral intervention that improves both ID performance and low-frequency robustness. The empirical measurement of spectral bias and the reported OOD trade-off offer falsifiable predictions for follow-up work in domain-specific robustness.

major comments (2)

[Abstract] Abstract: the reported gains of up to 8% ID accuracy and 40% robustness are presented without dataset details, spectral analysis method, or controls verifying that pruned images preserve class identity (e.g., human labeling accuracy or reconstruction error); this directly affects whether the gains demonstrate shortcut elimination or result from data modification.
[Abstract] Abstract: the central claim that classification information resides in higher-frequency components (and that LFC pruning removes only the shortcut) rests on the accuracy improvements after pruning; without independent verification that higher frequencies alone suffice, the observed gains risk circularity with the pruning operation itself altering image statistics.
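
One of the controls named in the first major comment, reconstruction error between original and pruned images, is cheap to compute. The sketch below shows one way to report it; the function name `reconstruction_report` and the choice of metrics (MSE, PSNR, and the share of signal energy removed) are assumptions made for illustration, not metrics the paper is known to use.

```python
import numpy as np


def reconstruction_report(original, pruned, data_range=1.0):
    """Pixel-domain error between an image and its pruned version.

    `data_range` is the assumed span of valid pixel values (1.0 for [0, 1]).
    """
    original = np.asarray(original, dtype=float)
    pruned = np.asarray(pruned, dtype=float)
    residual = original - pruned
    mse = float(np.mean(residual ** 2))
    energy = float(np.sum(original ** 2))
    removed = float(np.sum(residual ** 2)) / energy if energy > 0 else 0.0
    psnr = float("inf") if mse == 0 else float(10.0 * np.log10(data_range ** 2 / mse))
    return {"mse": mse, "psnr_db": psnr, "relative_energy_removed": removed}


rng = np.random.default_rng(0)
image = rng.random((32, 32))
dc_pruned = image - image.mean()  # toy pruning: drop only the mean (DC) component
print(reconstruction_report(image, dc_pruned))
```

A small reconstruction error shows only that the pruned image stays close to the original in pixel terms. It does not by itself show that class identity survives, which is why the comment pairs it with human labeling as an alternative.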

minor comments (2)

[Abstract] Abstract: the statement that 'numerous application domains... are texture-driven' is not accompanied by concrete examples or citations to such domains.
[Abstract] Abstract: the comparison to 'a standard benchmark' does not specify which benchmark or how the texture-driven datasets were chosen.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript analyzing low-frequency shortcuts in texture-driven visual domains. We address each major comment point by point below.

Referee: [Abstract] Abstract: the reported gains of up to 8% ID accuracy and 40% robustness are presented without dataset details, spectral analysis method, or controls verifying that pruned images preserve class identity (e.g., human labeling accuracy or reconstruction error); this directly affects whether the gains demonstrate shortcut elimination or result from data modification.

Authors: We agree that the abstract's brevity omits key contextual details that would strengthen interpretability. The full manuscript specifies the texture-driven datasets analyzed, describes the Fourier-domain pruning procedure for isolating LFCs, and reports reconstruction-based metrics confirming that class identity is retained post-pruning. In the revised manuscript we will expand the abstract to include concise references to the datasets, the spectral pruning method, and the identity-preservation controls. revision: yes
Referee: [Abstract] Abstract: the central claim that classification information resides in higher-frequency components (and that LFC pruning removes only the shortcut) rests on the accuracy improvements after pruning; without independent verification that higher frequencies alone suffice, the observed gains risk circularity with the pruning operation itself altering image statistics.

Authors: The primary evidence for the claim is the post-pruning accuracy improvement together with the measured shift to balanced spectral usage. The manuscript additionally quantifies the original models' spectral bias toward LFCs and documents the resulting robustness trade-off with high-frequency corruptions, which supplies corroborating (non-circular) support for increased high-frequency reliance. We will add an explicit discussion paragraph in the revision to separate the pruning-based evidence from the supporting spectral-bias and trade-off analyses, thereby reducing any appearance of circularity. revision: partial

Circularity Check

0 steps flagged

No significant circularity; empirical measurement study

The paper conducts an empirical analysis of shortcut learning by training models on texture-driven domains, observing reliance on low-frequency components via spectral analysis, and measuring accuracy/robustness changes after pruning those components from train and test sets. No equations, parameter fits, or derivations are present that reduce the reported gains (e.g., up to 8% ID accuracy) to quantities defined by the same data or self-citations. The pruning and accuracy measurements are independent experimental outcomes, not forced by construction. The study is self-contained against external benchmarks, with no load-bearing self-citation chains.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that texture classification information is concentrated in high-frequency components and that the observed spectral skew is caused by shortcut learning rather than dataset statistics.

axioms (1)

Domain assumption: Classification information in texture-driven domains resides primarily in higher-frequency fine-grained details rather than low-frequency components.
Stated directly in the abstract as the contrast to the observed shortcut behavior.

pith-pipeline@v0.9.1-grok · 5770 in / 1256 out tokens · 20240 ms · 2026-06-28T10:28:00.223420+00:00 · methodology

Reference graph

Works this paper leans on

96 extracted references · 1 canonical work pages

[1] EuroSAT GitHub Repo. https://github.com/phelber/EuroSAT, 2019
[2] Galaxy MNIST GitHub Repo. https://github.com/mwalmsley/galaxy_mnist, 2022
[3] DINOv2: Learning Robust Visual Features without Supervision, 2023
[4] CLIP GitHub Repo. https://github.com/openai/CLIP, 2025
[5] DinoV2 GitHub Repo. https://github.com/facebookresearch/dinov2, 2025
[6] Galaxy Zoo. https://github.com/mwalmsley/galaxy-datasets, 2025
[7] Antonio A. Abello, Roberto Hirata, and Zhangyang Wang. Dissecting the High-Frequency Bias in Convolutional Neural Networks. In CVPRW, pages 863–871, 2021
[8] N. Ahmed, T. Natarajan, and K.R. Rao. Discrete Cosine Transform. IEEE Transactions on Computers, C-23(1):90–93, 1974
[9] Jiawang Bai, Li Yuan, Shu-Tao Xia, Shuicheng Yan, Zhifeng Li, and Wei Liu. Improving Vision Transformers by Revisiting High-Frequency Components. In ECCV, pages 1–18, 2022
[10] Nicholas Baker, Hongjing Lu, Gennady Erlikhman, and Philip J. Kellman. Deep Convolutional Networks do not Classify based on Global Object Shape. PLOS Computational Biology, 14(12):1–43, 2018
[11] Saikat Basu, Sangram Ganguly, Supratik Mukhopadhyay, Robert DiBiano, Manohar Karki, and Ramakrishna Nemani. DeepSat: A Learning Framework for Satellite Imagery. In SIGSPATIAL, 2015
[12] David Bau, Bolei Zhou, Aditya Khosla, Aude Oliva, and Antonio Torralba. Network Dissection: Quantifying Interpretability of Deep Visual Representations. In CVPR, 2017
[13] Sara Beery, Grant Van Horn, and Pietro Perona. Recognition in Terra Incognita. In ECCV, 2018
[14] Christopher Boland, Keith A Goatman, Sotirios A. Tsaftaris, and Sonia Dahdouh. There Are No Shortcuts to Anywhere Worth Going: Identifying Shortcuts in Deep Learning Models for Medical Image Analysis. In International Conference on Medical Imaging with Deep Learning, volume 250, pages 131–150, 2024
[15] Tom Burgert, Oliver Stoll, Paolo Rota, and Begüm Demir. ImageNet-trained CNNs are not Biased Towards Texture: Revisiting Feature Reliance Through Controlled Suppression. In NeurIPS, 2025
[16] Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, and Quanquan Gu. Towards Understanding the Spectral Bias of Deep Learning. In IJCAI, pages 2205–2211, 2021
[17] Jaouad Dabounou and Amine Baazzouz. Enhancing Neural Network Interpretability Through Conductance-Based Information Plane Analysis, 2024
[18] Nikolay Dagaev, Brett D. Roads, Xiaoliang Luo, Daniel N. Barry, Kaustubh R. Patil, and Bradley C. Love. A Too-Good-to-Be-True Prior to Reduce Shortcut Reliance. Pattern Recognition Letters, 166:164–171, 2023
[19] Zihang Dai, Hanxiao Liu, Quoc V. Le, and Mingxing Tan. CoAtNet: Marrying Convolution and Attention for All Data Sizes. In NeurIPS, 2021
[20] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li Fei-Fei. ImageNet: A Large-scale Hierarchical Image Database. In CVPR, pages 248–255, 2009
[21] Tuan Do, Bernie Boscoe, Evan Jones, Yun Qi Li, and Kevin Alfaro. GalaxiesML: A Dataset of Galaxy Images, Photometry, Redshifts, and Structural Parameters for Machine Learning. 2024
[22] Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. In ICLR, 2021
[23] Lingyu Du and Guohao Lan. FreeGaze: Resource-efficient Gaze Estimation via Frequency Domain Contrastive Learning. CoRR, abs/2209.06692, 2022
[24] Adam Dziedzic, John Paparrizos, Sanjay Krishnan, Aaron Elmore, and Michael Franklin. Band-limited Training and Inference for Convolutional Neural Networks. In ICML, pages 1745–1754, 2019
[25] Dan Fu and Gabriel Guimaraes. Using Compression to Speed Up Image Classification in Artificial Neural Networks. 2016. URL https://www.danfu.org/files/CompressionImageClassification.pdf
[26] Paul Gavrikov and Janis Keuper. Can Biases in ImageNet Models Explain Generalization? In CVPR, pages 22184–22194, 2024
[27] Robert Geirhos, Patricia Rubisch, Claudio Michaelis, Matthias Bethge, Felix A. Wichmann, and Wieland Brendel. ImageNet-trained CNNs are Biased Towards Texture; Increasing Shape Bias Improves Accuracy and Robustness. In ICLR, 2019
[28] Robert Geirhos, Jörn-Henrik Jacobsen, Claudio Michaelis, Richard Zemel, Wieland Brendel, Matthias Bethge, and Felix A. Wichmann. Shortcut Learning in Deep Neural Networks. Nature Machine Intelligence, 2:665–673, 2020
[29] Tianrui Guan, Divya Kothandaraman, Rohan Chandra, Adarsh Jagan Sathyamoorthy, Kasun Weerakoon, and Dinesh Manocha. GA-Nav: Efficient Terrain Segmentation for Robot Navigation in Unstructured Outdoor Environments. IEEE Robotics and Automation Letters, 7(3):8138–8145, 2022
[30] Xintong Han, Zuxuan Wu, Zhe Wu, Ruichi Yu, and Larry S. Davis. VITON: An Image-Based Virtual Try-On Network. In CVPR, pages 7543–7552, 2018
[31] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep Residual Learning for Image Recognition. In CVPR, pages 770–778, 2016
[32] Patrick Helber, Benjamin Bischke, Andreas Dengel, and Damian Borth. Introducing eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. In IGARSS 2018-2018 IEEE International Geoscience and Remote Sensing Symposium, pages 204–207. IEEE, 2018
[33] Patrick Helber, Benjamin Bischke, Andreas Dengel, and Damian Borth. Eurosat: A novel dataset and deep learning benchmark for land use and land cover classification. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2019
[34] Dan Hendrycks and Thomas G. Dietterich. Benchmarking Neural Network Robustness to Common Corruptions and Perturbations. In ICLR, 2019
[35] HistAI. SPIDER-colorectal dataset. https://huggingface.co/datasets/histai/SPIDER-colorectal, 2025. Accessed: 2026-01-07
[36] HMB302. Inflammation. https://hmb302.ca/chapters/inflammation/, 2023. Online histology and pathology educational resource. Accessed: 2026-01-07
[37] Andrew Howard, Ruoming Pang, Hartwig Adam, Quoc V. Le, Mark Sandler, Bo Chen, Weijun Wang, Liang-Chieh Chen, Mingxing Tan, Grace Chu, Vijay Vasudevan, and Yukun Zhu. Searching for MobileNetV3. In ICCV, pages 1314–1324, 2019
[38] Jason Jo and Yoshua Bengio. Measuring the Tendency of CNNs to Learn Surface Statistical Regularities, 2017
[39] Alex Krizhevsky. Learning Multiple Layers of Features from Tiny Images. Technical report, 2009
[40] Kirsi Laitala and Casper Boks. Sustainable Clothing Design: Use Matters. Journal of Design Research, 10(1–2):121–139, 2012
[41] Sebastian Lapuschkin, Stephan Wäldchen, Alexander Binder, Grégoire Montavon, Wojciech Samek, and Klaus-Robert Müller. Unmasking Clever Hans Predictors and Assessing What Machines Really Learn. Nature Communications, 10(1), 2019
[42] Zhiyu Lin, Yifei Gao, and Jitao Sang. Investigating and Explaining the Frequency Bias in Image Classification. In IJCAI, pages 717–723, 2022
[43] Shao-Yuan Lo and Hsueh-Ming Hang. Exploring Semantic Segmentation on the DCT Representation. In 1st ACM International Conference on Multimedia in Asia (MMASIA), pages 1–6, 2019
[44] Matthias Minderer, Olivier Bachem, Neil Houlsby, and Michael Tschannen. Automatic Shortcut Removal for Self-supervised Representation Learning. In ICML, 2020
[45] Subramanian Senthilkannan Muthu. Circular Economy in Textiles and Apparel: Processing, Manufacturing, and Design. Woodhead Publishing, 2018
[46] Meike Nauta, Robert Walsh, Andrew Dubowski, and Christin Seifert. Uncovering and Correcting Shortcut Learning in Machine Learning Models for Skin Cancer Diagnosis. Diagnostics, 12(1), 2022
[47] Dmitry Nechaev, Alexey Pchelnikov, and Ekaterina Ivanova. SPIDER: A Comprehensive Multi-Organ Supervised Pathology Dataset and Baseline Models, 2025
[48] Hongjing Niu, Hanting Li, Feng Zhao, and Bin Li. Roadblocks for Temporarily Disabling Shortcuts and Learning New Knowledge. In NeurIPS, pages 29064–29075, 2022
[49] Zizheng Pan, Jianfei Cai, and Bohan Zhuang. Fast Vision Transformers with HiLo Attention. In NeurIPS, pages 14541–14554, 2022
[50] Nicolas Papernot and Patrick McDaniel. Deep k-Nearest Neighbors: Towards Confident, Interpretable and Robust Deep Learning, 2018
[51]

How Do Vision Transformers Work? InICLR, 2022

Namuk Park and Songkuk Kim. How Do Vision Transformers Work? InICLR, 2022

2022
[52]

Gradient Starvation: A Learning Proclivity in Neural Networks

Mohammad Pezeshki, Oumar Kaba, Yoshua Bengio, Aaron C Courville, Doina Precup, and Guillaume Lajoie. Gradient Starvation: A Learning Proclivity in Neural Networks. InNeurIPS, volume 34, pages 1256–1272, 2021

2021
[53]

PyTorch. PyTorch - ResNet-50 Model Documentation, 2025. URL https://docs.pytorch.org/vision/main/models/generated/torchvision.models.resnet50.html. Accessed: 2026-01-07.
[54]

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning Transferable Visual Models From Natural Language Supervision. In ICML, pages 8748–8763, 2021.
[55]

Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred Hamprecht, Yoshua Bengio, and Aaron Courville. On the Spectral Bias of Neural Networks. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, PMLR, volume 97, pages 5301–5310, 2019.
[56]

Vikram V. Ramaswamy, Sunnie S. Y. Kim, Ruth Fong, and Olga Russakovsky. Overlooked Factors in Concept-Based Explanations: Dataset Choice, Concept Learnability, and Human Capability. In CVPR, pages 10932–10941, 2023.
[57]

Yongming Rao, Wenliang Zhao, Zheng Zhu, Jiwen Lu, and Jie Zhou. Global Filter Networks for Image Classification. In NeurIPS, pages 980–993, 2021.
[58]

Ronen Basri, David Jacobs, Yoni Kasten, and Shira Kritchman. The Convergence Rate of Neural Networks for Learned Functions of Different Frequencies. In NeurIPS, volume 32, 2019.
[59]

Samuel Felipe dos Santos, Nicu Sebe, and Jurandy Almeida. The Good, The Bad, and The Ugly: Neural Networks Straight From JPEG. In 27th IEEE International Conference on Image Processing (ICIP), pages 1896–1900, 2020.
[60]

Harshay Shah, Kaustav Tamuly, Aditi Raghunathan, Prateek Jain, and Praneeth Netrapalli. The Pitfalls of Simplicity Bias in Neural Networks. In NeurIPS, 2020.
[61]

Runwu Shi, Shichun Yang, Yuyi Chen, Rui Wang, Jiayi Lu, Zhaowen Pang, and Yaoguang Cao. Road Recognition for Autonomous Vehicles Based on Intelligent Tire and SE-CNN. In Intelligent Systems and Pattern Recognition, volume 1589, pages 291–305, 2022.
[62]

Shu Zhong. TextileNet: Material taxonomy-based fashion textile dataset. https://github.com/hahashu/TextileNet, 2023. Accessed: 2026-01-07.
[63]

Utku Sirin and Stratos Idreos. The Image Calculator: 10x Faster Image-AI Inference by Replacing JPEG with Self-designing Storage Format. Proc. ACM Manag. Data, 2(1), 2024.
[64]

Utku Sirin, Victoria Kauffman, Aadit Saluja, Florian Klein, Jeremy Hsu, and Stratos Idreos. Frequency-Store: Scaling Image AI by A Column-Store for Images. In CIDR, 2025.
[65]

Chetan L. Srinidhi, Ozan Ciga, and Anne L. Martel. Deep neural network models for computational histopathology: A survey. Medical Image Analysis, 67, 2021.
[66]

Ajay Subramanian, Elena Sizikova, Najib J. Majaj, and Denis G. Pelli. Spatial-frequency Channels, Shape Bias, and Adversarial Robustness. In NeurIPS, 2023.
[67]

Damien Teney, Armand Mihai Nicolicioiu, Valentin Hartmann, and Ehsan Abbasnejad. Neural Redshift: Random Networks Are Not Random Functions. In CVPR, pages 4786–4796, 2024.
[68]

Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, and Herve Jegou. Training Data-Efficient Image Transformers & Distillation Through Attention. In ICML, pages 10347–10357, 2021.
[69]

Zhengzhong Tu, Hossein Talebi, Han Zhang, Feng Yang, Peyman Milanfar, Alan Bovik, and Yinxiao Li. MaxViT: Multi-axis Vision Transformer. In ECCV, pages 459–479, 2022.
[70]

Shikhar Tuli, Ishita Dasgupta, Erin Grant, and Thomas L. Griffiths. Are Convolutional Neural Networks or Transformers More Like Human Vision? In Proceedings of the 43rd Annual Meeting of the Cognitive Science Society, pages 1844–1850, 2021.
[71]

Haiming Tuo, Zuqiang Meng, Zihao Shi, and Daosheng Zhang. Interpretable Neural Network Classification Model Using First-order Logic Rules. Neurocomputing, 614(1):128–840, 2025.
[72]

Koen van Gelder. E-commerce Worldwide - Statistics & Facts. https://www.statista.com/topics/871/online-shopping/, 2025. Accessed: 2026-01-07.
[73]

Mike Walmsley, Chris Lintott, Tobias Géron, Sandor Kruk, Coleman Krawczyk, Kyle W. Willett, Steven Bamford, Lee S. Kelvin, Lucy Fortson, Yarin Gal, William Keel, Karen L. Masters, Vihang Mehta, Brooke D. Simmons, Rebecca Smethurst, Lewis Smith, Elisabeth M. Baeten, and Christine Macmillan. Galaxy Zoo DECaLS: Detailed Visual Morphology Measurements from Volunt...

2022
[74]

Haohan Wang, Songwei Ge, Zachary C. Lipton, and Eric P. Xing. Learning Robust Global Representations by Penalizing Local Predictive Power. In NeurIPS, pages 10506–10518, 2019.
[75]

Haohan Wang, Xindi Wu, Zeyi Huang, and Eric P. Xing. High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks. In CVPR, pages 8681–8691, 2020.
[76]

Peihao Wang, Wenqing Zheng, Tianlong Chen, and Zhangyang Wang. Anti-Oversmoothing in Deep Vision Transformers via the Fourier Domain Analysis: From Theory to Practice. In ICLR, 2022.
[77]

Shunxin Wang, Raymond Veldhuis, Christoph Brune, and Nicola Strisciuglio. What Do Neural Networks Learn in Image Classification? A Frequency Shortcut Perspective. In ICCV, pages 1433–1442, 2023.
[78]

Shunxin Wang, Raymond Veldhuis, Christoph Brune, and Nicola Strisciuglio. A Survey on the Robustness of Computer Vision Models against Common Corruptions, 2024.
[79]

Shunxin Wang, Raymond Veldhuis, and Nicola Strisciuglio. Do ImageNet-trained Models Learn Shortcuts? The Impact of Frequency Shortcuts on Generalization. In CVPR, pages 25198–25207, 2025.
[80]

Zhenyu Wang, Hao Luo, Pichao Wang, Feng Ding, Fan Wang, and Hao Li. VTC-LFC: Vision Transformer Compression with Low-Frequency Components. In NeurIPS, pages 13974–13988, 2022.

