arxiv: 2605.14727 · v1 · submitted 2026-05-14 · 💻 cs.CV

CHASM: Cross-frequency Harmonized Axis-Separable Mixing for Spectral Token Operators

Pengcheng Fang, Hongli Chen, Yuxia Chen, Tengjiao Sun, Jiaxin Liu, Xiaohao Cai

Pith reviewed 2026-05-15 04:35 UTC · model grok-4.3

keywords spectral token mixers · Fourier transforms · cross-frequency harmonization · MRI reconstruction · image segmentation · axis-separable mixing

The pith

CHASM shares one channel eigenbasis across frequencies while keeping per-frequency positive gains to improve spectral token mixers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Spectral token mixers based on Fourier transforms model global interactions efficiently but often fail to align channel directions across frequencies. CHASM provides a middle ground by sharing a learned channel eigenbasis among all frequencies and retaining individual positive spectral gains per frequency. This shared basis makes channel directions comparable across the spectrum while the gains preserve frequency-specific adaptivity. The operator is applied separably along height and width axes as a drop-in replacement inside existing backbones. Controlled same-backbone experiments show consistent gains over prior spectral mixers in accelerated MRI reconstruction, undersampled MRI segmentation, and natural-image reconstruction.

Core claim

CHASM separates a shared channel eigenbasis, used by every frequency, from frequency-specific positive spectral gains, creating cross-frequency harmonization that strengthens spectral token operators when inserted into standard vision backbones.

What carries the argument

Shared channel eigenbasis spectral operator with per-frequency positive gains, applied separably along spatial axes.
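One way to write the operator named above, as a sketch reconstructed from the abstract alone. The symbols $U$, $g_k$, $W_k$, $C$ and $F$ are notation introduced here, and the orthogonality of $U$ and the shape of $g_k$ are assumptions that the abstract does not confirm.

```latex
% \hat{x}_k \in \mathbb{C}^{C}: channel vector at frequency bin k after a 1D FFT along one spatial axis
% Frequency-indexed mixing (no shared structure):
\hat{y}_k = W_k \,\hat{x}_k, \qquad W_k \in \mathbb{R}^{C \times C}, \quad k = 1, \dots, F
% Shared-basis mixing (assumed form): one basis U for all k, positive gains g_k per bin
\hat{y}_k = U \,\operatorname{diag}(g_k)\, U^{\top} \hat{x}_k, \qquad U^{\top} U = I, \quad g_k \in \mathbb{R}^{C}_{>0}
```

Under these assumptions every per-frequency channel operator is symmetric positive definite, and any two of them commute because they share eigenvectors. That is one concrete sense in which channel directions are comparable across the spectrum.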

If this is right

Higher reconstruction quality in accelerated MRI tasks compared to same-backbone baselines.
Improved segmentation accuracy on undersampled MRI data.
Better results in natural-image reconstruction using the same backbone.
Ablations confirm that dropping the shared-basis constraint weakens the observed benefit.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same shared-basis idea could be tested in other frequency-domain operators beyond Fourier mixers.
Coherent sampling geometry may prove important for realizing cross-frequency benefits in related architectures.
The structured separation of shared and specific components might help control parameter count while retaining adaptivity.

Load-bearing premise

Enforcing a shared channel eigenbasis across frequencies supplies a useful inductive bias whose benefit is not merely an artifact of extra parameters or particular training setups.

What would settle it

An experiment in which removing the shared-basis constraint leaves performance unchanged or randomizing coherent sampling geometry eliminates the reported gains.

Figures

Figures reproduced from arXiv: 2605.14727 by Hongli Chen, Jiaxin Liu, Pengcheng Fang, Tengjiao Sun, Xiaohao Cai, Yuxia Chen.

**Figure 1.** Overview of CHASM. (a) Standard spectral mixers often learn independent channel operators for different [PITH_FULL_IMAGE:figures/full_fig_p003_1.png]

**Figure 2.** Qualitative comparison on the fastMRI and CC359 datasets under single-coil settings. (a) Reconstruction [PITH_FULL_IMAGE:figures/full_fig_p008_2.png]

Original abstract

Spectral token mixers based on Fourier transforms provide an efficient way to model global interactions in visual feature maps. Existing designs often either apply filter-wise spectral responses along fixed channel axes, or learn adaptive frequency-indexed channel mixing without explicitly aligning the channel directions used across frequencies. We propose CHASM, a Cross-frequency Harmonized Axis-Separable Mixer, as a structured middle ground. CHASM separates what should be shared from what should remain frequency-specific: all frequencies share a learned channel eigenbasis, while each frequency retains its own positive spectral gains. The shared basis makes channel directions comparable across the spectrum, whereas the positive gains preserve local spectral adaptivity. CHASM applies this structured operator separably along the height and width axes and is used as a drop-in replacement mixer inside existing backbones. We provide a structural characterization of the shared-basis operator family and evaluate CHASM through controlled same-backbone comparisons. Across accelerated MRI reconstruction, undersampled MRI segmentation, and natural-image reconstruction, CHASM consistently improves over same-backbone spectral-mixer baselines. Ablations show that removing the shared-basis constraint weakens performance, and randomizing coherent sampling geometry substantially reduces the gain, supporting cross-frequency harmonization as a useful inductive bias for spectral token operators.
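The abstract's description of the operator can be sketched in NumPy as follows. This is an illustration under stated assumptions, not the authors' implementation: the function names, the orthogonal basis obtained from a QR factorization, the softplus positivity map, one gain per frequency per basis direction, and a separate basis for each axis are all choices made here for concreteness.

```python
import numpy as np

def softplus(z):
    # Numerically stable softplus, used here as an assumed positivity map.
    return np.logaddexp(0.0, z)

def shared_basis_mix_1d(x, basis, raw_gains, axis):
    """Mix channels in the Fourier domain along one spatial axis.

    x:         real array of shape (H, W, C)
    basis:     (C, C) orthogonal matrix shared by all frequency bins (assumed)
    raw_gains: (F, C) unconstrained parameters, F = n // 2 + 1 for axis length n
    """
    n = x.shape[axis]
    xf = np.fft.rfft(x, axis=axis)            # complex spectrum, F bins along `axis`
    xf = np.moveaxis(xf, axis, 0)             # (F, other, C)
    gains = softplus(raw_gains)[:, None, :]   # (F, 1, C), positive per-frequency gains
    coeffs = xf @ basis                       # coordinates in the shared basis
    yf = (coeffs * gains) @ basis.T           # rescale per frequency, map back to channels
    yf = np.moveaxis(yf, 0, axis)
    return np.fft.irfft(yf, n=n, axis=axis)   # real output with the original length

def axis_separable_mix(x, basis_h, raw_h, basis_w, raw_w):
    # Apply the 1D operator along height (axis 0), then along width (axis 1).
    y = shared_basis_mix_1d(x, basis_h, raw_h, axis=0)
    return shared_basis_mix_1d(y, basis_w, raw_w, axis=1)

def random_orthogonal(channels, rng):
    # Hypothetical initializer: orthogonal factor of a random Gaussian matrix.
    q, _ = np.linalg.qr(rng.standard_normal((channels, channels)))
    return q
```

Because the basis and gains are real, the Hermitian symmetry of the spectrum is preserved and the inverse real FFT returns a real feature map of the input shape. Setting every raw gain to `log(e - 1)` makes all gains equal to one, in which case the sketch reduces to the identity map.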

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

CHASM gives a clean middle-ground spectral mixer via shared channel eigenbasis plus per-frequency gains, but the reported gains rest on unverified parameter counts and thin experimental details.

CHASM defines a spectral token mixer that shares one learned channel eigenbasis across frequencies while keeping strictly positive per-frequency gains, then applies the whole thing separably along height and width. This sits between fixed filter designs and fully adaptive frequency-indexed mixing, and the paper supplies a structural characterization of the resulting operator family. That characterization and the explicit separation of shared versus frequency-specific parts are the concrete new elements.

The evaluations replace the mixer inside existing backbones and show gains on accelerated MRI reconstruction, undersampled MRI segmentation, and natural-image reconstruction, with ablations that remove the shared-basis constraint and report weaker results. The design avoids obvious circularity by tying the claimed benefit to external task performance rather than internal fitting metrics.

The main soft spot is the parameter-count issue the stress-test note flags. The abstract gives no numbers and does not confirm that the ablated variants or the baseline spectral mixers are matched in total parameters, so it remains possible that the improvements come from extra capacity rather than the cross-frequency harmonization bias. No error bars, exact implementation details, or statistical tests appear in the provided summary either, which leaves the robustness of the “consistent improvements” claim hard to judge.

This paper is for people working on efficient Fourier-based vision backbones or medical imaging pipelines that already use spectral token mixers. A reader who wants a new inductive bias for making channel directions comparable across frequencies would get something usable from the operator description and the ablation logic. It deserves a serious referee because the core construction is well-specified and the tasks are relevant, even though the empirical controls will need tightening in revision.

Referee Report

1 major / 2 minor

Summary. The paper proposes CHASM, a spectral token mixer for visual feature maps that enforces a shared learned channel eigenbasis across frequencies while allowing per-frequency positive spectral gains, applied axis-separably along height and width. It is positioned as a drop-in replacement in existing backbones and is evaluated via controlled same-backbone comparisons on accelerated MRI reconstruction, undersampled MRI segmentation, and natural-image reconstruction tasks, where it reports consistent gains over spectral-mixer baselines. Ablations are cited to show that removing the shared-basis constraint weakens performance and that randomizing sampling geometry reduces the benefit, framing the shared eigenbasis as a useful inductive bias for cross-frequency harmonization.

Significance. If the reported gains prove robust after parameter-matched controls and statistical verification, CHASM would supply a concrete structural prior for spectral operators that separates shared channel directions from frequency-specific scaling. This could be useful in domains like medical imaging where global frequency interactions matter and where existing Fourier-based mixers lack explicit cross-frequency alignment. The structural characterization of the shared-basis family is a positive element that could support future analysis.

major comments (1)

[Abstract] Abstract and operator description: the central claim that the shared channel eigenbasis supplies a useful inductive bias (rather than a capacity artifact) rests on same-backbone comparisons and ablations, yet no statement confirms that CHASM and the frequency-indexed baselines have identical parameter counts. The construction (shared eigenbasis plus per-frequency gains) appears to add parameters relative to purely frequency-indexed mixing; without explicit matching or an ablation that isolates the basis while holding total parameters fixed, the performance delta cannot be attributed to cross-frequency harmonization.

minor comments (2)

[Abstract] The abstract states 'consistent improvements' and 'ablations show' but supplies no quantitative values, error bars, exact baseline implementations, data splits, or statistical tests; these details are required to assess robustness.
[Method] The positive spectral gains are described as 'positive' but the precise constraint (e.g., ReLU, softplus, or projection) and its effect on the operator's spectral properties should be stated explicitly.
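To make the second minor comment concrete, the sketch below compares candidate positivity constraints on a vector of unconstrained parameters. The function name, the mode labels, and the floor value are hypothetical, and nothing here indicates which constraint the paper actually uses.

```python
import numpy as np

def positive_gains(raw, mode, floor=1e-6):
    """Map unconstrained parameters to gains under one of several constraints."""
    if mode == "relu":
        return np.maximum(raw, 0.0)      # non-negative only: gains can be exactly zero
    if mode == "softplus":
        return np.logaddexp(0.0, raw)    # smooth, positive, close to linear for large raw
    if mode == "exp":
        return np.exp(raw)               # positive, acts as a multiplicative scale
    if mode == "clip":
        return np.maximum(raw, floor)    # projection onto the set [floor, inf)
    raise ValueError(f"unknown mode: {mode}")
```

The choice matters for the operator's spectral properties. A ReLU can zero out a basis direction at some frequency, which makes the per-frequency channel operator singular, whereas softplus, exponential, and floor-clipped gains keep every direction scaled by a nonzero factor.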

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the thoughtful review and for identifying the need to clarify parameter counts in our comparisons. We address the concern directly below and will update the manuscript to make the parameter analysis explicit.

Referee: [Abstract] Abstract and operator description: the central claim that the shared channel eigenbasis supplies a useful inductive bias (rather than a capacity artifact) rests on same-backbone comparisons and ablations, yet no statement confirms that CHASM and the frequency-indexed baselines have identical parameter counts. The construction (shared eigenbasis plus per-frequency gains) appears to add parameters relative to purely frequency-indexed mixing; without explicit matching or an ablation that isolates the basis while holding total parameters fixed, the performance delta cannot be attributed to cross-frequency harmonization.

Authors: We appreciate this observation. In fact, CHASM uses substantially fewer parameters than a purely frequency-indexed mixer. A frequency-indexed baseline applies an independent channel-mixing matrix at each frequency, incurring O(F·C²) parameters. CHASM instead learns one shared eigenbasis (O(C²)) and a scalar positive gain per frequency (O(F)), for a total of O(C² + F) parameters. The reported gains are therefore obtained with a strictly smaller model, which reinforces rather than undermines the value of the shared-basis inductive bias. We will revise the manuscript to (i) report exact parameter counts for CHASM and every baseline in the experimental tables, (ii) add a brief statement in the abstract and operator section confirming the parameter relationship, and (iii) include an additional ablation that compares CHASM and the baseline at a matched parameter budget.

Revision: yes
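The parameter counts quoted in the rebuttal can be checked with a short calculation. The sizes below are hypothetical, the counts cover a single axis and ignore any doubling from complex-valued weights, and both readings of "gains" are shown because the abstract does not state their shape. With an orthogonal basis and a single scalar gain per frequency, the operator would reduce to a plain per-frequency scaling in which the basis cancels, so the per-direction reading is included as the alternative.

```python
def frequency_indexed_params(num_freqs, channels):
    # One independent C x C channel-mixing matrix per frequency bin.
    return num_freqs * channels * channels

def shared_basis_params(num_freqs, channels, gains_per_freq):
    # One shared C x C basis plus `gains_per_freq` gains for each frequency bin.
    return channels * channels + num_freqs * gains_per_freq

F, C = 129, 64  # hypothetical: 129 rFFT bins (axis length 256), 64 channels

baseline = frequency_indexed_params(F, C)      # 129 * 64 * 64 = 528384
scalar_gain = shared_basis_params(F, C, 1)     # 4096 + 129    = 4225
per_direction = shared_basis_params(F, C, C)   # 4096 + 8256   = 12352
```

In both readings the shared-basis operator is smaller than the frequency-indexed baseline for these sizes, which is consistent with the direction of the rebuttal's argument even if the exact gain shape differs from the one assumed there.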

Circularity Check

0 steps flagged

No circularity: empirical gains rest on controlled external-task comparisons, not self-defining derivations

The paper presents CHASM as an architectural operator (shared channel eigenbasis plus per-frequency positive gains, applied axis-separably) and supports its utility via same-backbone empirical comparisons and ablations on accelerated MRI reconstruction, undersampled MRI segmentation, and natural-image tasks. No equations, structural characterizations, or first-principles derivations are shown that reduce the reported performance deltas to quantities defined by the same fitted parameters or by self-citation chains. The ablations (removing shared-basis constraint, randomizing sampling geometry) are described as independent checks on the inductive bias, and the evaluation framing explicitly uses external benchmarks rather than internal redefinitions. This keeps the central claim self-contained against the provided evidence.

Axiom & Free-Parameter Ledger

2 free parameters · 2 axioms · 0 invented entities

Based on abstract only; the design rests on the domain assumption that a shared eigenbasis renders channel directions comparable across frequencies and that positive per-frequency gains preserve useful spectral adaptivity. No numerical free parameters are stated; the eigenbasis and gains are learned during training.

free parameters (2)

shared channel eigenbasis: Learned matrix whose columns define common channel directions across frequencies.
per-frequency positive spectral gains: Learned positive scalars, one per frequency bin, that scale the shared basis.

axioms (2)

domain assumption: Shared eigenbasis makes channel directions comparable across the spectrum. Invoked to justify why the shared basis is beneficial.
domain assumption: Positive gains preserve local spectral adaptivity. Stated as the reason for keeping gains frequency-specific and positive.
