Recognition: 3 theorem links
Rhamba: Region-Aware Hybrid Attention-Mamba Framework for Self-Supervised Learning in Resting-State fMRI
Pith reviewed 2026-05-11 00:43 UTC · model grok-4.3
The pith
Region-aware hybrid Attention-Mamba pretraining improves fMRI classification of schizophrenia and ADHD.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Rhamba integrates region-aligned patch embeddings with three masking strategies of increasing spatial specificity and compares four architectural variants during pretraining on ABIDE. After fine-tuning on COBRE and ADHD-200, the Mamba-Attention hybrid encoder-decoder records the highest average AUROC across both tasks and outperforms state-of-the-art baselines. Integrated Gradients analysis identifies contributing brain regions, and results indicate that downstream performance arises from the specific pairing of masking strategy and architecture.
What carries the argument
Region-aligned patch embeddings processed by hybrid Attention-Mamba encoder-decoder blocks under Any, Majority, or Pure masking strategies.
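The three masking strategies can be sketched as increasingly strict overlap rules. This is a minimal illustration, assuming each patch carries a fraction of voxels belonging to a target anatomical region and that Any / Majority / Pure correspond to any-overlap, majority-overlap, and full-containment thresholds; the paper's exact criteria may differ.

```python
# Hypothetical sketch of the Any / Majority / Pure region-aware masking
# rules, assuming each patch is summarized by the fraction of its voxels
# falling inside the target region. Thresholds here are assumptions.

def mask_patches(region_fractions, strategy):
    """Return indices of patches selected for masking.

    region_fractions: floats in [0, 1], the fraction of each patch's
    voxels that fall inside the target anatomical region.
    """
    if strategy == "any":         # least specific: any overlap counts
        keep = lambda f: f > 0.0
    elif strategy == "majority":  # more specific: most voxels in-region
        keep = lambda f: f > 0.5
    elif strategy == "pure":      # most specific: patch fully in-region
        keep = lambda f: f == 1.0
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [i for i, f in enumerate(region_fractions) if keep(f)]

fractions = [0.0, 0.2, 0.6, 1.0]
print(mask_patches(fractions, "any"))       # [1, 2, 3]
print(mask_patches(fractions, "majority"))  # [2, 3]
print(mask_patches(fractions, "pure"))      # [3]
```

Each rule masks a subset of the patches selected by the previous one, which is one plausible mechanism behind the reported ordering of reconstruction difficulty.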
If this is right
- Masking strategy produces a consistent ordering of reconstruction loss but only modest and dataset-dependent effects on classification accuracy.
- The Mamba-Attention configuration achieves the highest average AUROC across the two evaluation datasets.
- Peak performance requires specific combinations of masking strategy and architecture instead of one universally best option.
- Integrated Gradients reveals the brain regions that drive predictions for each model variant.
- Rhamba exceeds state-of-the-art methods in the comparative evaluations performed.
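The Integrated Gradients attribution the paper relies on has a simple core: attribute to each input the path integral of the model's gradient from a baseline to the input. A toy sketch on a scalar model, using finite differences in place of autodiff (the paper would compute exact gradients on fMRI inputs):

```python
# Minimal Integrated Gradients sketch (Sundararajan et al.) on a toy model;
# gradients are approximated by finite differences for self-containment.

def integrated_gradients(f, x, baseline, steps=50):
    """Approximate IG_i = (x_i - b_i) * mean over alpha of dF/dx_i
    along the straight path from baseline to x."""
    n = len(x)
    attributions = [0.0] * n
    eps = 1e-5
    for k in range(1, steps + 1):
        alpha = k / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i in range(n):
            bumped = list(point)
            bumped[i] += eps
            grad_i = (f(bumped) - f(point)) / eps  # finite-difference gradient
            attributions[i] += grad_i / steps
    return [(xi - b) * a for xi, b, a in zip(x, baseline, attributions)]

# Toy linear model: f(x) = 3*x0 + 2*x1, so IG recovers each contribution.
f = lambda v: 3 * v[0] + 2 * v[1]
print(integrated_gradients(f, x=[1.0, 1.0], baseline=[0.0, 0.0]))
# close to [3.0, 2.0]; completeness: attributions sum to f(x) - f(baseline)
```

For a region-aware model, per-patch attributions would then be aggregated by anatomical region to produce the region-level importance maps.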
Where Pith is reading between the lines
- Hybrid models of this form may scale more efficiently than pure attention models when handling long fMRI sequences.
- The region-aware emphasis could transfer to other neuroimaging modalities or to predicting functional connectivity patterns.
- Tuning masking specificity per target disorder may become standard practice when applying similar frameworks.
Load-bearing premise
Performance differences in the downstream tasks result from the masking strategies and hybrid architecture choices rather than dataset properties or unstated implementation details.
What would settle it
A replication on the same pretraining and fine-tuning datasets in which a pure attention model or non-region-aware masking matches or exceeds the reported AUROC of the MA hybrid would falsify the advantage of the Rhamba design.
Original abstract
Self-supervised pretraining is promising for large-scale neuroimaging, yet the impact of region-aware masking and hybrid sequence modeling remains underexplored. In this work, we introduce Rhamba, a region-aware pretraining framework that integrates anatomically guided masking with hybrid Attention-Mamba architectures for resting state functional magnetic resonance imaging (fMRI) analysis. Models were pretrained on the ABIDE dataset using region-aligned patch embeddings and three masking strategies (Any, Majority, and Pure) with increasing spatial specificity. We evaluated four architectural variants: a Mamba only model, an Alternate architecture with interleaved Mamba and Attention blocks, and two hybrid encoder-decoder configurations (Attention-Mamba (AM) and Mamba-Attention (MA)). The pretrained models were fine-tuned on downstream classification tasks using the COBRE and ADHD-200 datasets for schizophrenia and attention-deficit/hyperactivity disorder discrimination. We employed Integrated Gradients, an explainable AI method, to identify the brain regions contributing to model predictions. Masking strategy strongly influenced reconstruction behavior, with reconstruction loss following a consistent ordering (Any > Majority > Pure). However, this trend did not directly translate into downstream performance, where differences were modest and dataset-dependent. The hybrid architecture with the MA configuration achieved the highest average AUROC across both datasets, and Rhamba outperformed state-of-the-art methods in comparative evaluation. Region-wise analysis showed that peak performance depends on the interaction between masking strategy and architecture rather than a single dominant configuration. Overall, Rhamba offers a flexible framework for balancing interpretability, scalability, and performance in large-scale fMRI representation learning.
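The four architectural variants named in the abstract can be sketched as block orderings. This is a schematic only: the depth and the assumption that the first-named component forms the encoder (MA = Mamba encoder, Attention decoder; AM = the reverse) are guesses that the paper would need to confirm.

```python
# Schematic block orderings for the four variants. The encoder/decoder
# assignment for AM and MA is an assumption, not confirmed by the paper.

def build_stack(variant, depth=4):
    if variant == "mamba_only":
        encoder, decoder = ["mamba"] * depth, ["mamba"] * depth
    elif variant == "alternate":  # interleaved Mamba and Attention blocks
        blocks = ["mamba" if i % 2 == 0 else "attention" for i in range(depth)]
        encoder, decoder = blocks, list(blocks)
    elif variant == "AM":         # assumed: Attention encoder, Mamba decoder
        encoder, decoder = ["attention"] * depth, ["mamba"] * depth
    elif variant == "MA":         # assumed: Mamba encoder, Attention decoder
        encoder, decoder = ["mamba"] * depth, ["attention"] * depth
    else:
        raise ValueError(f"unknown variant: {variant}")
    return {"encoder": encoder, "decoder": decoder}

print(build_stack("MA", depth=2))
# {'encoder': ['mamba', 'mamba'], 'decoder': ['attention', 'attention']}
```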
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Rhamba, a region-aware self-supervised pretraining framework for resting-state fMRI that combines anatomically guided masking strategies (Any, Majority, Pure) with hybrid Attention-Mamba sequence models. Models are pretrained on ABIDE using region-aligned patch embeddings, then fine-tuned for binary classification on COBRE (schizophrenia) and ADHD-200 datasets. Four architectures are compared (Mamba-only, Alternate, AM, MA), with the MA hybrid reported to achieve the highest average AUROC; the framework is claimed to outperform prior SOTA methods, and Integrated Gradients is used to highlight contributing brain regions. The abstract notes that masking trends in reconstruction loss do not directly translate to downstream performance, which is described as modest and dataset-dependent.
Significance. If the reported AUROC gains and outperformance hold under rigorous statistical controls, the work would offer a practical, scalable alternative to pure transformer or Mamba baselines for fMRI representation learning, with built-in region-level interpretability. The hybrid design and masking ablation could inform efficient long-sequence modeling in neuroimaging, where data efficiency and anatomical priors matter.
major comments (2)
- [Results / Comparative evaluation] Results section (AUROC tables and comparative evaluation): The central claim that the MA hybrid yields the highest average AUROC across COBRE and ADHD-200 and that Rhamba outperforms SOTA rests on modest, dataset-dependent differences without reported standard deviations, multiple random seeds, or statistical significance tests (e.g., McNemar or paired t-tests). This directly undermines the superiority assertion, as the abstract itself qualifies the downstream differences as modest.
- [Experiments / Downstream evaluation] Experimental protocol (fine-tuning and evaluation subsections): No details are provided on hyperparameter search ranges, data-split stratification, or whether the same random seeds were used across the four architectures and three masking strategies. Without these controls, observed orderings could arise from implementation variance rather than the region-aware masking plus hybrid design.
minor comments (2)
- [Methods / Architecture] Clarify the exact definition of the MA versus AM encoder-decoder configurations (e.g., which blocks are in the encoder versus decoder) and include a diagram or pseudocode for the hybrid stacking.
- [Pretraining results] The reconstruction-loss ordering (Any > Majority > Pure) is stated but not quantified with numerical values or linked to a specific figure or table; add these values for reproducibility.
Simulated Author's Rebuttal
Thank you for the constructive feedback on our manuscript. We appreciate the referee's careful reading and address the major comments point by point below. We will revise the manuscript to incorporate additional statistical rigor and experimental details where feasible.
Point-by-point responses
Referee: [Results / Comparative evaluation] Results section (AUROC tables and comparative evaluation): The central claim that the MA hybrid yields the highest average AUROC across COBRE and ADHD-200 and that Rhamba outperforms SOTA rests on modest, dataset-dependent differences without reported standard deviations, multiple random seeds, or statistical significance tests (e.g., McNemar or paired t-tests). This directly undermines the superiority assertion, as the abstract itself qualifies the downstream differences as modest.
Authors: We agree that the lack of standard deviations, multiple random seeds, and formal statistical tests weakens the comparative claims. The abstract correctly qualifies the differences as modest and dataset-dependent, and we do not claim large effect sizes. In the revised manuscript, we will report AUROC values with standard deviations computed over multiple random seeds and include paired statistical tests (e.g., paired t-tests or McNemar's test) to assess significance of the observed orderings. This will provide a more rigorous basis for the reported trends without overstating the results. revision: yes
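The promised multi-seed reporting could take this shape: per-seed AUROC, mean with standard deviation, and a paired t statistic between two variants. The AUROC values below are illustrative only, and in practice a p-value would come from a library such as scipy.stats rather than being read off the raw statistic.

```python
# Sketch of multi-seed AUROC reporting with a paired t statistic.
# AUROC values are hypothetical placeholders, not the paper's results.
import statistics

def paired_t_statistic(a, b):
    """t = mean(d) / (stdev(d) / sqrt(n)) for paired differences d = a - b."""
    d = [x - y for x, y in zip(a, b)]
    return statistics.mean(d) / (statistics.stdev(d) / len(d) ** 0.5)

ma_auroc = [0.81, 0.79, 0.82, 0.80, 0.81]  # hypothetical, 5 seeds
am_auroc = [0.78, 0.77, 0.80, 0.79, 0.78]

print(f"MA: {statistics.mean(ma_auroc):.3f} +/- {statistics.stdev(ma_auroc):.3f}")
print(f"AM: {statistics.mean(am_auroc):.3f} +/- {statistics.stdev(am_auroc):.3f}")
print(f"paired t = {paired_t_statistic(ma_auroc, am_auroc):.2f}")
```

Pairing by seed matters here: it removes the seed-to-seed variance that would otherwise swamp the modest differences the abstract describes.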
Referee: [Experiments / Downstream evaluation] Experimental protocol (fine-tuning and evaluation subsections): No details are provided on hyperparameter search ranges, data-split stratification, or whether the same random seeds were used across the four architectures and three masking strategies. Without these controls, observed orderings could arise from implementation variance rather than the region-aware masking plus hybrid design.
Authors: We acknowledge that insufficient detail on the experimental controls limits reproducibility and the ability to rule out implementation variance. In the revised version, we will expand the experimental protocol and fine-tuning subsections to specify the hyperparameter search ranges (including learning rate, batch size, and optimizer settings), the data-split stratification approach (e.g., by site or diagnostic label to preserve class balance), and confirmation that identical random seeds were used across all architecture-masking combinations for fair comparison. revision: yes
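A label-stratified split with a fixed seed, as promised for the revised protocol, is straightforward to implement; site-level stratification would follow the same pattern with (site, label) pairs as the grouping key. A minimal stdlib sketch:

```python
# Sketch of a label-stratified split with a fixed seed, so every
# architecture-masking combination sees identical train/test partitions.
import random
from collections import defaultdict

def stratified_split(labels, test_frac=0.2, seed=0):
    """Return (train_idx, test_idx) preserving per-class proportions."""
    rng = random.Random(seed)      # same seed across all model variants
    by_label = defaultdict(list)
    for i, y in enumerate(labels):
        by_label[y].append(i)
    train, test = [], []
    for idxs in by_label.values():
        rng.shuffle(idxs)
        cut = int(len(idxs) * test_frac)
        test.extend(idxs[:cut])
        train.extend(idxs[cut:])
    return sorted(train), sorted(test)

labels = [0] * 10 + [1] * 10
train, test = stratified_split(labels, test_frac=0.2, seed=42)
print(len(train), len(test))  # 16 4
```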
Circularity Check
No circularity: standard empirical self-supervised pipeline
full rationale
The paper describes an empirical self-supervised pretraining framework (region-aware masking on ABIDE followed by fine-tuning on COBRE/ADHD-200) using standard hybrid Attention-Mamba architectures and Integrated Gradients for post-hoc explanation. No equations, first-principles derivations, or predictions are presented that reduce by construction to fitted parameters, self-definitions, or self-citation chains. All performance claims rest on experimental comparisons rather than any load-bearing mathematical step that imports its own inputs. The methodology is self-contained against external benchmarks and does not invoke uniqueness theorems or ansatzes from prior author work.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  "hybrid Attention-Mamba architectures... three masking strategies (Any, Majority, and Pure)... reconstruction loss following a consistent ordering (Any > Majority > Pure)"
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  "MA configuration achieved the highest average AUROC... Rhamba outperformed state-of-the-art methods"
- IndisputableMonolith/Foundation/DimensionForcing.lean · alexander_duality_circle_linking · unclear
  Relation between the paper passage and the cited Recognition theorem is unclear.
  "no mention of 8-tick period, phi-ladder, or J-cost anywhere in the architecture or analysis"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Kay, and David W
Seiji Ogawa, Tso-Ming Lee, Alan R. Kay, and David W. Tank. Brain magnetic resonance imaging with contrast dependent on blood oxygenation.Proceedings of the National Academy of Sciences, 87(24):9868–9872, 1990
1990
-
[2]
Functional mapping of the human visual cortex by magnetic resonance imaging.Science, 254(5032):716–719, 1991
Jack W Belliveau, David N Kennedy, Robert C McKinstry, Bradley R Buchbinder, Robert M Weisskoff, Mark S Cohen, JM Vevea, Thomas J Brady, and Bruce R Rosen. Functional mapping of the human visual cortex by magnetic resonance imaging.Science, 254(5032):716–719, 1991
1991
-
[3]
Functional connectivity in the motor cortex of resting human brain using echo-planar mri.Magnetic resonance in medicine, 34(4):537–541, 1995
Bharat Biswal, F Zerrin Yetkin, Victor M Haughton, and James S Hyde. Functional connectivity in the motor cortex of resting human brain using echo-planar mri.Magnetic resonance in medicine, 34(4):537–541, 1995
1995
-
[4]
Consistent resting-state networks across healthy subjects.Proceedings of the national academy of sciences, 103(37):13848–13853, 2006
Jessica S Damoiseaux, Serge ARB Rombouts, Frederik Barkhof, Philip Scheltens, Cornelis J Stam, Stephen M Smith, and Christian F Beckmann. Consistent resting-state networks across healthy subjects.Proceedings of the national academy of sciences, 103(37):13848–13853, 2006
2006
-
[5]
Resting state fmri: a personal history.Neuroimage, 62(2):938–944, 2012
Bharat B Biswal. Resting state fmri: a personal history.Neuroimage, 62(2):938–944, 2012
2012
-
[6]
The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism.Molecular psychiatry, 19(6):659–667, 2014
Adriana Di Martino, Chao-Gan Yan, Qingyang Li, Erin Denio, Francisco X Castellanos, Kaat Alaerts, Jeffrey S Anderson, Michal Assaf, Susan Y Bookheimer, Mirella Dapretto, et al. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism.Molecular psychiatry, 19(6):659–667, 2014
2014
-
[7]
The adhd-200 consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience.Frontiers in systems neuroscience, 6:62, 2012
ADHD-200 consortium. The adhd-200 consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience.Frontiers in systems neuroscience, 6:62, 2012
2012
-
[8]
Common neural patterns of substance use disorder: a seed-based resting-state functional connectivity meta-analysis.Translational Psychiatry, 15(1):190, 2025
Xiaonan Zhang, Haoyu Zhang, Yingbo Shao, Yang Li, Feifei Zhang, and Hui Zhang. Common neural patterns of substance use disorder: a seed-based resting-state functional connectivity meta-analysis.Translational Psychiatry, 15(1):190, 2025. 18
2025
-
[9]
Kevin Hilbert, Joscha Böhnlein, Charlotte Meinke, Alice V Chavanne, Till Langhammer, Lara Stumpe, Nils Winter, Ramona Leenings, Dirk Adolph, V olker Arolt, et al. Lack of evidence for predictive utility from resting state fmri data for individual exposure-based cognitive behavioral therapy outcomes: A machine learning study in two large multi-site samples...
2024
-
[10]
The history and future of resting-state functional magnetic resonance imaging.Nature, 641(8065):1121–1131, 2025
Bharat B Biswal and Lucina Q Uddin. The history and future of resting-state functional magnetic resonance imaging.Nature, 641(8065):1121–1131, 2025
2025
-
[11]
Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls.Neuroimage, 145:137–165, 2017
Mohammad R Arbabshirani, Sergey Plis, Jing Sui, and Vince D Calhoun. Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls.Neuroimage, 145:137–165, 2017
2017
-
[12]
Cross-validation failure: Small sample sizes lead to large error bars.Neuroim- age, 180:68–77, 2018
Gaël Varoquaux. Cross-validation failure: Small sample sizes lead to large error bars.Neuroim- age, 180:68–77, 2018
2018
-
[13]
Resting state fmri functional connectivity-based classification using a convolutional neural network architecture.Frontiers in neuroinformatics, 11:61, 2017
Regina J Meszlényi, Krisztian Buza, and Zoltán Vidnyánszky. Resting state fmri functional connectivity-based classification using a convolutional neural network architecture.Frontiers in neuroinformatics, 11:61, 2017
2017
-
[14]
3d-cnn based discrimination of schizophrenia using resting-state fmri.Artificial intelligence in medicine, 98:10–17, 2019
Muhammad Naveed Iqbal Qureshi, Jooyoung Oh, and Boreom Lee. 3d-cnn based discrimination of schizophrenia using resting-state fmri.Artificial intelligence in medicine, 98:10–17, 2019
2019
-
[15]
The use of fmri regional analysis to automatically detect adhd through a 3d cnn-based approach.Journal of Imaging Informatics in Medicine, 38 (1):203–216, 2025
Perihan Gül¸ sah Gülhan and Güzin Özmen. The use of fmri regional analysis to automatically detect adhd through a 3d cnn-based approach.Journal of Imaging Informatics in Medicine, 38 (1):203–216, 2025
2025
-
[16]
Identifying autism from resting-state fmri using long short-term memory networks
Nicha C Dvornek, Pamela Ventola, Kevin A Pelphrey, and James S Duncan. Identifying autism from resting-state fmri using long short-term memory networks. Ininternational workshop on machine learning in medical imaging, pages 362–370. Springer, 2017
2017
-
[17]
Characterization of early stage parkinson’s disease from resting-state fmri data using a long short-term memory network.Frontiers in Neuroimaging, 1:952084, 2022
Xueqi Guo, Sule Tinaz, and Nicha C Dvornek. Characterization of early stage parkinson’s disease from resting-state fmri data using a long short-term memory network.Frontiers in Neuroimaging, 1:952084, 2022
2022
-
[18]
A novel graph neural network framework for resting-state functional mri spatiotemporal dynamics analysis.Physica A: Statistical Mechanics and its Applications, 669: 130582, 2025
Tao Wang, Zenghui Ding, Zheng Chang, Xianjun Yang, Yanyan Chen, Meng Li, Shu Xu, and Yu Wang. A novel graph neural network framework for resting-state functional mri spatiotemporal dynamics analysis.Physica A: Statistical Mechanics and its Applications, 669: 130582, 2025
2025
-
[19]
Classification of brain disorders in rs-fmri via local-to-global graph neural networks.IEEE transactions on medical imaging, 42(2):444–455, 2022
Hao Zhang, Ran Song, Liping Wang, Lin Zhang, Dawei Wang, Cong Wang, and Wei Zhang. Classification of brain disorders in rs-fmri via local-to-global graph neural networks.IEEE transactions on medical imaging, 42(2):444–455, 2022
2022
-
[20]
Representation learning of resting state fmri with variational autoencoder.NeuroImage, 241: 118423, 2021
Jung-Hoon Kim, Yizhen Zhang, Kuan Han, Zheyu Wen, Minkyu Choi, and Zhongming Liu. Representation learning of resting state fmri with variational autoencoder.NeuroImage, 241: 118423, 2021
2021
-
[21]
Classification of mdd using a transformer classifier with large-scale multisite resting-state fmri data.Human brain mapping, 45(1):e26542, 2024
Peishan Dai, Ying Zhou, Yun Shi, Da Lu, Zailiang Chen, Beiji Zou, Kun Liu, Shenghui Liao, and REST meta MDD Consortium. Classification of mdd using a transformer classifier with large-scale multisite resting-state fmri data.Human brain mapping, 45(1):e26542, 2024
2024
-
[22]
Predicting task-related brain activity from resting-state brain dynamics with fmri transformer
Junbeom Kwon, Jungwoo Seo, Heehwan Wang, Taesup Moon, Shinjae Yoo, and Jiook Cha. Predicting task-related brain activity from resting-state brain dynamics with fmri transformer. Imaging Neuroscience, 3:imag_a_00440, 2025
2025
-
[23]
Current challenges in translational and clinical fmri and future directions
Karsten Specht. Current challenges in translational and clinical fmri and future directions. Frontiers in psychiatry, 10:924, 2020
2020
-
[24]
On the generalizability of resting-state fmri machine learning classifiers.Frontiers in human neuroscience, 8:502, 2014
Wolfgang Huf, Klaudius Kalcher, Roland N Boubela, Georg Rath, Andreas Vecsei, Peter Filzmoser, and Ewald Moser. On the generalizability of resting-state fmri machine learning classifiers.Frontiers in human neuroscience, 8:502, 2014. 19
2014
-
[25]
Reproducible brain-wide association studies require thousands of individuals
Scott Marek, Brenden Tervo-Clemmens, Finnegan J Calabro, David F Montez, Benjamin P Kay, Alexander S Hatoum, Meghan Rose Donohue, William Foran, Ryland L Miller, Timothy J Hendrickson, et al. Reproducible brain-wide association studies require thousands of individuals. Nature, 603(7902):654–660, 2022
2022
-
[26]
SwiFT: Swin 4d fMRI transformer
Peter Yongho Kim, Junbeom Kwon, Sunghwan Joo, Sangyoon Bae, Donggyu Lee, Yoonho Jung, Shinjae Yoo, Jiook Cha, and Taesup Moon. SwiFT: Swin 4d fMRI transformer. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https: //openreview.net/forum?id=dKeWh6EzBB
2023
-
[27]
Zijian Dong, Ruilin Li, Yilei Wu, Thuan Tinh Nguyen, Joanna Su Xian Chong, Fang Ji, Nathanael Ren Jie Tong, Christopher Li Hsian Chen, and Juan Helen Zhou. Brain-jepa: Brain dynamics foundation model with gradient positioning and spatiotemporal masking.arXiv preprint arXiv:2409.19407, 2024. URLhttps://arxiv.org/abs/2409.19407
-
[28]
Unsupervised contrastive graph learning for resting-state functional mri analysis and brain disorder detection.Human brain mapping, 44(17):5672–5692, 2023
Xiaochuan Wang, Ying Chu, Qianqian Wang, Liang Cao, Lishan Qiao, Limei Zhang, and Mingxia Liu. Unsupervised contrastive graph learning for resting-state functional mri analysis and brain disorder detection.Human brain mapping, 44(17):5672–5692, 2023
2023
-
[29]
Self-supervised graph contrastive learning with diffusion augmentation for functional mri analysis and brain disorder detection.Medical image analysis, 101:103403, 2025
Xiaochuan Wang, Yuqi Fang, Qianqian Wang, Pew-Thian Yap, Hongtu Zhu, and Mingxia Liu. Self-supervised graph contrastive learning with diffusion augmentation for functional mri analysis and brain disorder detection.Medical image analysis, 101:103403, 2025
2025
-
[30]
3d masked autoencoder with spatiotemporal transformer for modeling of 4d fmri data.Medical Image Analysis, page 103861, 2025
Jie Gao, Bao Ge, Ning Qiang, and Shijie Zhao. 3d masked autoencoder with spatiotemporal transformer for modeling of 4d fmri data.Medical Image Analysis, page 103861, 2025
2025
-
[31]
Deep feature extraction for resting-state functional mri by self-supervised learning and application to schizophrenia diagnosis.Frontiers in neuroscience, 15:696853, 2021
Yuki Hashimoto, Yousuke Ogata, Manabu Honda, and Yuichi Yamashita. Deep feature extraction for resting-state functional mri by self-supervised learning and application to schizophrenia diagnosis.Frontiers in neuroscience, 15:696853, 2021
2021
-
[32]
Computing personalized brain functional networks from fmri using self-supervised deep learning.Medical Image Analysis, 85:102756, 2023
Hongming Li, Dhivya Srinivasan, Chuanjun Zhuo, Zaixu Cui, Raquel E Gur, Ruben C Gur, Desmond J Oathes, Christos Davatzikos, Theodore D Satterthwaite, and Yong Fan. Computing personalized brain functional networks from fmri using self-supervised deep learning.Medical Image Analysis, 85:102756, 2023
2023
-
[33]
Whole milc: generalizing learned dynamics across tasks, datasets, and populations
Usman Mahmood, Md Mahfuzur Rahman, Alex Fedorov, Noah Lewis, Zening Fu, Vince D Calhoun, and Sergey M Plis. Whole milc: generalizing learned dynamics across tasks, datasets, and populations. InInternational Conference on Medical Image Computing and Computer- Assisted Intervention, pages 407–417. Springer, 2020
2020
-
[34]
Detecting cognitive fatigue in subjects with traumatic brain injury from fmri scans using self-supervised learning
Ashish Jaiswal, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Glenn Wylie, and Fillia Make- don. Detecting cognitive fatigue in subjects with traumatic brain injury from fmri scans using self-supervised learning. InProceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments, pages 83–90, 2023
2023
-
[35]
Graph self-supervised learning with application to brain networks analysis.IEEE Journal of Biomedical and Health Informatics, 27(8):4154–4165, 2023
Guangqi Wen, Peng Cao, Lingwen Liu, Jinzhu Yang, Xizhe Zhang, Fei Wang, and Osmar R Zaiane. Graph self-supervised learning with application to brain networks analysis.IEEE Journal of Biomedical and Health Informatics, 27(8):4154–4165, 2023
2023
-
[36]
Graph convolutional network with self-supervised learning for brain disease classification.IEEE/ACM Transactions on Computational Biology and Bioinformatics, 21(6):1830–1841, 2024
Guangyu Wang, Ying Chu, Qianqian Wang, Limei Zhang, Lishan Qiao, and Mingxia Liu. Graph convolutional network with self-supervised learning for brain disease classification.IEEE/ACM Transactions on Computational Biology and Bioinformatics, 21(6):1830–1841, 2024
2024
-
[37]
Swin transformer: Hierarchical vision transformer using shifted windows
Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021
2021
-
[38]
Self-supervised transformer- based foundation model for functional magnetic resonance imaging
Matteo Ferrante, Stefano Iervese, Laura Astolfi, and Nicola Toschi. Self-supervised transformer- based foundation model for functional magnetic resonance imaging. In2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 1–6. IEEE, 2025. 20
2025
-
[39]
Causal fmri-mamba: Causal state space model for neural decoding and brain task states recognition
Weihao Deng, Fei Han, Qinghua Ling, Qing Liu, and Henry Han. Causal fmri-mamba: Causal state space model for neural decoding and brain task states recognition. InICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2025
2025
-
[40]
State-space model for brain network analysis on rs-fmri
Brain Network Mamba and A Bi-Directional. State-space model for brain network analysis on rs-fmri. InMachine Learning in Medical Imaging: 16th International Workshop, MLMI 2025, Held in Conjunction with MICCAI 2025, Daejeon, South Korea, September 23, 2025, Proceedings, page 224. Springer Nature, 2026
2025
-
[41]
Towards a general-purpose foundation model for functional MRI analysis
Cheng Wang, Yu Jiang, Zhihao Peng, Chenxin Li, Chang-bae Bang, Lin Zhao, Wanyi Fu, Jinglei Lv, Jorge Sepulcre, Carl Yang, Lifang He, Tianming Liu, Xue-Jun Kong, Quanzheng Li, Daniel S. Barron, Anqi Qiu, Randy Hirschtick, Byung-Hoon Kim, Hongbin Han, Xiang Li, and Yixuan Yuan. Towards a general-purpose foundation model for functional mri analysis.Nature Bi...
-
[42]
Jamba: A Hybrid Transformer-Mamba Language Model
Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, and Yoav Shoham. Jamba: A hybrid transformer-mamba langua...
work page internal anchor Pith review arXiv 2024
-
[43]
Transmamba: Flexibly switching between transformer and mamba
Yixing Li, Ruobing Xie, Zhen Yang, Xingwu Sun, Shuaipeng Li, Weidong Han, Zhanhui Kang, Yu Cheng, Chengzhong Xu, Di Wang, and Jie Jiang. Transmamba: A sequence-level hybrid transformer-mamba language model.arXiv preprint arXiv:2503.24067, 2026. URL https://arxiv.org/abs/2503.24067
-
[44]
Can mamba learn how to learn? a comparative study on in-context learning tasks,
Jongho Park, Jaeseung Park, Zheyang Xiong, Nayoung Lee, Jaewoong Cho, Samet Oymak, Kangwook Lee, and Dimitris Papailiopoulos. Can mamba learn how to learn? a comparative study on in-context learning tasks.arXiv preprint arXiv:2402.04248, 2024. URL https: //arxiv.org/abs/2402.04248
-
[45]
Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, and Bryan Catanzaro. An empirical study of mamba-based language models.arXiv preprint arXiv:2406.07887, 2024. URL https://arxiv.o...
-
[46]
Research on autism diagnosis method based on transformer and mamba
Le Zhao and Yanli Zhang. Research on autism diagnosis method based on transformer and mamba. In2025 6th International Conference on Machine Learning and Computer Application (ICMLCA), pages 1190–1193. IEEE, 2025
2025
-
[47]
Brainmt: A hybrid mamba- transformer architecture for modeling long-range dependencies in functional mri data
Arunkumar Kannan, Martin A Lindquist, and Brian Caffo. Brainmt: A hybrid mamba- transformer architecture for modeling long-range dependencies in functional mri data. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 150–160. Springer, 2025
2025
-
[48]
Yifan Yang, Yutong Mao, Xufu Liu, and Xiao Liu. Brainmae: a region-aware self-supervised learning framework for brain signals.arXiv preprint arXiv:2406.17086, 2024
-
[49]
Ruthwik Reddy Doodipala, Pankaj Pandey, Carolina Torres Rojas, Manob Jyoti Saikia, and Ranganatha Sitaram. Region-aware reconstruction strategy for pre-training fmri foundation model.arXiv preprint arXiv:2511.00443, 2025
-
[50]
Uk biobank: bank on it.The Lancet, 369(9578):1980–1982, 2007
Lyle J Palmer. Uk biobank: bank on it.The Lancet, 369(9578):1980–1982, 2007. ISSN 0140-6736. doi: https://doi.org/10.1016/S0140-6736(07)60924-6. URL https://www. sciencedirect.com/science/article/pii/S0140673607609246
-
[51]
B.J. Casey, Tariq Cannonier, May I. Conley, Alexandra O. Cohen, Deanna M. Barch, Mary M. Heitzeg, Mary E. Soules, Theresa Teslovich, Danielle V . Dellarco, Hugh Garavan, Catherine A. Orr, Tor D. Wager, Marie T. Banich, Nicole K. Speer, Matthew T. Sutherland, Michael C. 21 Riedel, Anthony S. Dick, James M. Bjork, Kathleen M. Thomas, Bader Chaarani, Margie ...
-
[52]
David C. Van Essen, Stephen M. Smith, Deanna M. Barch, Timothy E.J. Behrens, Essa Yacoub, and Kamil Ugurbil. The wu-minn human connectome project: An overview.NeuroImage, 80:62–79, 2013. ISSN 1053-8119. doi: https://doi.org/10.1016/j.neuroimage.2013.05.041. URL https://www.sciencedirect.com/science/article/pii/S1053811913005351. Mapping the Connectome
-
[53]
Mamba: Linear-time sequence modeling with selective state spaces,
Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces,
-
[54]
URLhttps://arxiv.org/abs/2312.00752
work page internal anchor Pith review Pith/arXiv arXiv
-
[55]
Shubhi Bansal, Sreeharish A, Madhava Prasath J, Manikandan M, Sreekanth Madisetty, Mo- hammad Zia Ur Rehman, Chandravardhan Singh Raghaw, Gaurav Duggal, and Nagendra Kumar. A comprehensive survey of mamba architectures for medical image analysis: Classifi- cation, segmentation, restoration and beyond.arXiv preprint arXiv:2410.02362, 2025. URL https://arxi...
-
[56]
COBRE preprocessed with NIAK 0.12.4
Pierre Bellec. COBRE preprocessed with NIAK 0.12.4. 1 2015. doi: 10.6084/m9.figshare. 1160600.v15. URL https://figshare.com/articles/dataset/COBRE_preprocessed_ with_NIAK_0_12_4/1160600
[57] The ADHD-200 Consortium. A model to advance the translational potential of neuroimaging in clinical neuroscience. http://fcon_1000.projects.nitrc.org/indi/adhd200/, 2011. Accessed: 2025-08-18.
[58] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. In International Conference on Machine Learning, 2017.
[60] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
[61] Daisuke Sasabayashi, Toshiaki Takahashi, Yuki Takayanagi, Kiyotaka Nemoto, Masafumi Ueno, Akira Furuichi, Yasuhiro Higuchi, Yuki Mizukami, Hiroki Kobayashi, Yuki Yuasa, Kiyoto Noguchi, and Michio Suzuki. Resting state hyperconnectivity of the default mode network in schizophrenia and clinical high-risk state for psychosis. Cerebral Cortex, 33(13):8456–8464, 2023. doi: 10.1093/cercor/bhad131.
[63]
Woodward
Hamed Karbasforoushan and Nathan D. Woodward. Resting-state networks in schizophrenia. Current Topics in Medicinal Chemistry, 12(21):2404–2414, 2012
2012
[64] Yarui Wei, Kangkang Xue, Meng Yang, Huan Wang, Jingli Chen, Shaoqiang Han, Xiaoxiao Wang, Hong Li, Yong Zhang, Xueqin Song, et al. Aberrant cerebello-thalamo-cortical functional and effective connectivity in first-episode schizophrenia with auditory verbal hallucinations. Schizophrenia Bulletin, 48(6):1336–1343, 2022.
[65] Bernis Sutcubasi, Baris Metin, Mustafa Kerem Kurban, Zeynep Elcin Metin, Birsu Beser, and Edmund Sonuga-Barke. Resting-state network dysconnectivity in ADHD: A system-neuroscience-based meta-analysis. World Journal of Biological Psychiatry, 21(9):662–672, 2020. doi: 10.1080/15622975.2020.1775889.
[66] Damien A. Fair, Jonathan Posner, Bonnie J. Nagel, Deepti Bathula, Taciana G. Costa Dias, Kathryn L. Mills, Michael S. Blythe, Aishat Giwa, Colleen F. Schmitt, and Joel T. Nigg. Atypical default network connectivity in youth with attention-deficit/hyperactivity disorder. Biological Psychiatry, 68(12):1084–1091, 2010. doi: 10.1016/j.biopsych.2010.07.003.
[67] Ting Chen et al. A simple framework for contrastive learning of visual representations. ICML, 2020.
[68] Kaiming He et al. Masked autoencoders are scalable vision learners. CVPR, 2022.
[69] Randall Balestriero and Yann LeCun. Learning by reconstruction produces uninformative features for perception. arXiv preprint arXiv:2402.11337, 2024.
[70] Jun Chen, Faizan Farooq Khan, Ming Hu, Ammar Sherif, Zongyuan Ge, Boyang Li, and Mohamed Elhoseiny. Local masked reconstruction for efficient self-supervised learning on high-resolution images. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 8046–8056. IEEE, 2025.
[71] Qi Zhang, Yifei Wang, and Yisen Wang. How mask matters: Towards theoretical understandings of masked autoencoders. Advances in Neural Information Processing Systems, 35:27127–27139, 2022.
[72] Vinod Menon. Large-scale brain networks and psychopathology: a unifying triple network model. Trends in Cognitive Sciences, 15(10):483–506, 2011.
[73] Martijn P. van den Heuvel and Alex Fornito. Brain networks in schizophrenia. Neuropsychology Review, 24(1):32–48, 2014.
[74] Albert Gu et al. Efficiently modeling long sequences with structured state spaces. ICLR, 2022.
[75] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. In First Conference on Language Modeling, 2024. URL https://openreview.net/forum?id=tEYskw1VY2.
[76] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.
[77] Huan Jing, Chunguo Zhang, Haohao Yan, Xiaoling Li, Jiaquan Liang, Wenting Liang, Yangpan Ou, Weibin Wu, Huagui Guo, Wen Deng, et al. Deviant spontaneous neural activity as a potential early-response predictor for therapeutic interventions in patients with schizophrenia. Frontiers in Neuroscience, 17:1243168, 2023.
[78] Vyara Zaykova, Sevdalina Kandilarova, Rositsa Paunova, Ferihan Popova, and Drozdstoy Stoyanov. Lateralized brain connectivity in auditory verbal hallucinations: fMRI insights into the superior and middle temporal gyri. Frontiers in Human Neuroscience, 19:1650178, 2025.
[79] Arveen Kaur, Deepak M. Basavanagowda, Bindu Rathod, Nupur Mishra, Sehrish Fuad, Sadia Nosher, Zaid A. Alrashid, Devyani Mohan, and Stacey E. Heindl. Structural and functional alterations of the temporal lobe in schizophrenia: a literature review. Cureus, 12(10), 2020.
[80] Tiantian Liu, Jian Zhang, Xiaonan Dong, Zhucheng Li, Xiaorui Shi, Yizhou Tong, Ruobing Yang, Jinglong Wu, Changming Wang, and Tianyi Yan. Occipital alpha connectivity during resting-state electroencephalography in patients with ultra-high risk for psychosis and schizophrenia. Frontiers in Psychiatry, 10:553, 2019.