pith. machine review for the scientific record.

arxiv: 2605.01240 · v2 · submitted 2026-05-02 · 💻 cs.LG · cs.AI

Recognition: 3 theorem links


Rhamba: Region-Aware Hybrid Attention-Mamba Framework for Self-Supervised Learning in Resting-State fMRI

Carolina Torres-Rojas, Manob Jyoti Saikia, Pankaj Pandey, Pratheek Eranki, Ranganatha Sitaram, Ruthwik Reddy Doodipala

Authors on Pith · no claims yet

Pith reviewed 2026-05-11 00:43 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords self-supervised learning · resting-state fMRI · hybrid Mamba-Attention · region-aware masking · brain disorder classification · ABIDE pretraining · Integrated Gradients

The pith

Region-aware hybrid Attention-Mamba pretraining improves fMRI classification of schizophrenia and ADHD.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces Rhamba, a self-supervised framework that pairs anatomically guided masking strategies with hybrid Attention-Mamba models for resting-state fMRI. It pretrains models on the ABIDE dataset using region-aligned patches and three masking approaches, then fine-tunes them on separate datasets to classify schizophrenia and ADHD. The Mamba-Attention hybrid reaches the highest average AUROC and exceeds prior methods, though gains depend on how masking and architecture interact rather than any single choice. A sympathetic reader would care because this offers a concrete route to learn useful representations from large unlabeled neuroimaging collections without requiring extensive labeled data.

Core claim

Rhamba integrates region-aligned patch embeddings with three masking strategies of increasing spatial specificity and compares four architectural variants during pretraining on ABIDE. After fine-tuning on COBRE and ADHD-200, the Mamba-Attention hybrid encoder-decoder records the highest average AUROC across both tasks and outperforms state-of-the-art baselines. Integrated Gradients analysis identifies contributing brain regions, and results indicate that downstream performance arises from the specific pairing of masking strategy and architecture.

What carries the argument

Region-aligned patch embeddings processed by hybrid Attention-Mamba encoder-decoder blocks under Any, Majority, or Pure masking strategies.
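The paper defines the three masking strategies only by their "increasing spatial specificity." A minimal sketch of one plausible reading, in which a patch's eligibility for masking depends on the fraction of its voxels inside the target region (the thresholds and the `maskable_patches` helper are our assumptions, not the paper's code):

```python
def maskable_patches(patch_region_frac, strategy):
    """Indices of patches eligible for masking, given each patch's
    fraction of voxels inside the target region. The thresholds encode
    one plausible reading of "increasing spatial specificity":
      Any      -> the patch overlaps the region at all,
      Majority -> more than half of the patch lies in the region,
      Pure     -> the patch lies entirely within the region.
    """
    thresholds = {"Any": lambda f: f > 0.0,
                  "Majority": lambda f: f > 0.5,
                  "Pure": lambda f: f >= 1.0}
    keep = thresholds[strategy]
    return [i for i, f in enumerate(patch_region_frac) if keep(f)]

# Four region-aligned patches with decreasing overlap with the region
fracs = [1.0, 0.6, 0.2, 0.0]
```

Under this reading the strategies nest (Pure ⊆ Majority ⊆ Any), which is consistent with the reported reconstruction-loss ordering: the less specific the mask, the more patches it can hit.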

If this is right

  • Masking strategy produces a consistent ordering of reconstruction loss but only modest and dataset-dependent effects on classification accuracy.
  • The Mamba-Attention configuration achieves the highest average AUROC across the two evaluation datasets.
  • Peak performance requires specific combinations of masking strategy and architecture instead of one universally best option.
  • Integrated Gradients reveals the brain regions that drive predictions for each model variant.
  • Rhamba exceeds state-of-the-art methods in the comparative evaluations performed.
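The Integrated Gradients attribution named above has a compact definition that can be checked independently of the paper. A self-contained sketch (the midpoint-rule approximation and the generic `grad_f` callable are our framing, not the paper's implementation):

```python
def integrated_gradients(grad_f, x, baseline, steps=64):
    """Midpoint-rule approximation of Integrated Gradients along the
    straight path from baseline to x:
      IG_i = (x_i - b_i) * (1/steps) * sum_k df/dx_i at b + a_k (x - b)
    """
    n = len(x)
    total = [0.0] * n
    for k in range(steps):
        a = (k + 0.5) / steps
        point = [bi + a * (xi - bi) for xi, bi in zip(x, baseline)]
        g = grad_f(point)
        for i in range(n):
            total[i] += g[i]
    return [(xi - bi) * ti / steps for xi, bi, ti in zip(x, baseline, total)]

# Linear model: IG is exact here, so the completeness axiom
# (attributions sum to f(x) - f(baseline)) holds to float precision.
w = [0.5, -1.0, 2.0]
f = lambda z: sum(wi * zi for wi, zi in zip(w, z))
grad_f = lambda z: w
x, baseline = [1.0, 2.0, 3.0], [0.0, 0.0, 0.0]
ig = integrated_gradients(grad_f, x, baseline)
```

The completeness axiom is what makes region-level attributions interpretable as shares of the model's output; for the paper's deep models the gradients would come from autodiff rather than a closed form.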

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Hybrid models of this form may scale more efficiently than pure attention models when handling long fMRI sequences.
  • The region-aware emphasis could transfer to other neuroimaging modalities or to predicting functional connectivity patterns.
  • Tuning masking specificity per target disorder may become standard practice when applying similar frameworks.

Load-bearing premise

Performance differences in the downstream tasks result from the masking strategies and hybrid architecture choices rather than dataset properties or unstated implementation details.

What would settle it

A replication on the same pretraining and fine-tuning datasets in which a pure attention model or non-region-aware masking matches or exceeds the reported AUROC of the MA hybrid would falsify the advantage of the Rhamba design.

Figures

Figures reproduced from arXiv: 2605.01240 by Carolina Torres-Rojas, Manob Jyoti Saikia, Pankaj Pandey, Pratheek Eranki, Ranganatha Sitaram, Ruthwik Reddy Doodipala.

Figure 1. Overview of the proposed framework. (a) Pre-training pipeline, including ROI-based …
Figure 2. Masking strategy comparison and region-wise architecture performance across datasets.
Figure 3. Reconstruction loss across masking strategies, regions, and architectures.
Figure 4. Interpretation maps generated using the Integrated Gradients (IG) method shown in sagittal …
Original abstract

Self-supervised pretraining is promising for large-scale neuroimaging, yet the impact of region-aware masking and hybrid sequence modeling remains underexplored. In this work, we introduce Rhamba, a region-aware pretraining framework that integrates anatomically guided masking with hybrid Attention-Mamba architectures for resting state functional magnetic resonance imaging (fMRI) analysis. Models were pretrained on the ABIDE dataset using region-aligned patch embeddings and three masking strategies (Any, Majority, and Pure) with increasing spatial specificity. We evaluated four architectural variants: a Mamba only model, an Alternate architecture with interleaved Mamba and Attention blocks, and two hybrid encoder-decoder configurations (Attention-Mamba (AM) and Mamba-Attention (MA)). The pretrained models were fine-tuned on downstream classification tasks using the COBRE and ADHD-200 datasets for schizophrenia and attention-deficit/hyperactivity disorder discrimination. We employed Integrated Gradients, an explainable AI method, to identify the brain regions contributing to model predictions. Masking strategy strongly influenced reconstruction behavior, with reconstruction loss following a consistent ordering (Any > Majority > Pure). However, this trend did not directly translate into downstream performance, where differences were modest and dataset-dependent. The hybrid architecture with the MA configuration achieved the highest average AUROC across both datasets, and Rhamba outperformed state-of-the-art methods in comparative evaluation. Region-wise analysis showed that peak performance depends on the interaction between masking strategy and architecture rather than a single dominant configuration. Overall, Rhamba offers a flexible framework for balancing interpretability, scalability, and performance in large-scale fMRI representation learning.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper introduces Rhamba, a region-aware self-supervised pretraining framework for resting-state fMRI that combines anatomically guided masking strategies (Any, Majority, Pure) with hybrid Attention-Mamba sequence models. Models are pretrained on ABIDE using region-aligned patch embeddings, then fine-tuned for binary classification on COBRE (schizophrenia) and ADHD-200 datasets. Four architectures are compared (Mamba-only, Alternate, AM, MA), with the MA hybrid reported to achieve the highest average AUROC; the framework is claimed to outperform prior SOTA methods, and Integrated Gradients is used to highlight contributing brain regions. The abstract notes that masking trends in reconstruction loss do not directly translate to downstream performance, which is described as modest and dataset-dependent.

Significance. If the reported AUROC gains and outperformance hold under rigorous statistical controls, the work would offer a practical, scalable alternative to pure transformer or Mamba baselines for fMRI representation learning, with built-in region-level interpretability. The hybrid design and masking ablation could inform efficient long-sequence modeling in neuroimaging, where data efficiency and anatomical priors matter.

major comments (2)
  1. [Results / Comparative evaluation] Results section (AUROC tables and comparative evaluation): The central claim that the MA hybrid yields the highest average AUROC across COBRE and ADHD-200 and that Rhamba outperforms SOTA rests on modest, dataset-dependent differences without reported standard deviations, multiple random seeds, or statistical significance tests (e.g., McNemar or paired t-tests). This directly undermines the superiority assertion, as the abstract itself qualifies the downstream differences as modest.
  2. [Experiments / Downstream evaluation] Experimental protocol (fine-tuning and evaluation subsections): No details are provided on hyperparameter search ranges, data-split stratification, or whether the same random seeds were used across the four architectures and three masking strategies. Without these controls, observed orderings could arise from implementation variance rather than the region-aware masking plus hybrid design.
minor comments (2)
  1. [Methods / Architecture] Clarify the exact definition of the MA versus AM encoder-decoder configurations (e.g., which blocks are in the encoder versus decoder) and include a diagram or pseudocode for the hybrid stacking.
  2. [Pretraining results] The reconstruction-loss ordering (Any > Majority > Pure) is stated but not quantified with numerical values or linked to a specific figure or table; add these values for reproducibility.
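Pending the diagram or pseudocode the referee requests, one plausible reading of the four architectural variants can be written down as an encoder/decoder layer layout (the depths, the interleaving order, and the AM/MA assignment below are illustrative assumptions, not the paper's published configuration):

```python
def hybrid_layout(variant, enc_depth=4, dec_depth=4):
    """Block layout for the four pretraining variants, as we read them."""
    if variant == "Mamba":          # Mamba-only: SSM blocks throughout
        enc, dec = ["mamba"] * enc_depth, ["mamba"] * dec_depth
    elif variant == "Alternate":    # interleaved Mamba and Attention blocks
        enc = ["mamba" if i % 2 == 0 else "attn" for i in range(enc_depth)]
        dec = ["mamba" if i % 2 == 0 else "attn" for i in range(dec_depth)]
    elif variant == "AM":           # Attention encoder, Mamba decoder
        enc, dec = ["attn"] * enc_depth, ["mamba"] * dec_depth
    elif variant == "MA":           # Mamba encoder, Attention decoder
        enc, dec = ["mamba"] * enc_depth, ["attn"] * dec_depth
    else:
        raise ValueError(f"unknown variant: {variant}")
    return {"encoder": enc, "decoder": dec}
```

Whether AM/MA name the encoder side or the block order within each side is exactly the ambiguity the minor comment flags; the sketch fixes one interpretation so the comparison is at least well posed.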

Simulated Author's Rebuttal

2 responses · 0 unresolved

Thank you for the constructive feedback on our manuscript. We appreciate the referee's careful reading and address the major comments point by point below. We will revise the manuscript to incorporate additional statistical rigor and experimental details where feasible.

Point-by-point responses
  1. Referee: [Results / Comparative evaluation] Results section (AUROC tables and comparative evaluation): The central claim that the MA hybrid yields the highest average AUROC across COBRE and ADHD-200 and that Rhamba outperforms SOTA rests on modest, dataset-dependent differences without reported standard deviations, multiple random seeds, or statistical significance tests (e.g., McNemar or paired t-tests). This directly undermines the superiority assertion, as the abstract itself qualifies the downstream differences as modest.

    Authors: We agree that the lack of standard deviations, multiple random seeds, and formal statistical tests weakens the comparative claims. The abstract correctly qualifies the differences as modest and dataset-dependent, and we do not claim large effect sizes. In the revised manuscript, we will report AUROC values with standard deviations computed over multiple random seeds and include paired statistical tests (e.g., paired t-tests or McNemar's test) to assess significance of the observed orderings. This will provide a more rigorous basis for the reported trends without overstating the results. revision: yes

  2. Referee: [Experiments / Downstream evaluation] Experimental protocol (fine-tuning and evaluation subsections): No details are provided on hyperparameter search ranges, data-split stratification, or whether the same random seeds were used across the four architectures and three masking strategies. Without these controls, observed orderings could arise from implementation variance rather than the region-aware masking plus hybrid design.

    Authors: We acknowledge that insufficient detail on the experimental controls limits reproducibility and the ability to rule out implementation variance. In the revised version, we will expand the experimental protocol and fine-tuning subsections to specify the hyperparameter search ranges (including learning rate, batch size, and optimizer settings), the data-split stratification approach (e.g., by site or diagnostic label to preserve class balance), and confirmation that identical random seeds were used across all architecture-masking combinations for fair comparison. revision: yes
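The seed-matched comparison the authors promise reduces to a paired test on per-seed AUROC scores. A minimal sketch (the scores below are synthetic placeholders, not the paper's numbers):

```python
import math
from statistics import mean, stdev

def paired_t(a, b):
    """Paired t-statistic for per-seed scores of two model variants
    evaluated with identical seeds and data splits."""
    d = [x - y for x, y in zip(a, b)]
    t = mean(d) / (stdev(d) / math.sqrt(len(d)))
    return t, len(d) - 1  # statistic and degrees of freedom

# Synthetic per-seed AUROCs (placeholders, not the paper's results)
ma = [0.82, 0.80, 0.83, 0.81, 0.82]   # Mamba-Attention hybrid
am = [0.80, 0.79, 0.80, 0.79, 0.80]   # Attention-Mamba hybrid
t, df = paired_t(ma, am)
```

Pairing by seed removes the between-seed variance that would otherwise swamp the modest differences the abstract describes; with real scores one would report the t-statistic against the t-distribution with `df` degrees of freedom (e.g. via `scipy.stats.ttest_rel`).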

Circularity Check

0 steps flagged

No circularity: standard empirical self-supervised pipeline

full rationale

The paper describes an empirical self-supervised pretraining framework (region-aware masking on ABIDE followed by fine-tuning on COBRE/ADHD-200) using standard hybrid Attention-Mamba architectures and Integrated Gradients for post-hoc explanation. No equations, first-principles derivations, or predictions are presented that reduce by construction to fitted parameters, self-definitions, or self-citation chains. All performance claims rest on experimental comparisons against external benchmarks rather than on any load-bearing mathematical step that imports its own inputs, and the methodology does not invoke uniqueness theorems or ansatzes from the authors' prior work.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no explicit free parameters, axioms, or invented entities; the framework implicitly assumes that region-aligned patch embeddings and the listed masking strategies capture meaningful anatomical structure in fMRI.

pith-pipeline@v0.9.0 · 5622 in / 1081 out tokens · 62524 ms · 2026-05-11T00:43:14.360895+00:00 · methodology

discussion (0)


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: the paper's claim is directly supported by a theorem in the formal canon.
  • supports: the theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: the paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: the paper appears to rely on the theorem as machinery.
  • contradicts: the paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

87 extracted references · 20 canonical work pages · 3 internal anchors

  1. [1]

    Kay, and David W

    Seiji Ogawa, Tso-Ming Lee, Alan R. Kay, and David W. Tank. Brain magnetic resonance imaging with contrast dependent on blood oxygenation.Proceedings of the National Academy of Sciences, 87(24):9868–9872, 1990

  2. [2]

    Functional mapping of the human visual cortex by magnetic resonance imaging.Science, 254(5032):716–719, 1991

    Jack W Belliveau, David N Kennedy, Robert C McKinstry, Bradley R Buchbinder, Robert M Weisskoff, Mark S Cohen, JM Vevea, Thomas J Brady, and Bruce R Rosen. Functional mapping of the human visual cortex by magnetic resonance imaging.Science, 254(5032):716–719, 1991

  3. [3]

    Functional connectivity in the motor cortex of resting human brain using echo-planar mri.Magnetic resonance in medicine, 34(4):537–541, 1995

    Bharat Biswal, F Zerrin Yetkin, Victor M Haughton, and James S Hyde. Functional connectivity in the motor cortex of resting human brain using echo-planar mri.Magnetic resonance in medicine, 34(4):537–541, 1995

  4. [4]

    Consistent resting-state networks across healthy subjects.Proceedings of the national academy of sciences, 103(37):13848–13853, 2006

    Jessica S Damoiseaux, Serge ARB Rombouts, Frederik Barkhof, Philip Scheltens, Cornelis J Stam, Stephen M Smith, and Christian F Beckmann. Consistent resting-state networks across healthy subjects.Proceedings of the national academy of sciences, 103(37):13848–13853, 2006

  5. [5]

    Resting state fmri: a personal history.Neuroimage, 62(2):938–944, 2012

    Bharat B Biswal. Resting state fmri: a personal history.Neuroimage, 62(2):938–944, 2012

  6. [6]

    The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism.Molecular psychiatry, 19(6):659–667, 2014

    Adriana Di Martino, Chao-Gan Yan, Qingyang Li, Erin Denio, Francisco X Castellanos, Kaat Alaerts, Jeffrey S Anderson, Michal Assaf, Susan Y Bookheimer, Mirella Dapretto, et al. The autism brain imaging data exchange: towards a large-scale evaluation of the intrinsic brain architecture in autism.Molecular psychiatry, 19(6):659–667, 2014

  7. [7]

    The adhd-200 consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience.Frontiers in systems neuroscience, 6:62, 2012

    ADHD-200 consortium. The adhd-200 consortium: a model to advance the translational potential of neuroimaging in clinical neuroscience.Frontiers in systems neuroscience, 6:62, 2012

  8. [8]

    Common neural patterns of substance use disorder: a seed-based resting-state functional connectivity meta-analysis.Translational Psychiatry, 15(1):190, 2025

    Xiaonan Zhang, Haoyu Zhang, Yingbo Shao, Yang Li, Feifei Zhang, and Hui Zhang. Common neural patterns of substance use disorder: a seed-based resting-state functional connectivity meta-analysis.Translational Psychiatry, 15(1):190, 2025. 18

  9. [9]

    Kevin Hilbert, Joscha Böhnlein, Charlotte Meinke, Alice V Chavanne, Till Langhammer, Lara Stumpe, Nils Winter, Ramona Leenings, Dirk Adolph, V olker Arolt, et al. Lack of evidence for predictive utility from resting state fmri data for individual exposure-based cognitive behavioral therapy outcomes: A machine learning study in two large multi-site samples...

  10. [10]

    The history and future of resting-state functional magnetic resonance imaging.Nature, 641(8065):1121–1131, 2025

    Bharat B Biswal and Lucina Q Uddin. The history and future of resting-state functional magnetic resonance imaging.Nature, 641(8065):1121–1131, 2025

  11. [11]

    Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls.Neuroimage, 145:137–165, 2017

    Mohammad R Arbabshirani, Sergey Plis, Jing Sui, and Vince D Calhoun. Single subject prediction of brain disorders in neuroimaging: Promises and pitfalls.Neuroimage, 145:137–165, 2017

  12. [12]

    Cross-validation failure: Small sample sizes lead to large error bars.Neuroim- age, 180:68–77, 2018

    Gaël Varoquaux. Cross-validation failure: Small sample sizes lead to large error bars.Neuroim- age, 180:68–77, 2018

  13. [13]

    Resting state fmri functional connectivity-based classification using a convolutional neural network architecture.Frontiers in neuroinformatics, 11:61, 2017

    Regina J Meszlényi, Krisztian Buza, and Zoltán Vidnyánszky. Resting state fmri functional connectivity-based classification using a convolutional neural network architecture.Frontiers in neuroinformatics, 11:61, 2017

  14. [14]

    3d-cnn based discrimination of schizophrenia using resting-state fmri.Artificial intelligence in medicine, 98:10–17, 2019

    Muhammad Naveed Iqbal Qureshi, Jooyoung Oh, and Boreom Lee. 3d-cnn based discrimination of schizophrenia using resting-state fmri.Artificial intelligence in medicine, 98:10–17, 2019

  15. [15]

    The use of fmri regional analysis to automatically detect adhd through a 3d cnn-based approach.Journal of Imaging Informatics in Medicine, 38 (1):203–216, 2025

    Perihan Gül¸ sah Gülhan and Güzin Özmen. The use of fmri regional analysis to automatically detect adhd through a 3d cnn-based approach.Journal of Imaging Informatics in Medicine, 38 (1):203–216, 2025

  16. [16]

    Identifying autism from resting-state fmri using long short-term memory networks

    Nicha C Dvornek, Pamela Ventola, Kevin A Pelphrey, and James S Duncan. Identifying autism from resting-state fmri using long short-term memory networks. Ininternational workshop on machine learning in medical imaging, pages 362–370. Springer, 2017

  17. [17]

    Characterization of early stage parkinson’s disease from resting-state fmri data using a long short-term memory network.Frontiers in Neuroimaging, 1:952084, 2022

    Xueqi Guo, Sule Tinaz, and Nicha C Dvornek. Characterization of early stage parkinson’s disease from resting-state fmri data using a long short-term memory network.Frontiers in Neuroimaging, 1:952084, 2022

  18. [18]

    A novel graph neural network framework for resting-state functional mri spatiotemporal dynamics analysis.Physica A: Statistical Mechanics and its Applications, 669: 130582, 2025

    Tao Wang, Zenghui Ding, Zheng Chang, Xianjun Yang, Yanyan Chen, Meng Li, Shu Xu, and Yu Wang. A novel graph neural network framework for resting-state functional mri spatiotemporal dynamics analysis.Physica A: Statistical Mechanics and its Applications, 669: 130582, 2025

  19. [19]

    Classification of brain disorders in rs-fmri via local-to-global graph neural networks.IEEE transactions on medical imaging, 42(2):444–455, 2022

    Hao Zhang, Ran Song, Liping Wang, Lin Zhang, Dawei Wang, Cong Wang, and Wei Zhang. Classification of brain disorders in rs-fmri via local-to-global graph neural networks.IEEE transactions on medical imaging, 42(2):444–455, 2022

  20. [20]

    Representation learning of resting state fmri with variational autoencoder.NeuroImage, 241: 118423, 2021

    Jung-Hoon Kim, Yizhen Zhang, Kuan Han, Zheyu Wen, Minkyu Choi, and Zhongming Liu. Representation learning of resting state fmri with variational autoencoder.NeuroImage, 241: 118423, 2021

  21. [21]

    Classification of mdd using a transformer classifier with large-scale multisite resting-state fmri data.Human brain mapping, 45(1):e26542, 2024

    Peishan Dai, Ying Zhou, Yun Shi, Da Lu, Zailiang Chen, Beiji Zou, Kun Liu, Shenghui Liao, and REST meta MDD Consortium. Classification of mdd using a transformer classifier with large-scale multisite resting-state fmri data.Human brain mapping, 45(1):e26542, 2024

  22. [22]

    Predicting task-related brain activity from resting-state brain dynamics with fmri transformer

    Junbeom Kwon, Jungwoo Seo, Heehwan Wang, Taesup Moon, Shinjae Yoo, and Jiook Cha. Predicting task-related brain activity from resting-state brain dynamics with fmri transformer. Imaging Neuroscience, 3:imag_a_00440, 2025

  23. [23]

    Current challenges in translational and clinical fmri and future directions

    Karsten Specht. Current challenges in translational and clinical fmri and future directions. Frontiers in psychiatry, 10:924, 2020

  24. [24]

    On the generalizability of resting-state fmri machine learning classifiers.Frontiers in human neuroscience, 8:502, 2014

    Wolfgang Huf, Klaudius Kalcher, Roland N Boubela, Georg Rath, Andreas Vecsei, Peter Filzmoser, and Ewald Moser. On the generalizability of resting-state fmri machine learning classifiers.Frontiers in human neuroscience, 8:502, 2014. 19

  25. [25]

    Reproducible brain-wide association studies require thousands of individuals

    Scott Marek, Brenden Tervo-Clemmens, Finnegan J Calabro, David F Montez, Benjamin P Kay, Alexander S Hatoum, Meghan Rose Donohue, William Foran, Ryland L Miller, Timothy J Hendrickson, et al. Reproducible brain-wide association studies require thousands of individuals. Nature, 603(7902):654–660, 2022

  26. [26]

    SwiFT: Swin 4d fMRI transformer

    Peter Yongho Kim, Junbeom Kwon, Sunghwan Joo, Sangyoon Bae, Donggyu Lee, Yoonho Jung, Shinjae Yoo, Jiook Cha, and Taesup Moon. SwiFT: Swin 4d fMRI transformer. In Thirty-seventh Conference on Neural Information Processing Systems, 2023. URL https: //openreview.net/forum?id=dKeWh6EzBB

  27. [27]

    Brain-jepa: Brain dynamics foundation model with gradient positioning and spatiotemporal masking.arXiv preprint arXiv:2409.19407, 2024

    Zijian Dong, Ruilin Li, Yilei Wu, Thuan Tinh Nguyen, Joanna Su Xian Chong, Fang Ji, Nathanael Ren Jie Tong, Christopher Li Hsian Chen, and Juan Helen Zhou. Brain-jepa: Brain dynamics foundation model with gradient positioning and spatiotemporal masking.arXiv preprint arXiv:2409.19407, 2024. URLhttps://arxiv.org/abs/2409.19407

  28. [28]

    Unsupervised contrastive graph learning for resting-state functional mri analysis and brain disorder detection.Human brain mapping, 44(17):5672–5692, 2023

    Xiaochuan Wang, Ying Chu, Qianqian Wang, Liang Cao, Lishan Qiao, Limei Zhang, and Mingxia Liu. Unsupervised contrastive graph learning for resting-state functional mri analysis and brain disorder detection.Human brain mapping, 44(17):5672–5692, 2023

  29. [29]

    Self-supervised graph contrastive learning with diffusion augmentation for functional mri analysis and brain disorder detection.Medical image analysis, 101:103403, 2025

    Xiaochuan Wang, Yuqi Fang, Qianqian Wang, Pew-Thian Yap, Hongtu Zhu, and Mingxia Liu. Self-supervised graph contrastive learning with diffusion augmentation for functional mri analysis and brain disorder detection.Medical image analysis, 101:103403, 2025

  30. [30]

    3d masked autoencoder with spatiotemporal transformer for modeling of 4d fmri data.Medical Image Analysis, page 103861, 2025

    Jie Gao, Bao Ge, Ning Qiang, and Shijie Zhao. 3d masked autoencoder with spatiotemporal transformer for modeling of 4d fmri data.Medical Image Analysis, page 103861, 2025

  31. [31]

    Deep feature extraction for resting-state functional mri by self-supervised learning and application to schizophrenia diagnosis.Frontiers in neuroscience, 15:696853, 2021

    Yuki Hashimoto, Yousuke Ogata, Manabu Honda, and Yuichi Yamashita. Deep feature extraction for resting-state functional mri by self-supervised learning and application to schizophrenia diagnosis.Frontiers in neuroscience, 15:696853, 2021

  32. [32]

    Computing personalized brain functional networks from fmri using self-supervised deep learning.Medical Image Analysis, 85:102756, 2023

    Hongming Li, Dhivya Srinivasan, Chuanjun Zhuo, Zaixu Cui, Raquel E Gur, Ruben C Gur, Desmond J Oathes, Christos Davatzikos, Theodore D Satterthwaite, and Yong Fan. Computing personalized brain functional networks from fmri using self-supervised deep learning.Medical Image Analysis, 85:102756, 2023

  33. [33]

    Whole milc: generalizing learned dynamics across tasks, datasets, and populations

    Usman Mahmood, Md Mahfuzur Rahman, Alex Fedorov, Noah Lewis, Zening Fu, Vince D Calhoun, and Sergey M Plis. Whole milc: generalizing learned dynamics across tasks, datasets, and populations. InInternational Conference on Medical Image Computing and Computer- Assisted Intervention, pages 407–417. Springer, 2020

  34. [34]

    Detecting cognitive fatigue in subjects with traumatic brain injury from fmri scans using self-supervised learning

    Ashish Jaiswal, Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Glenn Wylie, and Fillia Make- don. Detecting cognitive fatigue in subjects with traumatic brain injury from fmri scans using self-supervised learning. InProceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments, pages 83–90, 2023

  35. [35]

    Graph self-supervised learning with application to brain networks analysis.IEEE Journal of Biomedical and Health Informatics, 27(8):4154–4165, 2023

    Guangqi Wen, Peng Cao, Lingwen Liu, Jinzhu Yang, Xizhe Zhang, Fei Wang, and Osmar R Zaiane. Graph self-supervised learning with application to brain networks analysis.IEEE Journal of Biomedical and Health Informatics, 27(8):4154–4165, 2023

  36. [36]

    Graph convolutional network with self-supervised learning for brain disease classification.IEEE/ACM Transactions on Computational Biology and Bioinformatics, 21(6):1830–1841, 2024

    Guangyu Wang, Ying Chu, Qianqian Wang, Limei Zhang, Lishan Qiao, and Mingxia Liu. Graph convolutional network with self-supervised learning for brain disease classification.IEEE/ACM Transactions on Computational Biology and Bioinformatics, 21(6):1830–1841, 2024

  37. [37]

    Swin transformer: Hierarchical vision transformer using shifted windows

    Ze Liu, Yutong Lin, Yue Cao, Han Hu, Yixuan Wei, Zheng Zhang, Stephen Lin, and Baining Guo. Swin transformer: Hierarchical vision transformer using shifted windows. InProceedings of the IEEE/CVF international conference on computer vision, pages 10012–10022, 2021

  38. [38]

    Self-supervised transformer- based foundation model for functional magnetic resonance imaging

    Matteo Ferrante, Stefano Iervese, Laura Astolfi, and Nicola Toschi. Self-supervised transformer- based foundation model for functional magnetic resonance imaging. In2025 47th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC), pages 1–6. IEEE, 2025. 20

  39. [39]

    Causal fmri-mamba: Causal state space model for neural decoding and brain task states recognition

    Weihao Deng, Fei Han, Qinghua Ling, Qing Liu, and Henry Han. Causal fmri-mamba: Causal state space model for neural decoding and brain task states recognition. InICASSP 2025-2025 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 1–5. IEEE, 2025

  40. [40]

    State-space model for brain network analysis on rs-fmri

    Brain Network Mamba and A Bi-Directional. State-space model for brain network analysis on rs-fmri. InMachine Learning in Medical Imaging: 16th International Workshop, MLMI 2025, Held in Conjunction with MICCAI 2025, Daejeon, South Korea, September 23, 2025, Proceedings, page 224. Springer Nature, 2026

  41. [41]

    Towards a general-purpose foundation model for functional MRI analysis

    Cheng Wang, Yu Jiang, Zhihao Peng, Chenxin Li, Chang-bae Bang, Lin Zhao, Wanyi Fu, Jinglei Lv, Jorge Sepulcre, Carl Yang, Lifang He, Tianming Liu, Xue-Jun Kong, Quanzheng Li, Daniel S. Barron, Anqi Qiu, Randy Hirschtick, Byung-Hoon Kim, Hongbin Han, Xiang Li, and Yixuan Yuan. Towards a general-purpose foundation model for functional mri analysis.Nature Bi...

  42. [42]

    Jamba: A Hybrid Transformer-Mamba Language Model

    Opher Lieber, Barak Lenz, Hofit Bata, Gal Cohen, Jhonathan Osin, Itay Dalmedigos, Erez Safahi, Shaked Meirom, Yonatan Belinkov, Shai Shalev-Shwartz, Omri Abend, Raz Alon, Tomer Asida, Amir Bergman, Roman Glozman, Michael Gokhman, Avashalom Manevich, Nir Ratner, Noam Rozen, Erez Shwartz, Mor Zusman, and Yoav Shoham. Jamba: A hybrid transformer-mamba langua...

  43. [43]

    Transmamba: Flexibly switching between transformer and mamba

    Yixing Li, Ruobing Xie, Zhen Yang, Xingwu Sun, Shuaipeng Li, Weidong Han, Zhanhui Kang, Yu Cheng, Chengzhong Xu, Di Wang, and Jie Jiang. Transmamba: A sequence-level hybrid transformer-mamba language model.arXiv preprint arXiv:2503.24067, 2026. URL https://arxiv.org/abs/2503.24067

  44. [44]

    Can mamba learn how to learn? a comparative study on in-context learning tasks,

    Jongho Park, Jaeseung Park, Zheyang Xiong, Nayoung Lee, Jaewoong Cho, Samet Oymak, Kangwook Lee, and Dimitris Papailiopoulos. Can mamba learn how to learn? a comparative study on in-context learning tasks.arXiv preprint arXiv:2402.04248, 2024. URL https: //arxiv.org/abs/2402.04248

  45. [45]

    Waleffe, W

    Roger Waleffe, Wonmin Byeon, Duncan Riach, Brandon Norick, Vijay Korthikanti, Tri Dao, Albert Gu, Ali Hatamizadeh, Sudhakar Singh, Deepak Narayanan, Garvit Kulshreshtha, Vartika Singh, Jared Casper, Jan Kautz, Mohammad Shoeybi, and Bryan Catanzaro. An empirical study of mamba-based language models.arXiv preprint arXiv:2406.07887, 2024. URL https://arxiv.o...

  46. [46]

    Research on autism diagnosis method based on transformer and mamba

    Le Zhao and Yanli Zhang. Research on autism diagnosis method based on transformer and mamba. In2025 6th International Conference on Machine Learning and Computer Application (ICMLCA), pages 1190–1193. IEEE, 2025

  47. [47]

    Brainmt: A hybrid mamba- transformer architecture for modeling long-range dependencies in functional mri data

    Arunkumar Kannan, Martin A Lindquist, and Brian Caffo. Brainmt: A hybrid mamba- transformer architecture for modeling long-range dependencies in functional mri data. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 150–160. Springer, 2025

  48. [48] Yifan Yang, Yutong Mao, Xufu Liu, and Xiao Liu. BrainMAE: A region-aware self-supervised learning framework for brain signals. arXiv preprint arXiv:2406.17086, 2024

  49. [49] Ruthwik Reddy Doodipala, Pankaj Pandey, Carolina Torres Rojas, Manob Jyoti Saikia, and Ranganatha Sitaram. Region-aware reconstruction strategy for pre-training fMRI foundation model. arXiv preprint arXiv:2511.00443, 2025

  50. [50] Lyle J Palmer. UK Biobank: bank on it. The Lancet, 369(9578):1980–1982, 2007. ISSN 0140-6736. doi: 10.1016/S0140-6736(07)60924-6. URL https://www.sciencedirect.com/science/article/pii/S0140673607609246

  51. [51] B.J. Casey, Tariq Cannonier, May I. Conley, Alexandra O. Cohen, Deanna M. Barch, Mary M. Heitzeg, Mary E. Soules, Theresa Teslovich, Danielle V. Dellarco, Hugh Garavan, Catherine A. Orr, Tor D. Wager, Marie T. Banich, Nicole K. Speer, Matthew T. Sutherland, Michael C. Riedel, Anthony S. Dick, James M. Bjork, Kathleen M. Thomas, Bader Chaarani, Margie ...

  52. [52] David C. Van Essen, Stephen M. Smith, Deanna M. Barch, Timothy E.J. Behrens, Essa Yacoub, and Kamil Ugurbil. The WU-Minn Human Connectome Project: An overview. NeuroImage, 80:62–79, 2013. ISSN 1053-8119. doi: 10.1016/j.neuroimage.2013.05.041. URL https://www.sciencedirect.com/science/article/pii/S1053811913005351. Mapping the Connectome

  53. [53] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. arXiv preprint arXiv:2312.00752, 2023. URL https://arxiv.org/abs/2312.00752

  55. [55] Shubhi Bansal, Sreeharish A, Madhava Prasath J, Manikandan M, Sreekanth Madisetty, Mohammad Zia Ur Rehman, Chandravardhan Singh Raghaw, Gaurav Duggal, and Nagendra Kumar. A comprehensive survey of mamba architectures for medical image analysis: Classification, segmentation, restoration and beyond. arXiv preprint arXiv:2410.02362, 2025. URL https://arxiv.org/abs/2410.02362

  56. [56] Pierre Bellec. COBRE preprocessed with NIAK 0.12.4. January 2015. doi: 10.6084/m9.figshare.1160600.v15. URL https://figshare.com/articles/dataset/COBRE_preprocessed_with_NIAK_0_12_4/1160600

  57. [57] The ADHD-200 Consortium. A model to advance the translational potential of neuroimaging in clinical neuroscience. http://fcon_1000.projects.nitrc.org/indi/adhd200/, 2011. Accessed: 2025-08-18

  58. [58] Mukund Sundararajan, Ankur Taly, and Qiqi Yan. Axiomatic attribution for deep networks. arXiv preprint arXiv:1703.01365, 2017. URL https://arxiv.org/abs/1703.01365

  60. [60] Diederik P Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014

  61. [61] Daisuke Sasabayashi, Toshiaki Takahashi, Yuki Takayanagi, Kiyotaka Nemoto, Masafumi Ueno, Akira Furuichi, Yasuhiro Higuchi, Yuki Mizukami, Hiroki Kobayashi, Yuki Yuasa, Kiyoto Noguchi, and Michio Suzuki. Resting state hyperconnectivity of the default mode network in schizophrenia and clinical high-risk state for psychosis. Cerebral Cortex, 33(13):8456–8464, 2023. doi: 10.1093/cercor/bhad131

  63. [63] Hamed Karbasforoushan and Nathan D. Woodward. Resting-state networks in schizophrenia. Current Topics in Medicinal Chemistry, 12(21):2404–2414, 2012

  64. [64] Yarui Wei, Kangkang Xue, Meng Yang, Huan Wang, Jingli Chen, Shaoqiang Han, Xiaoxiao Wang, Hong Li, Yong Zhang, Xueqin Song, et al. Aberrant cerebello-thalamo-cortical functional and effective connectivity in first-episode schizophrenia with auditory verbal hallucinations. Schizophrenia Bulletin, 48(6):1336–1343, 2022

  65. [65] Bernis Sutcubasi, Baris Metin, Mustafa Kerem Kurban, Zeynep Elcin Metin, Birsu Beser, and Edmund Sonuga-Barke. Resting-state network dysconnectivity in ADHD: A system-neuroscience-based meta-analysis. World Journal of Biological Psychiatry, 21(9):662–672, 2020. doi: 10.1080/15622975.2020.1775889

  66. [66] Damien A. Fair, Jonathan Posner, Bonnie J. Nagel, Deepti Bathula, Taciana G. Costa Dias, Kathryn L. Mills, Michael S. Blythe, Aishat Giwa, Colleen F. Schmitt, and Joel T. Nigg. Atypical default network connectivity in youth with attention-deficit/hyperactivity disorder. Biological Psychiatry, 68(12):1084–1091, 2010. doi: 10.1016/j.biopsych.2010.07.003

  67. [67] Ting Chen et al. A simple framework for contrastive learning of visual representations. ICML, 2020

  68. [68] Kaiming He et al. Masked autoencoders are scalable vision learners. CVPR, 2022

  69. [69] Randall Balestriero and Yann LeCun. Learning by reconstruction produces uninformative features for perception. arXiv preprint arXiv:2402.11337, 2024

  70. [70] Jun Chen, Faizan Farooq Khan, Ming Hu, Ammar Sherif, Zongyuan Ge, Boyang Li, and Mohamed Elhoseiny. Local masked reconstruction for efficient self-supervised learning on high-resolution images. In 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), pages 8046–8056. IEEE, 2025

  71. [71] Qi Zhang, Yifei Wang, and Yisen Wang. How mask matters: Towards theoretical understandings of masked autoencoders. Advances in Neural Information Processing Systems, 35:27127–27139, 2022

  72. [72] Vinod Menon. Large-scale brain networks and psychopathology: a unifying triple network model. Trends in Cognitive Sciences, 15(10):483–506, 2011

  73. [73] Martijn P Van Den Heuvel and Alex Fornito. Brain networks in schizophrenia. Neuropsychology Review, 24(1):32–48, 2014

  74. [74] Albert Gu et al. Efficiently modeling long sequences with structured state spaces. ICLR, 2022

  75. [75] Albert Gu and Tri Dao. Mamba: Linear-time sequence modeling with selective state spaces. In First Conference on Language Modeling, 2024. URL https://openreview.net/forum?id=tEYskw1VY2

  76. [76] Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017

  77. [77] Huan Jing, Chunguo Zhang, Haohao Yan, Xiaoling Li, Jiaquan Liang, Wenting Liang, Yangpan Ou, Weibin Wu, Huagui Guo, Wen Deng, et al. Deviant spontaneous neural activity as a potential early-response predictor for therapeutic interventions in patients with schizophrenia. Frontiers in Neuroscience, 17:1243168, 2023

  78. [78] Vyara Zaykova, Sevdalina Kandilarova, Rositsa Paunova, Ferihan Popova, and Drozdstoy Stoyanov. Lateralized brain connectivity in auditory verbal hallucinations: fMRI insights into the superior and middle temporal gyri. Frontiers in Human Neuroscience, 19:1650178, 2025

  79. [79] Arveen Kaur, Deepak M Basavanagowda, Bindu Rathod, Nupur Mishra, Sehrish Fuad, Sadia Nosher, Zaid A Alrashid, Devyani Mohan, and Stacey E Heindl. Structural and functional alterations of the temporal lobe in schizophrenia: a literature review. Cureus, 12(10), 2020

  80. [80] Tiantian Liu, Jian Zhang, Xiaonan Dong, Zhucheng Li, Xiaorui Shi, Yizhou Tong, Ruobing Yang, Jinglong Wu, Changming Wang, and Tianyi Yan. Occipital alpha connectivity during resting-state electroencephalography in patients with ultra-high risk for psychosis and schizophrenia. Frontiers in Psychiatry, 10:553, 2019
