Learning Robust and Task-Invariant Functional Representation from fMRI through Siamese Self-Supervised Learning
Pith reviewed 2026-06-29 13:29 UTC · model grok-4.3
The pith
Self-supervised learning from positive-only fMRI pairs produces task-general representations that beat supervised baselines.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
BrainSimSiam leverages positive-only data pairs to learn robust and generalizable features from fMRI, achieving strong performance across multiple downstream classification and regression tasks, outperforming fully supervised baselines and approaching the performance of large-scale models.
What carries the argument
BrainSimSiam, a Siamese self-supervised network that learns task-invariant representations solely from positive pairs without negative samples or labels.
If this is right
- Representations can be fine-tuned for new classification tasks on psychiatric conditions with only small labeled sets.
- Regression on continuous brain-function measures becomes feasible without task-specific pretraining.
- Research groups with limited compute can reach performance levels previously requiring large-scale pretraining.
- The same positive-pair approach may reduce reliance on combining multiple datasets for foundation-model training.
Where Pith is reading between the lines
- The method could extend to other high-dimensional time-series signals such as EEG if the positive-pair structure transfers.
- If the representations prove scanner-invariant, they might enable pooling across sites without explicit harmonization steps.
- Testing whether the features remain stable when the positive pairs come from different acquisition protocols would clarify robustness limits.
Load-bearing premise
Positive-only pairs drawn from fMRI scans contain sufficient structure to learn task-invariant features that generalize without being dominated by noise, scanner effects, or dataset-specific artifacts.
What would settle it
Train BrainSimSiam on one fMRI collection and evaluate on a completely held-out multi-site dataset for a new psychiatric condition; failure to exceed a supervised model trained on the target data would falsify the generalization claim.
Figures
read the original abstract
Functional magnetic resonance imaging (fMRI) is a powerful tool for investigating human brain function. However, the high cost of data acquisition and the inherent subjectivity of psychiatric rating scales often lead to datasets with small sample sizes and variable label quality, especially when targeting a specific neurological condition. Combined with the inherently high dimensionality of fMRI data, these limitations substantially increase the risk of model overfitting. Recent years have seen growing interest in developing fMRI foundation models by combining multiple datasets; however, the computational resources needed for pretraining and fine-tuning are often prohibitive. We show that a lightweight self-supervised framework yields representations that generalize across diverse downstream tasks, outperforming fully supervised baselines and approaching the performance of large-scale models. We introduce BrainSimSiam, a data-efficient self-supervised representation learning framework that leverages positive-only data pairs to learn robust and generalizable features. We demonstrate that the learned representations achieve strong performance across multiple downstream classification and regression tasks, highlighting the potential of BrainSimSiam for data-limited neuroimaging applications.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces BrainSimSiam, a lightweight Siamese self-supervised learning framework that uses positive-only pairs from fMRI scans to learn robust, task-invariant functional representations. It claims these representations generalize across diverse downstream classification and regression tasks, outperforming fully supervised baselines while approaching the performance of large-scale foundation models, and are particularly suited to data-limited neuroimaging settings with small samples and noisy labels.
Significance. If the empirical results hold, the contribution would be significant for data-efficient fMRI representation learning: it offers a computationally lightweight alternative to large-scale pretraining that could reduce overfitting risks in psychiatric neuroimaging without requiring extensive labeled data or prohibitive resources.
major comments (1)
- Abstract: the central claim that the learned representations 'outperform fully supervised baselines and approach the performance of large-scale models' on classification and regression tasks is asserted without any quantitative results, error bars, dataset sizes, baseline implementations, or ablation evidence. This prevents evaluation of whether the positive-only pair construction actually yields task-invariant features that generalize beyond scanner or dataset artifacts.
Simulated Author's Rebuttal
We thank the referee for their thoughtful review and for highlighting the need for stronger substantiation in the abstract. We address the major comment below.
read point-by-point responses
-
Referee: [—] Abstract: the central claim that the learned representations 'outperform fully supervised baselines and approach the performance of large-scale models' on classification and regression tasks is asserted without any quantitative results, error bars, dataset sizes, baseline implementations, or ablation evidence. This prevents evaluation of whether the positive-only pair construction actually yields task-invariant features that generalize beyond scanner or dataset artifacts.
Authors: We agree that the abstract, as currently written, asserts performance claims without accompanying quantitative details. The full manuscript contains the requested elements: quantitative results with error bars across multiple datasets and tasks, descriptions of dataset sizes and preprocessing, implementation details for baselines (including supervised models and large-scale foundation models), and ablations on the positive-only pair construction. These results are presented in Sections 4 and 5 with statistical comparisons. To address the concern directly, we will revise the abstract to include key quantitative metrics (e.g., mean accuracy or correlation values with standard deviations) and a brief note on the evaluation setup. Regarding generalization beyond scanner or dataset artifacts, the experiments include cross-dataset and cross-scanner evaluations that support task-invariance; we will ensure the revised abstract references this evidence more explicitly. revision: yes
Circularity Check
No significant circularity detected
full rationale
The paper introduces BrainSimSiam, a Siamese self-supervised framework for fMRI representations using positive-only pairs. No equations, derivations, or parameter-fitting steps are described that reduce claimed performance or task-invariance to inputs by construction. The central claims rest on empirical generalization across downstream tasks rather than any self-definitional, fitted-input, or self-citation load-bearing logic. The provided abstract and method summary contain no quoted reductions matching the enumerated circularity patterns, making the result self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1]
F. Abuhantash, M. Abuhantash, A. AlShehhi, Comorbidity-based framework for alzheimer’s disease classification using graph neural networks, Scientific Reports (09 2024). doi:10.1038/s41598-024-72321-2
-
[2]
L. Chen, Y. Yang, A. Yu, S. Guo, K. Ren, Q. Liu, C. Qiao, An explainable spatio-temporal graph convolutional network for the biomark- ers identification of ADHD, Biomedical Sig- nal Processing and Control 99 (2025) 106913. doi:https://doi.org/10.1016/j.bspc.2024.106913
-
[3]
X. Li, Y. Zhou, N. Dvornek, M. Zhang, S. Gao, J. Zhuang, D. Scheinost, L. H. Staib, P. Ven- tola, J. S. Duncan, BrainGNN: Interpretable brain graph neural network for fmri analy- sis, Medical Image Analysis 74 (2021) 102233. doi:https://doi.org/10.1016/j.media.2021.102233
-
[4]
D. Ferreira, A. Nordberg, E. West- man, Biological subtypes of alzheimer dis- ease, Neurology 94 (10) (2020) 436–448. doi:10.1212/WNL.0000000000009058
-
[5]
H. M. Geurts, S. Verté, J. Oosterlaan, H. Roey- ers, J. A. Sergeant, ADHD subtypes: do they dif- fer in their executive functioning profile?, Archives of Clinical Neuropsychology 20 (4) (2005) 457–477. doi:10.1016/j.acn.2004.11.001
-
[6]
F. Craig, A. Crippa, M. Ruggiero, V. Rizzato, L. Russo, I. Fanizza, A. Trabacca, Charac- terization of autism spectrum disorder (asd) subtypes based on the relationship between mo- tor skills and social communication abilities, Human Movement Science 77 (2021) 102802. doi:https://doi.org/10.1016/j.humov.2021.102802
-
[7]
H. Song, M. Kim, D. Park, Y. Shin, J.-G. Lee, Learning from noisy labels with deep neural net- works: A survey, IEEE Transactions on Neural Net- works and Learning Systems 34 (11) (2023) 8135–
2023
-
[8]
doi:10.1109/TNNLS.2022.3152527
-
[9]
N. C. Dvornek, D. Yang, P. Ventola, J. S. Dun- can, Learning generalizable recurrent neural networks from small task-fmri datasets, in: International Con- ference on Medical Image Computing and Computer- Assisted Intervention, Springer, 2018, pp. 329–337
2018
-
[10]
J. Wang, N. C. Dvornek, P. Duan, L. H. Staib, P. Ven- tola, J. S. Duncan, STNAGNN: Data-driven spatio- temporal brain connectivity beyond FC, in: Medical Imaging with Deep Learning, 2025
2025
-
[11]
G. Shi, Y. Yao, Y. Zhu, X. Lin, L. Ji, W. Liu, X. Li, Contrastive hierarchical augmentation learning for modeling cognitive and multimodal brain network, IEEE Transactions on Computational Social Systems (2024) 1–11doi:10.1109/TCSS.2024.3402328
-
[12]
X. Wang, L. Yao, I. Rekik, Y. Zhang, Contrastive functional connectivity graph learning for population- based fmri classification, in: L. Wang, Q. Dou, P. T. Fletcher, S.Speidel, S.Li(Eds.), MedicalImageCom- puting and Computer Assisted Intervention – MIC- CAI 2022, Springer Nature Switzerland, Cham, 2022, pp. 221–230
2022
-
[13]
S. I. Ktena, S. Parisot, E. Ferrante, M. Rajchl, M. Lee, B. Glocker, D. Rueckert, Metric learning with spectral graph convolutions on brain connec- tivity networks, NeuroImage 169 (2018) 431–442. doi:https://doi.org/10.1016/j.neuroimage.2017.12.052
-
[14]
X. Wang, Y. Chu, Q. Wang, L. Cao, L. Qiao, L. Zhang, M. Liu, Unsupervised contrastive graph learning for resting-state functional mri analysis and brain disorder detection, Hu- man Brain Mapping 44 (17) (2023) 5672–5692. doi:https://doi.org/10.1002/hbm.26469
-
[15]
Y. Zhou, P. Duan, Y. Du, N. C. Dvornek, Self- supervised pre-training tasks for an fmri time-series transformer in autism detection, in: International Workshop on Machine Learning in Clinical Neu- roimaging, Springer, 2024, pp. 145–154
2024
-
[16]
R. Jiang, N. Zuo, J. M. Ford, S. Qi, D. Zhi, C. Zhuo, Y. Xu, Z. Fu, J. Bustillo, J. A. Turner, V. D. Calhoun, J. Sui, Task-induced brain connectivity promotes the detection of individual differences in brain-behavior relationships, NeuroImage 207 (2020) 116370. doi:https://doi.org/10.1016/j.neuroimage.2019.116370
-
[17]
W. Zhao, C. Makowski, D. J. Hagler, H. P. Garavan, W. K. Thompson, D. J. Greene, T. L. Jernigan, A. M. 10 Dale, Task fmri paradigms may capture more behav- iorally relevant information than resting-state func- tional connectivity, NeuroImage 270 (2023) 119946. doi:https://doi.org/10.1016/j.neuroimage.2023.119946
-
[18]
T. N. Kipf, M. Welling, Semi-supervised classifi- cation with graph convolutional networks, CoRR abs/1609.02907 (2016). arXiv:1609.02907
work page internal anchor Pith review Pith/arXiv arXiv 2016
-
[19]
Veličković, G
P. Veličković, G. Cucurull, A. Casanova, A. Romero, P. Liò, Y. Bengio, Graph attention networks, in: In- ternational Conference on Learning Representations, 2018
2018
-
[20]
W. L. Hamilton, R. Ying, J. Leskovec, Induc- tive representation learning on large graphs, CoRR abs/1706.02216 (2017). arXiv:1706.02216
work page internal anchor Pith review Pith/arXiv arXiv 2017
-
[21]
Gadgil, Q
S. Gadgil, Q. Zhao, A. Pfefferbaum, E. V. Sullivan, E. Adeli, K. M. Pohl, Spatio-temporal graph convo- lution for resting-state fmri analysis, Medical image computing and computer-assisted intervention (MIC- CAI) 12267 (2020) 528–538
2020
-
[22]
X. Chen, H. Fan, R. B. Girshick, K. He, Improved baselines with momentum contrastive learning, CoRR abs/2003.04297 (2020). arXiv:2003.04297
work page internal anchor Pith review Pith/arXiv arXiv 2003
-
[23]
T. Chen, S. Kornblith, M. Norouzi, G. E. Hinton, A simple framework for contrastive learning of vi- sual representations, CoRR abs/2002.05709 (2020). arXiv:2002.05709
work page internal anchor Pith review Pith/arXiv arXiv 2002
- [24]
-
[25]
https://doi.org/10.1016/j.neuroimage.2012.02.018
D. Van Essen, K. Ugurbil, E. Auerbach, D. Barch, T. Behrens, R. Bucholz, A. Chang, L. Chen, M. Cor- betta, S. Curtiss, S. Della Penna, D. Feinberg, M. Glasser, N. Harel, A. Heath, L. Larson-Prior, D. Marcus, G. Michalareas, S. Moeller, R. Oosten- veld, S. Petersen, F. Prior, B. Schlaggar, S. Smith, A. Snyder, J. Xu, E. Yacoub, The human con- nectome proje...
-
[26]
M. D. Kaiser, C. M. Hudac, S. Shultz, S. M. Lee, C. Cheung, A. M. Berken, B. Deen, N. B. Pitskel, D. R. Sugrue, A. C. Voos, C. A. Saulnier, P. Ventola, J. M. Wolf, A. Klin, B. C. V. Wyk, K. A. Pelphrey, Neural signatures of autism, Proceedings of the Na- tional Academy of Sciences 107 (49) (2010) 21223– 21228. doi:10.1073/pnas.1010412107
-
[27]
D. Yang, K. A. Pelphrey, D. G. Sukhodolsky, M. J. Crowley, E. Dayan, N. C. Dvornek, A. Venkatara- man, J. Duncan, L. Staib, P. Ventola, et al., Brain responses to biological motion predict treatment out- come in young children with autism, Translational Psychiatry 6 (11) (2016). doi:10.1038/tp.2016.213
-
[28]
X. Shen, F. Tokoglu, X. Papademetris, R. Con- stable, Groupwise whole-brain parcellation from resting-state fmri data for network node identification, NeuroImage 82 (2013) 403–415. doi:https://doi.org/10.1016/j.neuroimage.2013.05.081
-
[29]
R. S. Desikan, F. Ségonne, B. Fischl, B. T. Quinn, B. C. Dickerson, D. Blacker, R. L. Buck- ner, A. M. Dale, R. P. Maguire, B. T. Hyman, M. S. Albert, R. J. Killiany, An automated la- beling system for subdividing the human cerebral cortex on mri scans into gyral based regions of interest, NeuroImage 31 (3) (2006) 968–980. doi:https://doi.org/10.1016/j.ne...
-
[30]
W. Ding, X. Shen, J. Huang, H. Ju, Y. Chen, T. Yin, Brain age prediction based on resting-state func- tional mri using similarity metric convolutional neu- ral network, IEEE Access 11 (2023) 57071–57082. doi:10.1109/ACCESS.2023.3283148
-
[31]
X. Li, N. C. Dvornek, X. Papademetris, J. Zhuang, L. H. Staib, P. Ventola, J. S. Duncan, 2-channel convolutional 3d deep neural network (2cc3d) for fmri analysis: Asd classification and feature learn- ing, in: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), 2018, pp. 1252–
2018
-
[32]
doi:10.1109/ISBI.2018.8363798
-
[33]
T. Hahn, J. Ernsting, N. R. Winter, V. Holstein, R. Leenings, M. Beisemann, L. Fisch, K. Sarink, D. Emden, N. Opel, R. Redlich, J. Repple, D. Grote- gerd, S. Meinert, J. G. Hirsch, T. Niendorf, B. En- demann, F. Bamberg, T. Kröncke, R. Bülow, H. Völzke, O. von Stackelberg, R. F. Sowade, L. Umutlu, B. Schmidt, S. Caspers, H. Kugel, T. Kircher, B. Risse, C....
-
[34]
H. Li, T. D. Satterthwaite, Y. Fan, Brain age prediction based on resting-state functional con- nectivity patterns using convolutional neural net- works, Proceedings. IEEE International Sympo- sium on Biomedical Imaging 2018 (2018) 101–104. doi:10.1109/ISBI.2018.8363532
- [35]
- [36]
-
[37]
X. Shen, E. S. Finn, D. Scheinost, M. D. Rosen- berg, M. M. Chun, X. Papademetris, R. T. Con- stable, Using connectome-based predictive model- ing to predict individual behavior from brain con- nectivity, Nature Protocols 12 (3) (2017) 506–518. doi:10.1038/nprot.2016.178
- [38]
-
[39]
Ortega Caro, A
J. Ortega Caro, A. H. de Oliveira Fonseca, S. Rizvi, M. Rosati, C. Averill, J. Cross, P. Mittal, E. Zappala, R. Dhodapkar, C. Abdallah, D. van Dijk, BrainLM: A foundation model for brain activity recordings, in: B. Kim, Y. Yue, S. Chaudhuri, K. Fragkiadaki, M. Khan, Y. Sun (Eds.), International Conference on Representation Learning, Vol. 2024, 2024, pp. 565– 576
2024
- [40]
-
[41]
D. Tomasi, L. Chang, E. Caparelli, T. Ernst, Sex differences in sensory gating of the thalamus during auditory interference of visual attention tasks, Neuroscience 151 (4) (2008) 1006–1015. doi:https://doi.org/10.1016/j.neuroscience.2007.08.040
-
[42]
S. Kennepohl, V. Sziklas, K. Garver, D. Wag- ner, M. Jones-Gotman, Memory and the medial temporal lobe: Hemispheric specialization re- considered, NeuroImage 36 (3) (2007) 969–978. doi:https://doi.org/10.1016/j.neuroimage.2007.03.049
-
[43]
S. Xu, M. Li, C. Yang, X. Fang, M. Ye, L. Wei, J. Liu, B. Li, Y. Gan, B. Yang, W. Huang, P. Li, X. Meng, Y. Wu, G. Jiang, Altered functional connectivity in childrenwithlow-functionautismspectrumdisorders, Frontiers in Neuroscience Volume 13 - 2019 (2019). doi:10.3389/fnins.2019.00806
-
[44]
H. Jeon, A. Hur, H. Lee, Y.-W. Shin, S.-I. Lee, C.- J. Shin, S. Kim, G. Ju, J. Lee, J. Jung, S. Chung, J.-W. Son, The relationship between brain acti- vation for taking others’ perspective and intero- ceptive abilities in autism spectrum disorder: An fmri study, Journal of the Korean Academy of Child and Adolescent Psychiatry 35 (2024) 197–209. doi:10.576...
-
[45]
Y. Xiao, A. Friederici, D. Margulies, J. Brauer, Longitudinal changes in resting-state fmri from age 5 to age 6 years covary with lan- guage development, NeuroImage 128 (12 2015). doi:10.1016/j.neuroimage.2015.12.008
-
[46]
A. Vaswani, N. Shazeer, N. Parmar, J. Uszkor- eit, L. Jones, A. N. Gomez, L. Kaiser, I. Polo- sukhin, Attention is all you need (2017). doi:10.48550/ARXIV.1706.03762
work page internal anchor Pith review Pith/arXiv arXiv doi:10.48550/arxiv.1706.03762 2017
-
[47]
O. Siméoni, H. V. Vo, M. Seitzer, F. Baldassarre, M. Oquab, C. Jose, V. Khalidov, M. Szafraniec, S. Yi, M. Ramamonjisoa, F. Massa, D. Haziza, L. Wehrst- edt, J. Wang, T. Darcet, T. Moutakanni, L. Sentana, C. Roberts, A. Vedaldi, J. Tolan, J. Brandt, C. Cou- prie, J. Mairal, H. Jégou, P. Labatut, P. Bojanowski, Dinov3 (2025). arXiv:2508.10104
work page internal anchor Pith review Pith/arXiv arXiv 2025
-
[48]
A. A. Chen, D. Srinivasan, R. Pomponio, Y. Fan, I. M. Nasrallah, S. M. Resnick, L. L. Beason- Held, C. Davatzikos, T. D. Satterthwaite, D. S. Bassett, R. T. Shinohara, H. Shou, Harmonizing functional connectivity reduces scanner effects in communitydetection, NeuroImage256(2022)119198. doi:https://doi.org/10.1016/j.neuroimage.2022.119198. 12
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.