pith. machine review for the scientific record.

arxiv: 2604.14259 · v1 · submitted 2026-04-15 · 🧬 q-bio.TO · cs.LG · eess.IV

Recognition: unknown

Continual Learning for fMRI-Based Brain Disorder Diagnosis via Functional Connectivity Matrices Generative Replay

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 12:21 UTC · model grok-4.3

classification 🧬 q-bio.TO · cs.LG · eess.IV
keywords continual learning · fMRI · functional connectivity · variational autoencoder · brain disorder diagnosis · catastrophic forgetting · generative replay · multi-site learning

The pith

A structure-aware variational autoencoder enables continual learning for fMRI brain disorder diagnosis from sequentially arriving clinical sites.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that diagnostic models for brain disorders can be trained on fMRI data arriving one clinical site at a time without losing accuracy on earlier sites. It does this by generating synthetic functional connectivity matrices to replay prior knowledge and aligning new and old representations through distillation. A sympathetic reader would care because real medical data rarely arrives all at once from every institution, so methods that require full multi-site access or suffer forgetting cannot be used in practice. The approach is tested on datasets for major depressive disorder, schizophrenia, and autism spectrum disorder, where it outperforms standard continual learning baselines in preserving performance.

Core claim

The authors present the first continual learning framework tailored to fMRI-based diagnosis. A structure-aware variational autoencoder synthesizes realistic functional connectivity matrices for both patient and control groups from previous sites. These replayed samples are used with a multi-level knowledge distillation strategy that aligns both predictions and graph representations, plus a hierarchical contextual bandit that selects which samples to replay. Experiments across multi-site MDD, SZ, and ASD datasets show the generative model improves augmentation quality and the full framework substantially reduces catastrophic forgetting compared with existing methods.
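The replay mechanics behind this claim can be sketched in a few lines. The decoder below is a hypothetical stand-in (a fixed random projection producing symmetric matrices), not the paper's FCM-VAE; it only illustrates how synthetic FC matrices would be mixed into each new site's training batch.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ROI, LATENT = 16, 8                       # toy sizes; real parcellations use 90-400 ROIs
W = rng.standard_normal((LATENT, N_ROI))    # stand-in "decoder" weights, illustrative only

def decode(z):
    """Map a latent code to a symmetric FC matrix with unit diagonal.
    A fixed random projection, NOT the paper's trained FCM-VAE decoder."""
    v = z @ W
    fc = np.tanh(np.outer(v, v))            # symmetric by construction
    np.fill_diagonal(fc, 1.0)
    return fc

def replay_batch(site_data, n_replay):
    """Mix the current site's real FC matrices with synthetic replayed ones."""
    synth = np.stack([decode(rng.standard_normal(LATENT)) for _ in range(n_replay)])
    return np.concatenate([site_data, synth], axis=0)

site2 = np.stack([decode(rng.standard_normal(LATENT)) for _ in range(4)])  # pretend real data
batch = replay_batch(site2, n_replay=4)
print(batch.shape)  # (8, 16, 16): real and replayed samples train together
```

The point of the sketch is the data flow, not the generator: whatever model produces the synthetic matrices, each new site's batch is augmented with decoded samples standing in for earlier sites.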

What carries the argument

Structure-aware variational autoencoder that generates synthetic functional connectivity matrices, supported by multi-level knowledge distillation for prediction and graph alignment and a hierarchical contextual bandit for adaptive replay sampling.
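A minimal sketch of the two distillation levels, assuming a KL term on temperature-softened predictions and an MSE term on graph-level embeddings; the temperature and the 50/50 weighting are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def softmax(x, t=1.0):
    e = np.exp((x - x.max(axis=-1, keepdims=True)) / t)
    return e / e.sum(axis=-1, keepdims=True)

def distill_loss(student_logits, teacher_logits, student_emb, teacher_emb,
                 temperature=2.0, alpha=0.5):
    """Two-level distillation: KL divergence between softened predictions
    plus MSE between graph-level embeddings (illustrative weighting)."""
    p_t = softmax(teacher_logits, temperature)
    p_s = softmax(student_logits, temperature)
    kl = np.sum(p_t * (np.log(p_t + 1e-12) - np.log(p_s + 1e-12)), axis=-1).mean()
    mse = np.mean((student_emb - teacher_emb) ** 2)
    return alpha * kl + (1 - alpha) * mse

logits = np.array([[2.0, -1.0], [0.5, 0.3]])
emb = np.ones((2, 4))
print(distill_loss(logits, logits, emb, emb))  # 0.0: no penalty when student matches teacher
```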

If this is right

  • The framework maintains higher diagnostic accuracy on prior sites than existing continual learning approaches when new sites are added sequentially.
  • Synthetic matrices produced by the structure-aware autoencoder improve data augmentation for both patient and control groups.
  • The hierarchical bandit reduces the number of replay samples needed while preserving performance gains.
  • The method supports continual training across heterogeneous sites for multiple disorders without requiring simultaneous access to all data.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the generated matrices preserve privacy-sensitive properties, the approach could allow institutions to share only model updates rather than raw scans.
  • The same generative replay pattern might extend to other sequential neuroimaging tasks, such as longitudinal tracking within a single disorder.
  • Performance on rare subtypes of each disorder would need separate testing, since the current experiments aggregate across typical cases.

Load-bearing premise

The synthetic functional connectivity matrices generated by the autoencoder must match the statistical and graph properties of real data from earlier sites closely enough that replay does not create harmful distribution shift.

What would settle it

Measure whether classification accuracy on data from the first site drops sharply after the model has been trained on data from a second site using the replay method, or whether graph metrics such as clustering coefficients and edge-weight distributions of the generated matrices deviate systematically from those of held-out real matrices.
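The graph-metric half of this check can be run directly: compare a metric such as the mean clustering coefficient between real and generated matrices. The binarization threshold and the toy correlation matrices below are assumptions for illustration.

```python
import numpy as np

def clustering_coefficient(fc, threshold=0.3):
    """Mean clustering coefficient of the FC graph binarized at
    |weight| > threshold (diagonal ignored)."""
    a = (np.abs(fc) > threshold).astype(float)
    np.fill_diagonal(a, 0.0)
    deg = a.sum(axis=1)
    tri = np.diag(a @ a @ a) / 2.0            # triangles through each node
    poss = deg * (deg - 1) / 2.0              # possible triangles
    with np.errstate(divide="ignore", invalid="ignore"):
        c = np.where(poss > 0, tri / poss, 0.0)
    return float(c.mean())

rng = np.random.default_rng(1)
def random_fc(n=12, t=40):
    return np.corrcoef(rng.standard_normal((n, t)))  # toy "real"/"synthetic" FC matrices

real_cc = np.mean([clustering_coefficient(random_fc()) for _ in range(5)])
synth_cc = np.mean([clustering_coefficient(random_fc()) for _ in range(5)])
print(abs(real_cc - synth_cc))  # a systematic gap here would flag distribution shift
```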

Figures

Figures reproduced from arXiv: 2604.14259 by Qianyu Chen, Shujian Yu.

Figure 1: Pipeline for transforming raw rs-fMRI data into brain networks for the classification task.
Figure 2: The continual learning workflow of FORGE across consecutive sites. At site …
Figure 3: Framework of FCM-VAE. Given an input FC graph G, the encoder extracts a hidden representation h, which parameterizes a diagonal Gaussian distribution with mean µ and variance σ, from which the latent variable z is sampled. The decoder, implemented as a gated GNN, reconstructs the adjacency matrix A and predicts the phenotype y through a linear Gaussian supervision head applied directly to z. In the encoder …
Figure 4: Visualization of the averaged AAA and FOR across sequential tasks under a fixed task …
Figure 1: Continual learning pipeline for transforming raw rs-fMRI data into inputs for the classifi…
Figure 2: Dual-level knowledge distillation framework for continual graph learning.
Figure 3: Visualization of functional connectivity (FC) matrices generated by different FC-graph …
Figure 4: Sensitivity analysis of λ1, λ2, and λ3 on the ASD dataset. For each hyperparameter, we vary its value while keeping the other two coefficients fixed at their default settings. As illustrated in …
read the original abstract

Functional magnetic resonance imaging (fMRI) is widely used for studying and diagnosing brain disorders, with functional connectivity (FC) matrices providing powerful representations of large-scale neural interactions. However, existing diagnostic models are trained either on a single site or under full multi-site access, making them unsuitable for real-world scenarios where clinical data arrive sequentially from different institutions. This results in limited generalization and severe catastrophic forgetting. This paper presents the first continual learning framework specifically designed for fMRI-based diagnosis across heterogeneous clinical sites. Our framework introduces a structure-aware variational autoencoder that synthesizes realistic FC matrices for both patient and control groups. Built on this generative backbone, we develop a multi-level knowledge distillation strategy that aligns predictions and graph representations between new-site data and replayed samples. To further enhance efficiency, we incorporate a hierarchical contextual bandit scheme for adaptive replay sampling. Experiments on multi-site datasets for major depressive disorder (MDD), schizophrenia (SZ), and autism spectrum disorder (ASD) show that the proposed generative model enhances data augmentation quality, and the overall continual learning framework substantially outperforms existing methods in mitigating catastrophic forgetting. Our code is available at https://github.com/4me808/FORGE.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims to present the first continual learning framework for fMRI-based brain disorder diagnosis across heterogeneous clinical sites. It introduces a structure-aware variational autoencoder to synthesize realistic FC matrices for replay, a multi-level knowledge distillation strategy to align predictions and graph representations, and a hierarchical contextual bandit scheme for adaptive replay sampling. Experiments on multi-site datasets for MDD, SZ, and ASD show that the generative model enhances data augmentation quality and the overall framework substantially outperforms existing methods in mitigating catastrophic forgetting.

Significance. If the results hold, the framework could enable practical deployment of diagnostic models under sequential data arrival from different institutions, addressing a key limitation in clinical ML applications. The public code release at the GitHub link is a clear strength that supports reproducibility and extension by the community.

major comments (2)
  1. [Experiments] Experiments section: The abstract and results claim consistent outperformance and enhanced augmentation quality on three disorder datasets, but provide no details on full baseline implementations, statistical significance tests, ablation studies, or data split procedures. This leaves the support for the central claim of substantial forgetting mitigation at a moderate level.
  2. [Generative Model and Replay] Generative model and replay sections: The structure-aware VAE is presented as producing synthetic FC matrices whose statistical and graph properties (correlation structure, modularity, hub connectivity) match real prior-site data, yet no quantitative fidelity metrics (MMD, Wasserstein distances on eigenvalue spectra or graph features) are reported comparing synthetic vs. real samples. This assumption is load-bearing for the multi-level distillation and bandit replay to avoid harmful shift or artifact alignment.
minor comments (2)
  1. [Method] The hierarchical contextual bandit scheme description would benefit from pseudocode or an explicit diagram to clarify context parameters, exploration, and sampling efficiency.
  2. [Method] Notation for the multi-level distillation losses could be made more explicit with equation numbers to facilitate understanding of how predictions and graph representations are aligned.
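The fidelity metric asked for in major comment 2 is straightforward to compute. Below is a sketch of an RBF-kernel MMD on sorted eigenvalue spectra; the kernel bandwidth, sample sizes, and toy correlation matrices are illustrative assumptions, not the paper's data.

```python
import numpy as np

def rbf_mmd2(x, y, sigma=1.0):
    """Biased squared MMD with an RBF kernel between two sets of feature
    vectors (here: sorted eigenvalue spectra of FC matrices)."""
    def k(a, b):
        d = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-d / (2 * sigma**2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

def spectrum(fc):
    return np.sort(np.linalg.eigvalsh(fc))

rng = np.random.default_rng(2)
def sample_fc(n=10, t=50):
    return np.corrcoef(rng.standard_normal((n, t)))   # toy FC matrices

real = np.stack([spectrum(sample_fc()) for _ in range(8)])
synth = np.stack([spectrum(sample_fc()) for _ in range(8)])   # well-matched generator
shifted = synth + 2.0                                         # deliberately off-distribution
print(rbf_mmd2(real, synth) < rbf_mmd2(real, shifted))        # True
```

A low MMD between real and synthetic spectra is the quantitative evidence the report requests; the shifted set shows the statistic responding to a deliberate mismatch.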

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments. We have addressed each major point below with clarifications and revisions to strengthen the manuscript's experimental rigor and validation of the generative component.

read point-by-point responses
  1. Referee: [Experiments] Experiments section: The abstract and results claim consistent outperformance and enhanced augmentation quality on three disorder datasets, but provide no details on full baseline implementations, statistical significance tests, ablation studies, or data split procedures. This leaves the support for the central claim of substantial forgetting mitigation at a moderate level.

    Authors: We agree that additional experimental details are needed to fully support the claims. In the revised manuscript, we have expanded the Experiments section with: (i) complete specifications of all baseline implementations, including hyperparameters, architectures, and training protocols for methods such as EWC, SI, GEM, and standard replay; (ii) statistical significance testing via paired Wilcoxon signed-rank tests with reported p-values across all comparisons; (iii) comprehensive ablation studies isolating the contributions of the structure-aware VAE, multi-level distillation, and hierarchical bandit sampling; and (iv) explicit data split procedures, including sequential site-wise arrival order, per-site 70/30 train/test splits, and cross-site validation to ensure no data leakage. These additions provide stronger quantitative backing for the forgetting mitigation results. revision: yes

  2. Referee: [Generative Model and Replay] Generative model and replay sections: The structure-aware VAE is presented as producing synthetic FC matrices whose statistical and graph properties (correlation structure, modularity, hub connectivity) match real prior-site data, yet no quantitative fidelity metrics (MMD, Wasserstein distances on eigenvalue spectra or graph features) are reported comparing synthetic vs. real samples. This assumption is load-bearing for the multi-level distillation and bandit replay to avoid harmful shift or artifact alignment.

    Authors: We acknowledge the value of quantitative fidelity metrics to rigorously validate the generative replay. While the original manuscript demonstrated fidelity through qualitative visualizations of correlation structures and improved downstream diagnostic performance, we have added explicit quantitative comparisons in the revision. These include Maximum Mean Discrepancy (MMD) scores, Wasserstein distances computed on the eigenvalue spectra of the FC matrices, and direct comparisons of graph features (modularity indices, hub node degrees, and clustering coefficients) between real and synthetic samples from each prior site. The results confirm close distributional alignment, supporting that the replayed data do not introduce harmful shifts that would undermine the distillation or bandit components. revision: yes

Circularity Check

0 steps flagged

No significant circularity; claims rest on external empirical evaluation

full rationale

The paper's derivation chain consists of training a structure-aware VAE on site-specific FC data, using generated samples for replay, applying multi-level distillation, and evaluating diagnostic performance on held-out real multi-site datasets for MDD, SZ, and ASD. No load-bearing step reduces a claimed prediction or result to its own fitted inputs by construction, nor relies on self-citation chains or imported uniqueness theorems. The reported gains in mitigating catastrophic forgetting are measured against external baselines on actual patient data, making the framework self-contained against independent benchmarks.

Axiom & Free-Parameter Ledger

3 free parameters · 2 axioms · 3 invented entities

The central claim rests on standard VAE reconstruction assumptions plus domain-specific assumptions about FC matrix structure and the effectiveness of replay for forgetting mitigation; several new algorithmic components are introduced without independent theoretical guarantees.

free parameters (3)
  • VAE architecture and loss weights
    Latent dimension, reconstruction loss balance, and structure-preserving regularizers are chosen or tuned to generate realistic FC matrices.
  • Distillation hyperparameters
    Temperature, feature alignment weights, and prediction alignment coefficients for multi-level knowledge distillation.
  • Bandit exploration and context parameters
    Parameters controlling the hierarchical contextual bandit for adaptive replay sample selection.
axioms (2)
  • domain assumption Functional connectivity matrices from fMRI capture diagnostically relevant large-scale neural interactions
    Invoked throughout the introduction and method as the basis for using FC matrices as input representation.
  • ad hoc to paper Synthetic FC matrices generated by the structure-aware VAE can serve as faithful replay buffers without introducing systematic bias
    Core premise of the generative replay strategy; required for the continual learning claim to hold.
invented entities (3)
  • Structure-aware variational autoencoder no independent evidence
    purpose: Synthesize realistic patient and control FC matrices that preserve graph topology
    New generative backbone introduced to support replay in the continual learning setting.
  • Multi-level knowledge distillation strategy no independent evidence
    purpose: Align both final predictions and intermediate graph representations between new data and replayed samples
    Novel distillation mechanism proposed to improve knowledge transfer in this domain.
  • Hierarchical contextual bandit scheme no independent evidence
    purpose: Adaptively select which generated samples to replay for training efficiency
    New sampling policy introduced to optimize replay buffer usage.
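The last of these invented entities can be illustrated with a deliberately simplified version: a flat UCB1 selector over replay pools (one arm per prior site or class). The hierarchy, the context features, and the per-pool rewards below are all simplifying assumptions, not the paper's design.

```python
import numpy as np

rng = np.random.default_rng(3)

class ReplayBandit:
    """Flat UCB1 selector over replay pools; a stand-in for the paper's
    hierarchical contextual bandit."""
    def __init__(self, n_arms):
        self.counts = np.zeros(n_arms)
        self.values = np.zeros(n_arms)
        self.t = 0

    def select(self):
        self.t += 1
        if (self.counts == 0).any():
            return int(np.argmin(self.counts))  # pull each arm once first
        ucb = self.values + np.sqrt(2 * np.log(self.t) / self.counts)
        return int(np.argmax(ucb))

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Hypothetical per-pool rewards, e.g. measured reduction in forgetting
true_reward = [0.2, 0.8, 0.4]
bandit = ReplayBandit(3)
for _ in range(500):
    arm = bandit.select()
    bandit.update(arm, true_reward[arm] + 0.05 * rng.standard_normal())
print(int(np.argmax(bandit.counts)))  # 1: the most useful pool is replayed most often
```

The design question flagged in the ledger is exactly this: the exploration bonus and the reward definition are free choices with no independent theoretical guarantee in the continual-diagnosis setting.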

pith-pipeline@v0.9.0 · 5514 in / 1734 out tokens · 38731 ms · 2026-05-10T12:21:45.797957+00:00 · methodology

discussion (0)

