MSCGC-KAN: Multi-scale Causal Graph Convolution and Kolmogorov-Arnold Feature Mapping for EEG Emotion Recognition

Haoliang Gong; Jiale Xu; Qingshan She; Xugang Xi; Yunyan Gao

arxiv: 2605.26624 · v2 · pith:VHZURSSInew · submitted 2026-05-26 · 💻 cs.CV


Pith reviewed 2026-06-29 18:19 UTC · model grok-4.3

classification 💻 cs.CV

keywords EEG emotion recognition · multi-scale causal graph convolution · Kolmogorov-Arnold networks · fine-tuning · pre-trained EEG models · task-specific head · affective computing · brain-computer interfaces

The pith

A task head with multi-scale causal graph convolution and Kolmogorov-Arnold mapping improves fine-tuning of pre-trained EEG models for emotion recognition.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that three limitations in adapting pre-trained EEG foundation models to emotion recognition—insufficient multi-scale temporal modeling, weak inter-channel connectivity exploitation, and linear classification heads—can be addressed by a compact structured task head. This head combines multi-scale causal graph convolution to capture dynamic patterns across time scales and channels with Kolmogorov-Arnold networks to enable nonlinear feature transformations. Experiments on the FACED and SEED-VII datasets demonstrate concrete gains in balanced accuracy, Cohen's Kappa, and weighted F1-score over a simple linear baseline attached to the same backbone. A sympathetic reader would care because the approach keeps the benefits of large pre-trained representations while making the final stage more attuned to emotion-specific signals, offering a practical route to higher performance without retraining the entire model.

Core claim

Built on a pre-trained CBraMod backbone, MSCGC-KAN introduces a structured task head composed of multi-scale causal graph convolution and Kolmogorov-Arnold feature mapping. This design jointly strengthens multi-scale temporal modeling, learnable inter-channel connectivity modeling, and nonlinear discriminative mapping within a compact task-specific head. The method preserves the representation advantage of the foundation model while making the classifier more sensitive to emotion-related spatiotemporal patterns, resulting in balanced accuracy of 60.66% on FACED and 33.27% on SEED-VII, with gains of 5.91 and 2.03 percentage points over the linear baseline.
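To make the first component concrete, here is a minimal pure-Python sketch of one way a multi-scale causal graph convolution could be wired. It is not the authors' implementation: the abstract does not specify kernel sizes, dilations, or how scales are fused, so the function names, the dilation set (1, 2, 4), the averaging across scales, and the fixed adjacency matrix (learnable in the paper) are all assumptions made for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]  # shape: channels x time


def causal_dilated_conv(signal: Sequence[float], kernel: Sequence[float], dilation: int) -> List[float]:
    """y[t] = sum_k kernel[k] * signal[t - k*dilation], treating t < 0 as zero."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # only current and past samples are ever read
                acc += w * signal[idx]
        out.append(acc)
    return out


def graph_mix(x: Matrix, adjacency: Matrix) -> Matrix:
    """Mix channels: out[c][t] = sum_j adjacency[c][j] * x[j][t]."""
    n_ch, n_t = len(x), len(x[0])
    return [
        [sum(adjacency[c][j] * x[j][t] for j in range(n_ch)) for t in range(n_t)]
        for c in range(n_ch)
    ]


def multi_scale_causal_graph_conv(x: Matrix, adjacency: Matrix, kernel: Sequence[float],
                                  dilations: Sequence[int] = (1, 2, 4)) -> Matrix:
    """Average of graph-mixed causal convolutions over several dilations (assumed fusion rule)."""
    n_ch, n_t = len(x), len(x[0])
    total = [[0.0] * n_t for _ in range(n_ch)]
    for d in dilations:
        filtered = [causal_dilated_conv(ch, kernel, d) for ch in x]  # temporal step
        mixed = graph_mix(filtered, adjacency)                       # inter-channel step
        for c in range(n_ch):
            for t in range(n_t):
                total[c][t] += mixed[c][t] / len(dilations)
    return total
```

The causal property comes from the `idx >= 0` guard combined with taps that only reach backwards in time, so the output at time t never depends on later samples. Larger dilations widen the temporal context without adding kernel weights.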

What carries the argument

MSCGC-KAN task head, which uses multi-scale causal graph convolution to model temporal dynamics and inter-channel relations, followed by Kolmogorov-Arnold networks to perform nonlinear feature mapping.
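For readers unfamiliar with Kolmogorov-Arnold networks, the sketch below shows the core idea of a KAN layer: each input-output edge carries its own learnable one-dimensional function, and each output simply sums its incoming edge functions. The original KAN formulation parameterizes those functions with B-splines; this sketch swaps in Gaussian radial basis functions to stay short, and the names `rbf_basis`, `kan_layer`, `centers`, and `width` are hypothetical. How the paper configures its KAN layer is not stated in the abstract.

```python
import math
from typing import List, Sequence


def rbf_basis(x: float, centers: Sequence[float], width: float) -> List[float]:
    """Gaussian bumps evaluated at x, one per center (a stand-in for B-splines)."""
    return [math.exp(-((x - c) / width) ** 2) for c in centers]


def kan_layer(x: Sequence[float], coeffs, centers: Sequence[float], width: float = 1.0) -> List[float]:
    """y[j] = sum_i phi_ji(x[i]), where phi_ji(u) = sum_m coeffs[j][i][m] * basis_m(u).

    coeffs has shape [n_outputs][n_inputs][n_basis] and plays the role of the
    learnable parameters; no training loop is shown here.
    """
    outputs = []
    for per_input in coeffs:                 # one entry per output unit j
        y = 0.0
        for xi, c_im in zip(x, per_input):   # one learnable 1-D function per edge
            basis = rbf_basis(xi, centers, width)
            y += sum(c * b for c, b in zip(c_im, basis))
        outputs.append(y)
    return outputs
```

In contrast to a linear head, where each edge contributes `weight * x[i]`, every edge here contributes a nonlinear function of `x[i]`, which is the sense in which the mapping is nonlinear.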

If this is right

The method reaches 60.66% balanced accuracy, 0.5525 Cohen's Kappa, and 60.40% weighted F1 on FACED.
It reaches 33.27% balanced accuracy, 0.2223 Cohen's Kappa, and 33.64% weighted F1 on SEED-VII.
Balanced accuracy improves by 5.91 percentage points on FACED and 2.03 percentage points on SEED-VII over the CBraMod+Linear baseline.
Structured task-head design provides an effective route to better emotion recognition performance during fine-tuning of pre-trained EEG models.
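All three reported metrics can be derived from a confusion matrix. The sketch below uses the standard definitions (mean per-class recall, chance-corrected agreement, and support-weighted F1). The function names are illustrative, and the paper's own evaluation code is not shown in the abstract, so treat this as a reading aid rather than a reproduction.

```python
from typing import List

Confusion = List[List[int]]  # cm[i][j] = samples of true class i predicted as class j


def _row_sums(cm: Confusion) -> List[int]:
    return [sum(row) for row in cm]


def _col_sums(cm: Confusion) -> List[int]:
    return [sum(row[j] for row in cm) for j in range(len(cm))]


def balanced_accuracy(cm: Confusion) -> float:
    """Mean per-class recall; assumes every class has at least one true sample."""
    return sum(cm[i][i] / s for i, s in enumerate(_row_sums(cm))) / len(cm)


def cohen_kappa(cm: Confusion) -> float:
    """(p_obs - p_exp) / (1 - p_exp), with p_exp taken from the row and column marginals."""
    rows, cols = _row_sums(cm), _col_sums(cm)
    n = sum(rows)
    p_obs = sum(cm[i][i] for i in range(len(cm))) / n
    p_exp = sum(r * c for r, c in zip(rows, cols)) / (n * n)
    return (p_obs - p_exp) / (1 - p_exp)


def weighted_f1(cm: Confusion) -> float:
    """Per-class F1 averaged with weights proportional to class support."""
    rows, cols = _row_sums(cm), _col_sums(cm)
    n = sum(rows)
    total = 0.0
    for i in range(len(cm)):
        denom = rows[i] + cols[i]  # equals 2*TP + FN + FP
        f1 = 2 * cm[i][i] / denom if denom else 0.0
        total += rows[i] / n * f1
    return total
```

For the toy two-class matrix `[[8, 2], [4, 6]]`, these functions give 0.70 balanced accuracy, 0.40 kappa, and about 0.697 weighted F1.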

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same head architecture could be attached to other pre-trained EEG backbones to test whether the gains depend on the specific CBraMod representations.
The multi-scale causal graph and KAN combination might transfer to other EEG classification tasks such as motor imagery or sleep staging.
Ablation studies that isolate the contribution of each scale in the graph convolution could clarify which temporal resolutions drive the observed improvements.
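One way to organize the per-scale ablation suggested above is to enumerate a full-model configuration plus "only this scale" and "all but this scale" variants. The helper below is a hypothetical sketch: the name `scale_ablation_configs` and the example dilation values are assumptions, since the abstract does not list the scales used.

```python
from typing import Dict, Sequence, Tuple


def scale_ablation_configs(scales: Sequence[int]) -> Dict[str, Tuple[int, ...]]:
    """Map a run name to the tuple of temporal scales kept in that run."""
    configs: Dict[str, Tuple[int, ...]] = {"all": tuple(scales)}
    for s in scales:
        configs[f"only_{s}"] = (s,)                                      # single-scale run
        configs[f"without_{s}"] = tuple(x for x in scales if x != s)     # leave-one-out run
    return configs


if __name__ == "__main__":
    # Placeholder scales for illustration only.
    for name, kept in scale_ablation_configs((1, 2, 4)).items():
        print(name, kept)
```

Comparing each `without_*` run against `all` indicates how much a scale contributes in context, while the `only_*` runs indicate how much it carries alone.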

Load-bearing premise

The measured accuracy gains arise specifically because the multi-scale causal graph convolution and Kolmogorov-Arnold components resolve the three listed limitations in fine-tuning rather than from other experimental choices.

What would settle it

Re-running the fine-tuning experiments on the same datasets and backbone but replacing the proposed head with an alternative nonlinear head that lacks the graph convolution component and observing no comparable gains in balanced accuracy would falsify the central claim.

Figures

Figures reproduced from arXiv: 2605.26624 by Haoliang Gong, Jiale Xu, Qingshan She, Xugang Xi, Yunyan Gao.

**Figure 2.** Confusion matrix on FACED. Major confusions are concentrated among emotionally similar ...

**Figure 3.** Confusion matrix on SEED-VII. The diagonal-dominant structure indicates that the model learns ...

**Figure 4.** t-SNE visualization of stage-wise feature spaces on FACED. The three-stage comparison shows ...

**Figure 5.** t-SNE visualization of stage-wise feature spaces on SEED-VII. Compared with the backbone ...

**Figure 6.** Visualization of ablation results on FACED and SEED-VII. The left panel corresponds to FACED ...

**Figure 7.** Visualization of the learnable adjacency matrix on FACED, including the learned connectivity ...

**Figure 8.** Scalp topography visualization on FACED. Different emotions exhibit distinguishable spatial ...

**Figure 9.** Grad-CAM temporal heat map on FACED. High-response regions indicate the temporal segments ...

**Figure 10.** Basis-response distributions and projection-weight importance of the KAN layer on FACED. The ...

Original abstract

Electroencephalogram (EEG)-based emotion recognition is an important affective computing task, and recent EEG foundation models provide useful generic representations for downstream adaptation. However, under the fine-tuning setting, three limitations remain prominent: insufficient modeling of multi-scale emotional dynamics, inadequate exploitation of inter-channel functional connectivity, and the limited expressive power of simple linear classification heads. To address these issues, this paper proposes a new EEG emotion recognition method, termed MSCGC-KAN, which introduces a structured task head composed of multi-scale causal graph convolution and Kolmogorov-Arnold feature mapping. Built on a pre-trained CBraMod backbone, MSCGC-KAN enhances downstream adaptation by jointly strengthening multi-scale temporal modeling, learnable inter-channel connectivity modeling, and nonlinear discriminative mapping within a compact task-specific head. This design preserves the representation advantage of the foundation model while making the classifier more sensitive to emotion-related spatiotemporal patterns. Extensive experiments are conducted on the public FACED and SEED-VII datasets. The proposed method achieves a balanced accuracy of 60.66%, a Cohen's Kappa of 0.5525, and a weighted F1-score of 60.40% on FACED, and obtains 33.27%, 0.2223, and 33.64%, respectively, on SEED-VII. Compared with the CBraMod+Linear baseline, the balanced accuracy is improved by 5.91 and 2.03 percentage points on the two datasets, respectively. These results indicate that structured task-head design is an effective way to improve EEG emotion recognition when fine-tuning pre-trained EEG models.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

MSCGC-KAN adds a graph-conv plus KAN task head on CBraMod and reports 2-6 point accuracy lifts, but the abstract supplies no ablations, stats, or protocols so the attribution cannot be checked.

The main thing to know is that this paper puts forward MSCGC-KAN, a compact task head that stacks multi-scale causal graph convolution for temporal and channel modeling with Kolmogorov-Arnold mapping for the final classifier, all on top of the pre-trained CBraMod backbone. It reports balanced accuracy of 60.66% on FACED and 33.27% on SEED-VII, which is 5.91 and 2.03 percentage points above the linear-head baseline.

The work does identify three practical bottlenecks in fine-tuning EEG foundation models and designs a component for each. Using graph convolution to learn inter-channel relations and KAN to replace a linear layer is a reasonable way to reuse the expensive pre-trained backbone while improving the adaptation stage. That direction is worth noting for anyone doing similar downstream work.

The soft spots are straightforward. The abstract states the final numbers but gives no training details, no ablation results that remove one module at a time, no error bars, and no significance tests. Without those, there is no way to know whether the reported deltas come from the proposed modules or from other differences in the fine-tuning setup. The stress-test concern about unisolated gains is accurate on the evidence supplied.

This paper is aimed at the small group of researchers working on EEG emotion recognition with foundation models. Someone already running similar experiments might pick up the head design as an idea to test, but the current text does not give enough to evaluate the claims.

I would not send it for peer review in this form. If the full manuscript contains the missing ablations and controls, then it could be worth a referee's time as an incremental methods note; otherwise it stays at the level of an unverified claim.

Referee Report

1 major / 0 minor

Summary. The manuscript proposes MSCGC-KAN, a structured task head for fine-tuning the pre-trained CBraMod EEG foundation model on emotion recognition. The head combines multi-scale causal graph convolution (for temporal dynamics and learnable inter-channel connectivity) with Kolmogorov-Arnold feature mapping (for nonlinear classification). On the FACED and SEED-VII datasets the method reports balanced accuracies of 60.66% and 33.27%, respectively, corresponding to gains of 5.91 and 2.03 percentage points over the CBraMod+Linear baseline.
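The abstract gives the proposed method's scores and the gains, but not the baseline scores themselves. They follow by subtraction, as the short calculation below shows. The relative-gain figure is an extra derived quantity added here for context, and the dictionary and function names are illustrative.

```python
# Reported balanced accuracy (%) and gain over CBraMod+Linear (percentage points).
REPORTED = {"FACED": (60.66, 5.91), "SEED-VII": (33.27, 2.03)}


def implied_baseline(score: float, gain: float) -> float:
    """Baseline balanced accuracy implied by the reported score and gain."""
    return round(score - gain, 2)


def relative_gain_percent(score: float, gain: float) -> float:
    """Gain expressed as a percentage of the implied baseline."""
    return round(100 * gain / (score - gain), 2)


if __name__ == "__main__":
    for dataset, (score, gain) in REPORTED.items():
        print(dataset, implied_baseline(score, gain), relative_gain_percent(score, gain))
```

This yields implied baselines of 54.75% on FACED and 31.24% on SEED-VII, that is, relative improvements of roughly 10.8% and 6.5%.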

Significance. If the reported gains can be isolated to the proposed components, the work would demonstrate that compact, domain-structured task heads can meaningfully improve adaptation of EEG foundation models while preserving the backbone's representations. This would be a practical contribution to affective computing pipelines that rely on pre-trained models.

major comments (1)

[Abstract] Abstract (results paragraph): The central claim attributes the 5.91 pp and 2.03 pp balanced-accuracy improvements specifically to the multi-scale causal graph convolution, learnable inter-channel connectivity modeling, and KAN components. However, the manuscript supplies only the end-to-end comparison against the linear baseline; no ablation studies that remove or replace individual modules, no matched-capacity controls, no optimizer/training-protocol details, and no statistical significance tests on the deltas are referenced. This prevents verification that the observed lifts arise from the claimed mechanisms rather than other experimental factors.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive feedback. We address the major comment on the attribution of performance gains point by point below.

Referee: [Abstract] Abstract (results paragraph): The central claim attributes the 5.91 pp and 2.03 pp balanced-accuracy improvements specifically to the multi-scale causal graph convolution, learnable inter-channel connectivity modeling, and KAN components. However, the manuscript supplies only the end-to-end comparison against the linear baseline; no ablation studies that remove or replace individual modules, no matched-capacity controls, no optimizer/training-protocol details, and no statistical significance tests on the deltas are referenced. This prevents verification that the observed lifts arise from the claimed mechanisms rather than other experimental factors.

Authors: We agree that the current version reports only the end-to-end comparison and does not include the requested controls. In the revised manuscript we will add (i) ablation variants that successively remove the multi-scale causal graph convolution, the learnable inter-channel connectivity, and the KAN mapping, (ii) matched-capacity MLP and linear baselines trained under identical protocols, (iii) explicit optimizer, learning-rate schedule, and training-hyperparameter details, and (iv) statistical significance tests (paired t-test or Wilcoxon signed-rank) on the reported deltas. These additions will allow direct verification that the observed gains are attributable to the proposed components. revision: yes
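The two tests named in the rebuttal can be sketched in pure Python for small paired samples, such as per-fold or per-seed balanced accuracies of the proposed head versus the linear baseline. This is a hedged illustration, not the authors' analysis: the function names are hypothetical, the Wilcoxon p-value is computed by brute-force enumeration (practical only for roughly 20 pairs or fewer), and converting the t statistic to a p-value would still require a t distribution with n - 1 degrees of freedom.

```python
import math
from itertools import product
from typing import Sequence


def paired_t_statistic(a: Sequence[float], b: Sequence[float]) -> float:
    """t = mean(d) / (sd(d) / sqrt(n)) for d = a - b; assumes the differences are not all equal."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean = sum(d) / n
    var = sum((v - mean) ** 2 for v in d) / (n - 1)  # unbiased sample variance
    return mean / math.sqrt(var / n)


def wilcoxon_exact_p(a: Sequence[float], b: Sequence[float]) -> float:
    """Two-sided exact signed-rank p-value by enumerating all sign assignments."""
    d = [x - y for x, y in zip(a, b) if x != y]      # zero differences are dropped
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):                            # assign average ranks to ties
        j = i
        while j + 1 < len(order) and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    w_plus = sum(r for r, v in zip(ranks, d) if v > 0)
    center = sum(ranks) / 2                          # expected W+ under the null
    observed = abs(w_plus - center)
    hits = 0
    for signs in product((0, 1), repeat=len(d)):
        w = sum(r for r, s in zip(ranks, signs) if s)
        if abs(w - center) >= observed - 1e-12:
            hits += 1
    return hits / 2 ** len(d)
```

With only five paired runs that all favor one method, the smallest two-sided exact p-value the signed-rank test can return is 2/32 = 0.0625, so the number of folds or seeds limits how strong a significance claim can be.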

Circularity Check

0 steps flagged

No circularity: empirical performance claims rest on external dataset comparisons, not self-referential definitions or fitted inputs

The paper reports balanced-accuracy gains of 5.91 pp and 2.03 pp on FACED and SEED-VII relative to a CBraMod+Linear baseline. These are presented as experimental outcomes from fine-tuning a pre-trained backbone with an added task head; no equations, parameter-fitting steps, or derivations are supplied in the abstract that would allow any reported metric to reduce to its own inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked. The central claim therefore remains an ordinary empirical comparison whose validity can be checked against the stated datasets and protocols, satisfying the self-contained criterion.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract supplies no information on free parameters, axioms, or invented entities; all such elements remain unknown.

