pith. sign in

arxiv: 2606.16615 · v2 · pith:ZX4X7C4Anew · submitted 2026-06-15 · 💻 cs.CV

SUP-MCRL: Subject-aware Unified Pseudo-feature Coded Multimodal Contrastive Representation Learning for EEG Visual Decoding

Pith reviewed 2026-06-27 03:04 UTC · model grok-4.3

classification 💻 cs.CV
keywords EEG visual decodingmultimodal contrastive learningzero-shot learningbrain-computer interfacenatural imagessemantic alignmentsubject variabilityTHINGS-EEG
0
0 comments X

The pith

Structured alignment supervision via semantic attention and pseudo-feature coding overcomes geometric-only limitations in EEG decoding of natural images.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Standard multimodal contrastive models align EEG signals to images only by minimizing geometric distances, which ignores semantic content in the visuals and variability across subjects, causing many incorrect zero-shot matches on natural scenes. The paper introduces SUP-MCRL to supply more structured supervision through three linked components that learn semantic spatial attention, adapt EEG features across subjects with multi-scale and attention operations, and maintain a running pool of pseudo-features to stabilize training. Experiments on the THINGS-EEG dataset report 66.0 percent top-1 and 91.9 percent top-5 accuracy for the same subject, plus 24.0 percent top-1 and 52.9 percent top-5 when leaving one subject out, both well above prior methods. A reader would care because the result suggests a concrete route to making non-invasive brain-computer interfaces work with everyday visual stimuli rather than only controlled lab pictures.

Core claim

The paper claims that a unified multimodal contrastive framework called SUP-MCRL, built around a Semantic-entity Aware Visual Encoder that extracts semantic content via learned spatial attention, a Unified EEG Enhancer that applies multi-scale atrous convolutions and inter-band attention for cross-subject robustness, and a Prototype-based Progressive Augmenter that maintains an EMA-updated pseudo-feature pool, produces subject-aware representations that achieve markedly higher zero-shot accuracy on natural-image EEG decoding than models limited to geometric alignment.

What carries the argument

The three collaborative mechanisms SAVE, UEE, and PPA inside SUP-MCRL that together supply structured alignment supervision beyond pure geometric distance optimization.

If this is right

  • The framework reaches 66.0 percent top-1 and 91.9 percent top-5 intra-subject accuracy on THINGS-EEG natural images.
  • Leave-one-subject-out performance reaches 24.0 percent top-1 and 52.9 percent top-5, showing better cross-subject generalization.
  • Accounting for semantic consistency and inter-subject variability reduces spurious zero-shot matches compared with geometric-only contrastive models.
  • The same structured supervision approach yields consistent gains across both intra-subject and leave-one-subject-out protocols.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The pseudo-feature pool mechanism might stabilize contrastive training in other modalities where representation collapse is common.
  • If the gains hold, the same attention and enhancement blocks could be tested on EEG decoding of non-visual stimuli.
  • Online updating of the pseudo-feature pool suggests a possible path toward real-time adaptation in deployed brain-computer interfaces.
  • The emphasis on subject-aware robustness points to value in combining this method with few-shot personalization techniques.

Load-bearing premise

The accuracy gains come chiefly from the three proposed mechanisms rather than from other details of training, data handling, or evaluation on the THINGS-EEG dataset.

What would settle it

A controlled re-run of a baseline multimodal contrastive model on identical THINGS-EEG splits and protocol, without SAVE, UEE, or PPA, that reaches comparable intra-subject and LOSO top-1 and top-5 accuracies.

Figures

Figures reproduced from arXiv: 2606.16615 by Hongjie Yan, Nizhuan Wang, Shengyu Gong, Wai Ting Siok, Weiming Zeng, Yueyang Li, Zijian Kang.

Figure 1
Figure 1. Figure 1: The overall framework of the proposed SUP-MCRL. The model consists of an image branch and an EEG [PITH_FULL_IMAGE:figures/full_fig_p002_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: Overall architecture of the proposed Semantic-Entity Aware Visual Encoder (SAVE). (A) The main encoder– [PITH_FULL_IMAGE:figures/full_fig_p004_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Overall architecture of the proposed EEG encoder. [PITH_FULL_IMAGE:figures/full_fig_p008_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Main pipeline of the Prototype-based Progressive Augmenter (PPA). [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗
Figure 5
Figure 5. Figure 5: The overall architecture of the proposed hierarchical codebook framework, consisting of [PITH_FULL_IMAGE:figures/full_fig_p011_5.png] view at source ↗
Figure 6
Figure 6. Figure 6: Visual comparison of images enhanced by the SAVE module. Top: original images; bottom: SAVE-enhanced [PITH_FULL_IMAGE:figures/full_fig_p015_6.png] view at source ↗
Figure 8
Figure 8. Figure 8: Distribution of temperature-scaled cosine similar [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗
Figure 14
Figure 14. Figure 14: PIES time-gate heatmap (channels vs. down [PITH_FULL_IMAGE:figures/full_fig_p016_14.png] view at source ↗
Figure 10
Figure 10. Figure 10: Adaptive channel scale factors. The red dashed [PITH_FULL_IMAGE:figures/full_fig_p017_10.png] view at source ↗
Figure 11
Figure 11. Figure 11: Channel-wise attention weights from the en [PITH_FULL_IMAGE:figures/full_fig_p017_11.png] view at source ↗
Figure 17
Figure 17. Figure 17: Frobenius norms of band-specific convolutional [PITH_FULL_IMAGE:figures/full_fig_p017_17.png] view at source ↗
Figure 15
Figure 15. Figure 15: Fusion weight allocation between the frequency [PITH_FULL_IMAGE:figures/full_fig_p018_15.png] view at source ↗
Figure 19
Figure 19. Figure 19: Category-reordered similarity heatmap of the [PITH_FULL_IMAGE:figures/full_fig_p018_19.png] view at source ↗
read the original abstract

Non-invasive brain-computer interfaces exhibit significant performance degradation when moving from controlled laboratory stimuli to real-world natural images. This degradation occurs because conventional multimodal contrastive representation learning models focus exclusively on optimizing geometric distance alignment, thereby failing to account for semantic consistency and inter-subject variability in neural representation and selective attention. As a result, these models are prone to producing spurious zero-shot matches. To address these limitations, we propose SUP-MCRL, a unified framework integrating three collaborative mechanisms: (1) a Semantic-entity Aware Visual Encoder (SAVE) that learns spatial attention to extract semantic content without relying on pre-trained saliency models; (2) a Unified EEG Enhancer (UEE) that employs multi-scale atrous convolutions and inter-band attention for adaptive cross-subject robustness; and (3) a Prototype-based Progressive Augmenter (PPA) that maintains an EMA-updated pseudo-feature pool to prevent representation collapse. Zero-shot experiments on the THINGS-EEG achieve 66.0%/91.9% (Top-1/Top-5) intra-subject and 24.0%/52.9% LOSO accuracy, significantly surpassing state-of-the-art methods and demonstrating that structured alignment supervision is key to overcoming the limitations of cross-modal decoding. Code is available at https://github.com/NZWANG/SUP-MCRL.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces SUP-MCRL, a multimodal contrastive representation learning framework for EEG-based visual decoding of natural images. It integrates three mechanisms—Semantic-entity Aware Visual Encoder (SAVE) for semantic spatial attention, Unified EEG Enhancer (UEE) using multi-scale atrous convolutions and inter-band attention, and Prototype-based Progressive Augmenter (PPA) with EMA-updated pseudo-feature pools—to address limitations in geometric alignment, semantic consistency, and inter-subject variability. Zero-shot experiments on THINGS-EEG report intra-subject accuracies of 66.0% Top-1 / 91.9% Top-5 and LOSO accuracies of 24.0% Top-1 / 52.9% Top-5, claiming these significantly surpass prior state-of-the-art methods due to the structured alignment supervision.

Significance. If the reported gains prove robustly attributable to SAVE, UEE, and PPA rather than implementation details, the work could meaningfully advance non-invasive BCI by improving cross-modal decoding for real-world stimuli. Public code release at the cited GitHub repository is a clear strength that aids reproducibility and allows independent verification of the central claims.

major comments (2)
  1. [Abstract and Experiments] Abstract and Experiments: The central claim attributes the 66.0%/91.9% intra-subject and 24.0%/52.9% LOSO gains (and SOTA superiority) specifically to the three proposed mechanisms providing structured alignment supervision. No ablation studies isolating the contribution of SAVE, UEE, or PPA (e.g., variants with each component removed) are described, leaving open the possibility that gains arise from unstated factors such as optimizer choices, loss scaling, data augmentation, or subject-split details on THINGS-EEG.
  2. [Results] Results: The reported accuracy numbers are presented without error bars, standard deviations across multiple runs, or statistical significance tests against baselines. This undermines the load-bearing claim that the results 'significantly surpass state-of-the-art methods,' as it is impossible to assess whether observed differences exceed expected variability.
minor comments (1)
  1. [Abstract] The abstract states that code is available, which is helpful; the full manuscript should expand the methods section with complete training protocol, hyperparameter values, exact data splits, and evaluation details to support reproduction.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive comments, which help strengthen the manuscript. We address each major point below and commit to revisions that directly respond to the concerns raised.

read point-by-point responses
  1. Referee: [Abstract and Experiments] Abstract and Experiments: The central claim attributes the 66.0%/91.9% intra-subject and 24.0%/52.9% LOSO gains (and SOTA superiority) specifically to the three proposed mechanisms providing structured alignment supervision. No ablation studies isolating the contribution of SAVE, UEE, or PPA (e.g., variants with each component removed) are described, leaving open the possibility that gains arise from unstated factors such as optimizer choices, loss scaling, data augmentation, or subject-split details on THINGS-EEG.

    Authors: We agree that ablation studies are required to isolate the contributions of SAVE, UEE, and PPA. The revised manuscript will add a dedicated ablation section with experiments that remove each component individually (and in combinations) while keeping all other implementation details fixed. Results will be reported on the same THINGS-EEG splits to quantify the performance drop attributable to each mechanism. revision: yes

  2. Referee: [Results] Results: The reported accuracy numbers are presented without error bars, standard deviations across multiple runs, or statistical significance tests against baselines. This undermines the load-bearing claim that the results 'significantly surpass state-of-the-art methods,' as it is impossible to assess whether observed differences exceed expected variability.

    Authors: We acknowledge that the absence of variability measures and statistical tests weakens the strength of the superiority claims. In the revision we will rerun all experiments across multiple random seeds, report mean accuracies with standard deviations and error bars, and include statistical significance tests (e.g., paired t-tests or Wilcoxon tests with p-values) against the reproduced baselines. revision: yes

Circularity Check

0 steps flagged

No significant circularity: claims rest on experimental results

full rationale

The paper presents its core claims as empirical outcomes: zero-shot accuracies of 66.0%/91.9% intra-subject and 24.0%/52.9% LOSO on the public THINGS-EEG dataset, attributed to the three proposed mechanisms (SAVE, UEE, PPA). No equations, fitted parameters, or self-citations are shown in the provided text that reduce these measured accuracies to inputs by construction. The derivation chain consists of architectural descriptions followed by benchmark evaluation, which is self-contained against external data and does not exhibit self-definitional, fitted-prediction, or load-bearing self-citation patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no explicit free parameters, axioms, or invented entities; typical deep-learning models contain many hyperparameters whose values are not stated here.

pith-pipeline@v0.9.1-grok · 5797 in / 1187 out tokens · 50993 ms · 2026-06-27T03:04:48.692954+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Reference graph

Works this paper leans on

37 extracted references · 5 linked inside Pith

  1. [1]

    Yueyang Li, Weiming Zeng, Wenhao Dong, Di Han, Lei Chen, Hongyu Chen, Zijian Kang, Shengyu Gong, Hongjie Yan, Wai Ting Siok, and Nizhuan Wang. A tale of single-channel electroencephalography: Devices, datasets, signal processing, applications, and future directions.IEEE Transactions on Instrumentation and Measurement, 74:1–20, 2025

  2. [2]

    Linguistics and human brain: A perspective of computational neuroscience.arXiv preprint arXiv:2602.08275, 2026

    Fudong Zhang, Bo Chai, Yujie Wu, Wai Ting Siok, and Nizhuan Wang. Linguistics and human brain: A perspective of computational neuroscience.arXiv preprint arXiv:2602.08275, 2026

  3. [3]

    Learning transferable visual models from natural language supervision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In Marina Meila and Tong Zhang, editors,Proceedings of the 38th International Conference on Machine ...

  4. [4]

    Mind the gap: Understanding the modality gap in multi-modal contrastive representation learning

    Victor Weixin Liang, Yuhui Zhang, Yongchan Kwon, Serena Yeung, and James Zou. Mind the gap: Understanding the modality gap in multi-modal contrastive representation learning. In S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, editors,Advances in Neural Information Processing Systems, volume 35, pages 17612–17625. Curran Associates, Inc., 2022

  5. [5]

    Mitigate the gap: Improving cross-modal alignment in clip

    Sedigheh Eslami and Gerard de Melo. Mitigate the gap: Improving cross-modal alignment in clip. InThe Thirteenth International Conference on Learning Representations, 2025. 19 SUP-MCRL: Subject-aware Unified Pseudo-feature Coded Multimodal Contrastive Representation Learning for EEG Visual Decoding

  6. [6]

    Causality-inspired brain-visual contrastive learning for zero-shot visual decoding.Knowledge-Based Systems, 346:116182, 2026

    Yi Xiao, Xuyi Qiao, Yu-Xuan Zhang, and Xianchuan Yu. Causality-inspired brain-visual contrastive learning for zero-shot visual decoding.Knowledge-Based Systems, 346:116182, 2026

  7. [7]

    A large and rich eeg dataset for modeling human visual object recognition.NeuroImage, 264:119754, 2022

    Alessandro T Gifford, Kshitij Dwivedi, Gemma Roig, and Radoslaw M Cichy. A large and rich eeg dataset for modeling human visual object recognition.NeuroImage, 264:119754, 2022

  8. [8]

    Neural-mcrl: Neural multimodal contrastive representation learning for eeg-based visual decoding

    Yueyang Li, Zijian Kang, Shengyu Gong, Wenhao Dong, Weiming Zeng, Hongjie Yan, Wai Ting Siok, and Nizhuan Wang. Neural-mcrl: Neural multimodal contrastive representation learning for eeg-based visual decoding. In2025 IEEE International Conference on Multimedia and Expo (ICME), pages 1–6, 2025

  9. [9]

    An image is worth 16x16 words: Transformers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020

    Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020

  10. [10]

    Changde Du, Kaicheng Fu, Jinpeng Li, and Huiguang He. Decoding visual neural representations by multimodal learning of brain-visual-linguistic features.IEEE Transactions on Pattern Analysis and Machine Intelligence, 45(9):10760–10777, 2023

  11. [11]

    Decoding natural images from eeg for object recognition

    Yonghao Song, Bingchuan Liu, Xiang Li, Nanlin Shi, Yijun Wang, and Xiaorong Gao. Decoding natural images from eeg for object recognition. In B. Kim, Y . Yue, S. Chaudhuri, K. Fragkiadaki, M. Khan, and Y . Sun, editors, International Conference on Learning Representations, volume 2024, pages 47648–47665, 2024

  12. [12]

    Neuro-3d: Towards 3d visual decoding from eeg signals

    Zhanqiang Guo, Jiamin Wu, Yonghao Song, Jiahui Bu, Weijian Mai, Qihao Zheng, Wanli Ouyang, and Chunfeng Song. Neuro-3d: Towards 3d visual decoding from eeg signals. In2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 23870–23880, 2025

  13. [13]

    Eeg-driven natural image reconstruc- tion with regional semantic awareness.Pattern Recognition, 172:112589, 2026

    Xin Xiang, Wenhui Zhou, Haonan Zhu, Yunrui Li, Guojun Dai, and Lili Lin. Eeg-driven natural image reconstruc- tion with regional semantic awareness.Pattern Recognition, 172:112589, 2026

  14. [14]

    Eeg2vision: A multimodal eeg-based framework for 2d visual reconstruction in cognitive neuroscience.arXiv preprint arXiv:2604.08063, 2026

    Emanuele Balloni, Emanuele Frontoni, Chiara Matti, Marina Paolanti, Roberto Pierdicca, and Emiliano Santarnec- chi. Eeg2vision: A multimodal eeg-based framework for 2d visual reconstruction in cognitive neuroscience.arXiv preprint arXiv:2604.08063, 2026

  15. [15]

    Visual decoding and reconstruction via eeg embeddings with guided diffusion.arXiv preprint arXiv:2403.07721, 2024

    Dongyang Li, Chen Wei, Shiying Li, Jiachen Zou, Haoyang Qin, and Quanying Liu. Visual decoding and reconstruction via eeg embeddings with guided diffusion.arXiv preprint arXiv:2403.07721, 2024

  16. [16]

    Neurobridge: Bio-inspired self-supervised eeg-to-image decoding via cognitive priors and bidirectional semantic alignment

    Wenjiang Zhang, Sifeng Wang, Yuwei Su, Xinyu Li, Chen Zhang, and Suyu Zhong. Neurobridge: Bio-inspired self-supervised eeg-to-image decoding via cognitive priors and bidirectional semantic alignment. InProceedings of the AAAI Conference on Artificial Intelligence, volume 40, pages 18028–18036, 2026

  17. [17]

    Damind: Zero-shot visual cross-domain alignment and representation for eeg decoding.IEEE Transactions on Image Processing, 35:3214–3227, 2026

    Haodong Jing, Yongqiang Ma, Panqi Yang, Haoyu Li, Shuai Huang, Badong Chen, and Nanning Zheng. Damind: Zero-shot visual cross-domain alignment and representation for eeg decoding.IEEE Transactions on Image Processing, 35:3214–3227, 2026

  18. [18]

    Mindsae: Advancing semantic perception for m/eeg-based visual decoding via unified multimodal alignment framework.Biomedical Signal Processing and Control, 123:110390, 2026

    Chengjian Xu, Yonghao Song, Qiong Wang, and Qingqing Zheng. Mindsae: Advancing semantic perception for m/eeg-based visual decoding via unified multimodal alignment framework.Biomedical Signal Processing and Control, 123:110390, 2026

  19. [19]

    Need: Cross-subject and cross-task generalization for video and image reconstruction from eeg signals

    Shuai Huang, Huan Luo, Haodong Jing, Qixian Zhang, Litao Chang, Yating Feng, Xiao Lin, Chendong Qin, Han Chen, Shuwen Jia, Siyi Sun, and Yongxiong Wang. Need: Cross-subject and cross-task generalization for video and image reconstruction from eeg signals. In D. Belgrave, C. Zhang, H. Lin, R. Pascanu, P. Koniusz, M. Ghassemi, and N. Chen, editors,Advances ...

  20. [20]

    Wei Li, Penglu Zhao, Cheng Xu, Yingting Hou, Wenhao Jiang, and Aiguo Song. Deep learning for eeg-based visual classification and reconstruction: Panorama, trends, challenges and opportunities.IEEE Transactions on Biomedical Engineering, 72(11):3374–3390, 2025

  21. [21]

    Interpretable cross-modal alignment network for eeg visual decoding with algorithm unrolling.IEEE Transactions on Neural Networks and Learning Systems, 36(11):19894–19908, 2025

    Daowen Xiong, Liangliang Hu, Jiahao Jin, Yikang Ding, Congming Tan, Jing Zhang, and Yin Tian. Interpretable cross-modal alignment network for eeg visual decoding with algorithm unrolling.IEEE Transactions on Neural Networks and Learning Systems, 36(11):19894–19908, 2025

  22. [22]

    Stambridge: Spectral-temporal amplitude-aware mid-feature bridge for eeg visual decoding.arXiv preprint arXiv:2605.23137, 2026

    Jiahe Meng, Weiming Zeng, Yueyang Li, Bo Chai, Hongjie Yan, Zhiguo Zhang, Wai Ting Siok, and Nizhuan Wang. Stambridge: Spectral-temporal amplitude-aware mid-feature bridge for eeg visual decoding.arXiv preprint arXiv:2605.23137, 2026

  23. [23]

    Seeeeg: Semantic-aware eeg-based multi-modal retrieval-augmented generation for high-fidelity visual brain decoding

    Jun-Mo Kim, Woohyeok Choi, Sang-Jun Park, Keun-Soo Heo, Young-Han Son, Ji-Hye Oh, Dong-Hee Shin, and Tae-Eui Kam. Seeeeg: Semantic-aware eeg-based multi-modal retrieval-augmented generation for high-fidelity visual brain decoding. In2025 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), pages 4883–4892, 2025. 20 SUP-MCRL: Subject-awa...

  24. [24]

    Eeg conformer: Convolutional transformer for eeg decoding and visualization.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31:710–719, 2023

    Yonghao Song, Qingqing Zheng, Bingchuan Liu, and Xiaorong Gao. Eeg conformer: Convolutional transformer for eeg decoding and visualization.IEEE Transactions on Neural Systems and Rehabilitation Engineering, 31:710–719, 2023

  25. [25]

    Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra

    Ramprasaath R. Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In2017 IEEE International Conference on Computer Vision (ICCV), pages 618–626, 2017

  26. [26]

    Eeg-itnet: An explainable inception temporal convolutional network for motor imagery classification.IEEE Access, 10:36672–36685, 2022

    Abbas Salami, Javier Andreu-Perez, and Helge Gillmeister. Eeg-itnet: An explainable inception temporal convolutional network for motor imagery classification.IEEE Access, 10:36672–36685, 2022

  27. [27]

    Xiong Xiong, Li Su, Jinjie Guo, Tianyuan Song, Ying Wang, Jinguo Huang, and Guixia Kang. Enhancing motor imagery decoding in brain–computer interfaces using riemann tangent space mapping and cross frequency coupling.Biomedical Signal Processing and Control, 99:106797, 2025

  28. [28]

    Dtp-net: Learning to reconstruct eeg signals in time-frequency domain by multi-scale feature reuse.IEEE Journal of Biomedical and Health Informatics, 28(5):2662–2673, 2024

    Yan Pei, Jiahui Xu, Qianhao Chen, Chenhao Wang, Feng Yu, Lisan Zhang, and Wei Luo. Dtp-net: Learning to reconstruct eeg signals in time-frequency domain by multi-scale feature reuse.IEEE Journal of Biomedical and Health Informatics, 28(5):2662–2673, 2024

  29. [29]

    O’Connor, and Kevin McGuinness

    Eric Arazo, Diego Ortego, Paul Albert, Noel E. O’Connor, and Kevin McGuinness. Pseudo-labeling and confirmation bias in deep semi-supervised learning. In2020 International Joint Conference on Neural Networks (IJCNN), pages 1–8, 2020

  30. [30]

    Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks

    Dong-Hyun Lee et al. Pseudo-label: The simple and efficient semi-supervised learning method for deep neural networks. InWorkshop on challenges in representation learning, ICML, volume 3, page 896. Atlanta, 2013

  31. [31]

    Neurodecoder: A new framework for image decoding and reconstruction of eeg signals.IEEE Journal of Biomedical and Health Informatics, pages 1–14, 2026

    Wenxuan Ma, Hongxin Zhang, Yexuan Li, and Mingyi Wei. Neurodecoder: A new framework for image decoding and reconstruction of eeg signals.IEEE Journal of Biomedical and Health Informatics, pages 1–14, 2026

  32. [32]

    Representation learning with contrastive predictive coding

    Aaron van den Oord, Yazhe Li, and Oriol Vinyals. Representation learning with contrastive predictive coding. arXiv preprint arXiv:1807.03748, 2018

  33. [33]

    Lel: Lipschitz continuity constrained ensemble learning for efficient eeg-based intrasubject emotion recognition.IEEE Sensors Journal, 26(9):13446–13456, 2026

    Shengyu Gong, Yueyang Li, Zijian Kang, Bo Chai, Weiming Zeng, Hongjie Yan, Zhiguo Zhang, Wai Ting Siok, and Nizhuan Wang. Lel: Lipschitz continuity constrained ensemble learning for efficient eeg-based intrasubject emotion recognition.IEEE Sensors Journal, 26(9):13446–13456, 2026

  34. [34]

    Mb2c: Multimodal bidirectional cycle consistency for learning robust visual neural representations

    Yayun Wei, Lei Cao, Hao Li, and Yilin Dong. Mb2c: Multimodal bidirectional cycle consistency for learning robust visual neural representations. InProceedings of the 32nd ACM International Conference on Multimedia, MM ’24, page 8992–9000, New York, NY , USA, 2024. Association for Computing Machinery

  35. [35]

    Visual neural decoding via improved visual-eeg semantic consistency.arXiv preprint arXiv:2408.06788, 2024

    Hongzhou Chen, Lianghua He, Yihang Liu, Longzhen Yang, Shaohua Shang, and MengChu Zhou. Visual neural decoding via improved visual-eeg semantic consistency.arXiv preprint arXiv:2408.06788, 2024

  36. [36]

    Bridging the vision-brain gap with an uncertainty-aware blur prior

    Haitao Wu, Qing Li, Changqing Zhang, Zhen He, and Xiaomin Ying. Bridging the vision-brain gap with an uncertainty-aware blur prior. In2025 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 2246–2257, 2025

  37. [37]

    Spatial-functional awareness transformer-based graph archetype contrastive learning for decoding visual neural representations from eeg.arXiv preprint arXiv:2509.24761, 2025

    Yueming Sun and Long Yang. Spatial-functional awareness transformer-based graph archetype contrastive learning for decoding visual neural representations from eeg.arXiv preprint arXiv:2509.24761, 2025. 21 SUP-MCRL: Subject-aware Unified Pseudo-feature Coded Multimodal Contrastive Representation Learning for EEG Visual Decoding Appendix Supplementary Mater...