
arxiv: 2506.18962 · v2 · submitted 2025-06-23 · 💻 cs.HC

UniMind: Unleashing the Power of LLMs for Unified Multi-Task Brain Decoding

Pith reviewed 2026-05-19 07:40 UTC · model grok-4.3

classification 💻 cs.HC
keywords EEG decoding · multi-task brain decoding · large language models · neuro-language connector · task-aware query selection · foundation model · brain-computer interface

The pith

UniMind connects EEG brain signals to large language models for decoding many tasks at once without separate tuning per task.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents UniMind as a foundation model that unifies brain decoding from EEG across heterogeneous tasks by tapping into the reasoning power of large language models. It targets the problem that prior multi-task EEG models still need per-task retraining due to differences in what each decoding task asks of the brain signals. A reader would care if this works because it could make brain-signal applications like mental-state monitoring or human-machine interfaces more practical and general. The method relies on turning raw EEG patterns into a form language models can use and on selecting task-specific queries to focus the alignment.

Core claim

UniMind is a general-purpose EEG foundation model for unified multi-task brain decoding built on large language models. It has two main parts. A Neuro-Language Connector distills spatiotemporal neural patterns from EEG and transforms them into representations a language model can process. A Task-aware Query Selection module generates dynamic, task-adaptive query tokens so the model can learn task-relevant patterns across diverse tasks. The paper reports an average 12 percent gain over prior multi-task models on ten datasets, along with neuroscientific insights into functional correlations across tasks.

What carries the argument

Neuro-Language Connector that bridges EEG spatiotemporal patterns to language-model inputs, combined with Task-aware Query Selection that injects task awareness through adaptive query tokens for cross-modal alignment.
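
As a rough illustration of how a shared query pool and cross-attention could fit together, the Python sketch below picks a few queries from a pool based on the EEG embeddings and lets each one attend over the EEG tokens. Everything in it is an assumption made for illustration: the dot-product router, the top-k rule, the names `select_queries` and `cross_attend`, and the random vectors standing in for learned parameters and encoded EEG. The paper's actual modules may differ.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_queries(query_pool, eeg_tokens, k):
    """Hypothetical router: keep the k pool queries closest to the mean EEG embedding."""
    dim = len(eeg_tokens[0])
    summary = [sum(tok[i] for tok in eeg_tokens) / len(eeg_tokens) for i in range(dim)]
    ranked = sorted(range(len(query_pool)),
                    key=lambda j: dot(query_pool[j], summary), reverse=True)
    return ranked[:k]


def cross_attend(queries, eeg_tokens):
    """Each query returns a softmax-weighted average of the EEG tokens."""
    dim = len(eeg_tokens[0])
    scale = math.sqrt(dim)
    outputs = []
    for q in queries:
        weights = softmax([dot(q, tok) / scale for tok in eeg_tokens])
        outputs.append([sum(w * tok[i] for w, tok in zip(weights, eeg_tokens))
                        for i in range(dim)])
    return outputs


# Random stand-ins for a learned query pool and for encoded EEG (assumed shapes).
rng = random.Random(0)
dim, n_q, n_tokens, k = 8, 16, 20, 4
query_pool = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_q)]
eeg_tokens = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_tokens)]

chosen = select_queries(query_pool, eeg_tokens, k)
llm_inputs = cross_attend([query_pool[j] for j in chosen], eeg_tokens)
print(len(llm_inputs), len(llm_inputs[0]))  # 4 8
```

The output has a fixed number of vectors however many EEG tokens come in, which is one common reason for placing a query-based connector in front of a language model.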

If this is right

  • A single model can handle multiple EEG decoding tasks without retraining for each new task.
  • Average performance improves by 12 percent across diverse datasets compared with prior multi-task approaches.
  • The same architecture yields measurable insights into how neural activity patterns relate across different brain tasks.
  • Applications in clinical monitoring and human-machine interaction become feasible with less customization.
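
An "average 12 percent gain" can be read as 12 percentage points or as a 12 percent relative improvement, and the two can differ noticeably. The sketch below computes both for made-up scores. The dataset names and numbers are hypothetical and are not the paper's results; `average_gains` is an illustrative helper.

```python
def average_gains(baseline, model):
    """Return (mean absolute gain in points, mean relative gain in percent)."""
    names = sorted(baseline)
    absolute = [model[n] - baseline[n] for n in names]
    relative = [100.0 * (model[n] - baseline[n]) / baseline[n] for n in names]
    return sum(absolute) / len(names), sum(relative) / len(names)


# Hypothetical accuracies in percent, NOT taken from the paper.
baseline = {"dataset_a": 50.0, "dataset_b": 60.0, "dataset_c": 80.0}
model = {"dataset_a": 62.0, "dataset_b": 72.0, "dataset_c": 92.0}

abs_gain, rel_gain = average_gains(baseline, model)
print(f"{abs_gain:.1f} points, {rel_gain:.1f}% relative")  # 12.0 points, 19.7% relative
```

With these toy numbers the same results give 12.0 points or 19.7 percent depending on the convention, so it helps to state which one a reported gain uses.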

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Direct EEG-to-LLM connections could let future systems draw on the semantic knowledge already inside language models when interpreting brain signals.
  • If the connector proves robust, the approach might extend to combining EEG with other signals such as fMRI or eye tracking in one model.
  • The reported neural correlations across tasks could be tested by checking whether they predict behavior in new experimental designs.

Load-bearing premise

The Neuro-Language Connector successfully distills and transforms the spatiotemporal neural patterns of EEG data into representations understandable by language models, and the Task-aware Query Selection enables learning of task-relevant patterns across diverse heterogeneous tasks without task-specific tuning.

What would settle it

Running UniMind on a fresh collection of EEG datasets from previously unseen tasks and finding that it still requires task-specific fine-tuning or shows no consistent performance gain over strong baselines would falsify the unified decoding claim.

Figures

Figures reproduced from arXiv: 2506.18962 by Chunfeng Song, Jiamin Wu, Pengyu Zhu, Qihao Zheng, Wanli Ouyang, Weiheng Lu, Weijian Mai, Yuchen Zhou, Zhouheng Yao.

Figure 1: UniMind leverages LLMs to interpret brain signals, enabling multi-task EEG decoding
Figure 2: Overview of the UniMind architecture. Raw EEG signals are encoded into EEG embeddings
Figure 3: Configuration of Query Pool Size nq. Panels: (a) Average, (b) TUEV, (c) Workload, (d) SEED, (e) HMC, (f) TUSL. Y-axis: B-Acc (%). Legend: Size = 7B, 1.8B, 0.5B.
Figure 5: (a): t-SNE-based visualization of neural routing distributions across datasets; (b): task ...
Figure 6: (a): Similarity of task-adaptive query distributions across tasks; (b): The attention values of ...
Figure 7: Comparison of individual and joint training balanced accuracy across multiple tasks.
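
The figures report balanced accuracy (shown as B-Acc on the Figure 3 axes, assuming that abbreviation has its usual meaning). For readers unfamiliar with the metric, here is a minimal sketch of the standard definition, the mean of per-class recall. The labels and predictions are invented for illustration.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean per-class recall, so rare classes count as much as common ones."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical imbalanced example: a classifier that always predicts "normal".
y_true = ["normal"] * 8 + ["event"] * 2
y_pred = ["normal"] * 10

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(plain, balanced_accuracy(y_true, y_pred))  # 0.8 0.5
```

Plain accuracy rewards the always-"normal" classifier with 0.8, while balanced accuracy gives 0.5, which is why the metric is often preferred when class frequencies are skewed.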
Original abstract

Decoding human brain activity from electroencephalography (EEG) signals is a central challenge at the intersection of neuroscience and artificial intelligence, enabling diverse applications in mental state assessment, clinical monitoring, and human-machine interaction. Recent efforts have extensively explored EEG-based brain foundation models for generalized brain decoding, employing large-scale training on multiple datasets. However, most of these attempts struggle with generalizability and fail to achieve satisfactory performance without task-specific tuning due to pronounced inherent heterogeneity among decoding tasks. To address these challenges, we present UniMind, a general-purpose EEG foundation model for unified multi-task brain decoding by uniquely unleashing the power of large language models to comprehend complex neural patterns. UniMind offers several advantages. First, we design a Neuro-Language Connector to bridge the modality gap between neural signals and large language models, distilling and transforming the spatiotemporal neural patterns of EEG data into representations understandable by language models. Second, a Task-aware Query Selection module is proposed to inject task-awareness into the cross-modal alignment by dynamically generating task-adaptive query tokens, enabling learning of task-relevant neural patterns across diverse tasks. Extensive experiments across ten datasets demonstrate that UniMind substantially outperforms state-of-the-art multi-task decoding models, with an average gain of 12 percent, while also offering valuable neuroscientific insights into neural functional correlations across tasks. The code is available at https://github.com/kaleidoyao/UniMind.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper presents UniMind, a general-purpose EEG foundation model for unified multi-task brain decoding that integrates large language models. It introduces a Neuro-Language Connector to transform spatiotemporal EEG patterns into LLM-compatible representations and a Task-aware Query Selection module to dynamically generate task-adaptive query tokens for handling task heterogeneity. Experiments across ten external datasets report an average 12% performance gain over state-of-the-art multi-task decoding models, along with neuroscientific insights into neural functional correlations. Code is released at a public GitHub repository.

Significance. If the performance gains and truly unified (no task-specific tuning) nature of the model are substantiated, this could advance generalizable brain decoding by bridging neural signals with LLMs, supporting applications in mental state assessment, clinical monitoring, and human-machine interaction. The multi-dataset evaluation and open code are strengths that aid reproducibility and allow testing of the claimed generalization.

major comments (2)
  1. [Abstract and Method section on Task-aware Query Selection] The central claim of unified multi-task decoding 'without task-specific tuning' and 'general-purpose' applicability requires that task identity not be supplied at inference. The module is described as 'dynamically generating task-adaptive query tokens' to inject task-awareness. If these tokens derive from task embeddings or identifiers provided at test time (rather than being inferred solely from the EEG input), each evaluation becomes conditioned on task identity. This would make the 12% gain comparable only to task-aware baselines and would preclude zero-shot use on novel tasks, directly undermining the unified claim.
  2. [Results section (performance claims)] The abstract reports a 12% average gain, but the soundness assessment notes the absence of details on exact baselines, statistical significance tests, error bars, dataset characteristics, and handling of task heterogeneity. These elements are load-bearing for validating the outperformance and cross-task generalization assertions.
minor comments (2)
  1. [Method] The notation for the Neuro-Language Connector and query tokens could be formalized with explicit equations or pseudocode to improve clarity of the cross-modal alignment process.
  2. [Figures] Figure captions and axis labels in experimental result plots should explicitly state the metrics, number of runs, and whether error bars represent standard deviation or standard error.
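
Minor comment 2 distinguishes standard deviation from standard error. The sketch below shows how far apart the two can be for the same runs. The scores are hypothetical, and `error_bars` is an illustrative helper rather than anything from the paper's code.

```python
import math
import statistics


def error_bars(scores):
    """Return (mean, sample standard deviation, standard error of the mean)."""
    mean = statistics.mean(scores)
    std = statistics.stdev(scores)       # sample standard deviation (divides by n - 1)
    sem = std / math.sqrt(len(scores))   # standard error shrinks as runs are added
    return mean, std, sem


# Hypothetical balanced accuracies (%) from five runs, NOT the paper's numbers.
scores = [70.0, 72.0, 71.0, 69.0, 73.0]
mean, std, sem = error_bars(scores)
print(f"mean={mean:.1f} std={std:.2f} sem={sem:.2f}")  # mean=71.0 std=1.58 sem=0.71
```

Here the standard-error bar is less than half the width of the standard-deviation bar, so a caption that does not say which one is plotted can make results look tighter than they are.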

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the detailed and constructive review. The comments help clarify key aspects of our claims regarding unified multi-task decoding. We address each major comment point by point below and indicate the corresponding revisions to the manuscript.

Point-by-point responses
  1. Referee: [Abstract and Method section on Task-aware Query Selection] The central claim of unified multi-task decoding 'without task-specific tuning' and 'general-purpose' applicability requires that task identity not be supplied at inference. The module is described as 'dynamically generating task-adaptive query tokens' to inject task-awareness. If these tokens derive from task embeddings or identifiers provided at test time (rather than being inferred solely from the EEG input), each evaluation becomes conditioned on task identity. This would make the 12% gain comparable only to task-aware baselines and would preclude zero-shot use on novel tasks, directly undermining the unified claim.

    Authors: We appreciate this important point on the distinction between task-aware and truly unified decoding. In the UniMind architecture, the Task-aware Query Selection module operates by learning a dynamic selection process over a shared query pool that is conditioned solely on the input EEG spatiotemporal features via the Neuro-Language Connector; no explicit task identifiers, embeddings, or labels are provided or required at inference time. Task-adaptive behavior emerges from the learned alignment during multi-task training, enabling the model to handle heterogeneity without task-specific tuning or conditioning. This design supports the general-purpose claim and opens the possibility for zero-shot transfer to unseen tasks. To eliminate any ambiguity, we have expanded the Method section with a formal description of the inference procedure, including pseudocode showing that only raw EEG is input at test time, and we have updated the Abstract to explicitly state that task identity is not supplied. revision: yes

  2. Referee: [Results section (performance claims)] The abstract reports a 12% average gain, but the soundness assessment notes the absence of details on exact baselines, statistical significance tests, error bars, dataset characteristics, and handling of task heterogeneity. These elements are load-bearing for validating the outperformance and cross-task generalization assertions.

    Authors: We agree that these details are essential for rigorous validation. The original manuscript included baseline comparisons and dataset descriptions, but we acknowledge they were not presented with sufficient granularity. In the revised Results section we now provide: (i) a complete table listing all baselines with citations and implementation details, (ii) paired t-test results with p-values and confidence intervals for the 12% average improvement, (iii) error bars as standard deviation across 5 random seeds and subject-wise variability, (iv) a supplementary table with dataset characteristics (subject count, channel count, task labels, recording duration, and preprocessing), and (v) an extended analysis subsection explaining how the Neuro-Language Connector and Task-aware Query Selection jointly mitigate task heterogeneity. These additions directly substantiate the reported gains and generalization claims. revision: yes
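
The paired t-test over random seeds mentioned in response 2 can be sketched with the standard library alone. The scores below are hypothetical and are not results from the paper. `paired_t` is an illustrative helper, and the critical value is the usual two-sided 5% threshold for 4 degrees of freedom.

```python
import math
import statistics


def paired_t(model_scores, baseline_scores):
    """Return (t statistic, degrees of freedom) for paired samples."""
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    return statistics.mean(diffs) / se, n - 1


# Hypothetical balanced accuracies (%) over five seeds, NOT the paper's numbers.
model_scores = [71.0, 73.0, 72.0, 70.0, 74.0]
baseline_scores = [60.0, 61.0, 59.0, 62.0, 58.0]

t_stat, dof = paired_t(model_scores, baseline_scores)
T_CRIT_DF4 = 2.776  # two-sided 5% critical value of Student's t with 4 degrees of freedom
print(f"t = {t_stat:.2f}, dof = {dof}, significant = {abs(t_stat) > T_CRIT_DF4}")
```

Pairing by seed only makes sense if both models are run with matching seeds and data splits. Otherwise an unpaired test would be the safer choice.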

Circularity Check

0 steps flagged

No significant circularity; claims rest on external dataset experiments

full rationale

The paper presents UniMind as an architectural framework with a Neuro-Language Connector and Task-aware Query Selection module, validated through performance gains on ten external datasets. No equations or derivations reduce by construction to fitted inputs or self-definitions. Core claims of unified multi-task decoding without task-specific tuning are supported by empirical results rather than internal self-referential loops or load-bearing self-citations. The derivation chain remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 2 invented entities

The central claim rests on the effectiveness of newly introduced modules and the domain assumption that EEG signals contain shared patterns across tasks that LLMs can process after transformation.

free parameters (1)
  • Model hyperparameters and training parameters
    Deep learning models typically involve fitted hyperparameters not specified in the abstract.
axioms (1)
  • domain assumption: EEG signals contain decodable spatiotemporal patterns that can be unified across heterogeneous tasks
    This underpins the multi-task unified decoding approach described in the abstract.
invented entities (2)
  • Neuro-Language Connector (no independent evidence)
    purpose: Bridge the modality gap between EEG neural signals and large language models
    Newly proposed component to transform EEG data into LLM-compatible representations.
  • Task-aware Query Selection module (no independent evidence)
    purpose: Inject task-awareness into cross-modal alignment by generating task-adaptive query tokens
    Newly proposed module for dynamic adaptation across tasks.



Forward citations

Cited by 3 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity

    cs.LG 2026-04 unverdicted novelty 7.0

    NeuroFlow is the first unified flow model for bidirectional visual encoding and decoding from neural activity using NeuroVAE and cross-modal flow matching.

  2. Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs

    cs.AI 2026-05 unverdicted novelty 6.0

    Generative Visual Grounding creates visual proxy images from EEG to enhance MLLM understanding of brain signals beyond text-only alignment.

  3. SCOPE: Structured Prototype-Guided Adaptation for EEG Foundation Models with Limited Labels

    cs.LG 2026-02 unverdicted novelty 6.0

    SCOPE uses cohort-level external supervision, confidence-aware pseudo-labels, and a lightweight prototype-conditioned adapter (ProAdapter) to adapt frozen EEG foundation models in label-limited settings, reporting con...

Reference graph

Works this paper leans on

83 extracted references · 83 canonical work pages · cited by 3 Pith papers · 10 internal anchors

  1. [1]

    A review of feature extraction and performance evaluation in epileptic seizure detection using eeg

    Poomipat Boonyakitanont, Apiwat Lek-Uthai, Krisnachai Chomtho, and Jitkomut Songsiri. A review of feature extraction and performance evaluation in epileptic seizure detection using eeg. Biomedical Signal Processing and Control, 57:101702, 2020

  2. [2]

    Sleep stage classification using eeg signal analysis: A comprehensive survey and new investigation.Entropy, 18(9):272, 2016

    Khald Ali I Aboalayon, Miad Faezipour, Wafaa S Almuhammadi, and Saeid Moslehpour. Sleep stage classification using eeg signal analysis: A comprehensive survey and new investigation.Entropy, 18(9):272, 2016

  3. [3]

    Deep learning for eeg motor imagery classification based on multi-layer cnns feature fusion

    Syed Umar Amin, Mansour Alsulaiman, Ghulam Muhammad, Mohamed Amine Mekhtiche, and M Shamim Hossain. Deep learning for eeg motor imagery classification based on multi-layer cnns feature fusion. Future Generation Computer Systems, 101:542–554, 2019

  4. [4]

    Chrononet: A deep recurrent neural network for abnormal eeg identification

    Subhrajit Roy, Isabell Kiral-Kornek, and Stefan Harrer. Chrononet: A deep recurrent neural network for abnormal eeg identification. In International Conference on Artificial Intelligence in Medicine, volume 11526 of Lecture Notes in Computer Science, pages 47–56. Springer, 2019

  5. [5]

    Eeg-based emotion recognition: A state- of-the-art review of current trends and opportunities

    Nazmi Sofian Suhaimi, James Mountstephens, and Jason Teo. Eeg-based emotion recognition: A state- of-the-art review of current trends and opportunities. Computational Intelligence and Neuroscience , 2020

  6. [6]

    Evolutionary inspired approach for mental stress detection using eeg signal

    Lakhan Dev Sharma, Vijay Kumar Bohat, Maria Habib, Al-Zoubi Ala’M, Hossam Faris, and Ibrahim Aljarah. Evolutionary inspired approach for mental stress detection using eeg signal. Expert Systems with Applications, 197:116634, 2022

  7. [7]

    Development of expert- level classification of seizures and rhythmic and periodic patterns during eeg interpretation

    Jin Jing, Wendong Ge, Shenda Hong, Marta Bento Fernandes, Zhen Lin, et al. Development of expert- level classification of seizures and rhythmic and periodic patterns during eeg interpretation. Neurology, 100(17):e1750–e1762, 2023

  8. [8]

    Nagabushanam, S

    P. Nagabushanam, S. Thomas George, Praharsha Davu, P. Bincy, Meghana Naidu, and S. Radha. Artifact removal using elliptic filter and classification using 1d-cnn for eeg signals. In International Conference on Advanced Computing and Communication Systems (ICACCS), pages 551–556, 2020

  9. [9]

    Cnn and lstm-based emotion charting using physiological signals

    Muhammad Najam Dar, Muhammad Usman Akram, Sajid Gul Khawaja, and Amit N. Cnn and lstm-based emotion charting using physiological signals. Sensors, 20(16):4551, 2020

  10. [10]

    Atd: Augmenting cp tensor decomposition by self supervision

    Chaoqi Yang, Cheng Qian, Navjot Singh, Cao Xiao, M Brandon Westover, Edgar Solomonik, and Jimeng Sun. Atd: Augmenting cp tensor decomposition by self supervision. Advances in Neural Information Processing Systems, 2022

  11. [11]

    A study on user recognition using 2d ecg based on ensemble of deep convolutional neural networks

    Min-Gu Kim, Hoon Ko, and Sung Bum Pan. A study on user recognition using 2d ecg based on ensemble of deep convolutional neural networks. Journal of Ambient Intelligence and Humanized Computing , 11:1859–1867, 2020

  12. [12]

    Development of expert-level automated detection of epileptiform discharges during electroencephalogram interpretation

    Jin Jing, Haoqi Sun, Jennifer A Kim, et al. Development of expert-level automated detection of epileptiform discharges during electroencephalogram interpretation. JAMA Neurology, 77(1):103–108, 2020

  13. [13]

    Detection of obstructive sleep apnoea by ecg signals using deep learning architectures

    Haifa Almutairi, Ghulam Mubashar Hassan, and Amitava Datta. Detection of obstructive sleep apnoea by ecg signals using deep learning architectures. In European Signal Processing Conference , pages 1382–1386, 2021

  14. [14]

    Brandon Westover, Jimeng Sun, et al

    Chaoqi Yang, Cao Xiao, M. Brandon Westover, Jimeng Sun, et al. Self-supervised electroencephalogram representation learning for automatic sleep staging: Model development and evaluation study. JMIR AI, 2(1):e46769, 2023

  15. [15]

    Transformer convolutional neural networks for automated artifact detection in scalp eeg

    Wei Yan Peh, Yuanyuan Yao, and Justin Dauwels. Transformer convolutional neural networks for automated artifact detection in scalp eeg. In International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC), pages 3599–3602, 2022

  16. [16]

    Motor imagery eeg classification algorithm based on cnn-lstm feature fusion network

    Hongli Li, Man Ding, Ronghua Zhang, and Chunbo Xiu. Motor imagery eeg classification algorithm based on cnn-lstm feature fusion network. Biomedical Signal Processing and Control, 72:103342, 2022

  17. [17]

    arXiv preprint arXiv:2106.11170 (2021)

    Yonghao Song, Xueyu Jia, Lie Yang, and Longhan Xie. Transformer-based spatial-temporal feature learning for eeg decoding. arXiv preprint arXiv:2106.11170, 2021

  18. [18]

    Spatial-temporal transformers for eeg emotion recognition

    Jiyao Liu, Hao Wu, Li Zhang, and Yanxi Zhao. Spatial-temporal transformers for eeg emotion recognition. In Proceedings of the 6th International Conference on Advances in Artificial Intelligence, pages 116–120, 2022

  19. [19]

    Biot: Biosignal transformer for cross-data learning in the wild

    Chaoqi Yang, M Brandon Westover, and Jimeng Sun. Biot: Biosignal transformer for cross-data learning in the wild. In Advances in Neural Information Processing Systems, 2023

  20. [20]

    Large brain model for learning generic representations with tremendous EEG data in BCI

    Wei-Bang Jiang, Li-Ming Zhao, and Bao-Liang Lu. Large brain model for learning generic representations with tremendous EEG data in BCI. In International Conference on Learning Representations, 2024. 10

  21. [21]

    NeuroLM: A universal multi-task foundation model for bridging the gap between language and EEG signals

    Wei-Bang Jiang, Yansen Wang, Bao-Liang Lu, and Dongsheng Li. NeuroLM: A universal multi-task foundation model for bridging the gap between language and EEG signals. In International Conference on Learning Representations, 2025

  22. [22]

    Investigating critical frequency bands and channels for eeg-based emotion recognition with deep neural networks

    Wei-Long Zheng and Bao-Liang Lu. Investigating critical frequency bands and channels for eeg-based emotion recognition with deep neural networks. IEEE Transactions on Autonomous Mental Development, 7(3):162–175, 2015

  23. [23]

    Harati, M

    A. Harati, M. Golmohammadi, S. Lopez, I. Obeid, and J. Picone. Improved eeg event classification using differential energy. In IEEE Signal Processing in Medicine and Biology Symposium (SPMB), pages 1–3. IEEE, 2015

  24. [24]

    Kaplan, Andrew A

    Alexander Ya. Kaplan, Andrew A. Fingelkurts, Alexander A. Fingelkurts, Sergei V . Borisov, and Boris S. Darkhovsky. Nonstationary nature of the brain activity as revealed by eeg/meg: Methodological, practical and conceptual challenges. Signal Processing, 85(11):2190–2212, 2005. Neuronal Coordination in the Brain: A Signal Processing Perspective

  25. [25]

    Spatial-temporal feature fusion neural network for eeg-based emotion recognition

    Zhe Wang, Yongxiong Wang, Jiapeng Zhang, Chuanfei Hu, Zhong Yin, and Yu Song. Spatial-temporal feature fusion neural network for eeg-based emotion recognition. IEEE Transactions on Instrumentation and Measurement, 71:1–12, 2022

  26. [26]

    The helsinki university sleep corpus (hmc): A benchmark dataset for sleep staging algorithms

    Diego Alvarez-Estevez and Rene Rijsman. The helsinki university sleep corpus (hmc): A benchmark dataset for sleep staging algorithms. PhysioNet, 2021

  27. [27]

    Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the eeg

    Bob Kemp, Aeilko H Zwinderman, Bert Tuk, H A C Kamphuisen, and J J L Oberyé. Analysis of a sleep-dependent neuronal feedback loop: the slow-wave microcontinuity of the eeg. IEEE Transactions on Biomedical Engineering, 47(9):1185–1194, 2000

  28. [28]

    The sleep heart health study: design, rationale, and methods

    Stuart F Quan, Barbara V Howard, Conrad Iber, James P Kiley, F Javier Nieto, George T O’Connor, David M Rapoport, Susan Redline, John Robbins, Jonathan M Samet, and Peter W Wahl. The sleep heart health study: design, rationale, and methods. Sleep, 20(12):1077–1085, 1997

  29. [29]

    Emotionmeter: A multimodal framework for recognizing human emotions

    Wei-Long Zheng, Wei Liu, Yifei Lu, Bao-Liang Lu, and Andrzej Cichocki. Emotionmeter: A multimodal framework for recognizing human emotions. IEEE Transactions on Cybernetics, 49(3):1110–1122, 2019

  30. [30]

    A public eeg database for the evaluation of eeg abnormality detection algorithms

    Emily von Weltin, Thomas V oorhis, Yijun Cui, Vinay Shah, Xiaoxiao Jiang, Yanshan Li, Yuxuan Yang, Mohammad Golmohammadi, Iyad Obeid, and Joseph Picone. A public eeg database for the evaluation of eeg abnormality detection algorithms. Clinical Neurophysiology, 128(8):1524–1532, 2017

  31. [31]

    Eeg-based mental workload estimation with data fusion and transfer learning

    Iryna Zyma, Serhii Tukaev, Andrii Seleznov, Andrii Karpov, Oleksandr Tkachenko, and Radek Martinek. Eeg-based mental workload estimation with data fusion and transfer learning. Frontiers in Neuroscience, 13:702, 2019

  32. [32]

    A large eeg dataset for studying cross-session variability in motor imagery brain–computer interface

    Jianqun Ma, Banghua Yang, Wenhua Qiu, Fenqi Rong, Xueyuan Zhang, Yijun Liu, and Haibo Lu. A large eeg dataset for studying cross-session variability in motor imagery brain–computer interface. Scientific Data, 9(1):531, 2022

  33. [33]

    Internlm2 technical report, 2024

    Zheng Cai, Maosong Cao, Haojiong Chen, and et al. Internlm2 technical report, 2024

  34. [34]

    Sarhan, Eishi Asano, Aimee Luat, and Mohammad Alhawari

    Rihat Rahman, Shiva Maleki Varnosfaderani, Omar Makke, Nabil J. Sarhan, Eishi Asano, Aimee Luat, and Mohammad Alhawari. Comprehensive analysis of eeg datasets for epileptic seizure prediction. In IEEE International Symposium on Circuits and Systems (ISCAS), pages 1–5, 2021

  35. [35]

    Moctezuma, Y

    L.A. Moctezuma, Y . Suzuki, J. Furuki, et al. Gru-powered sleep stage classification with permutation-based eeg channel selection. Scientific Reports, 14:17952, 2024

  36. [36]

    Alessa, Mohammed H

    Faisal M. Alessa, Mohammed H. Alhaag, Ibrahim M. Al-harkan, Mohamed Z. Ramadan, and Fahad M. Alqahtani. A neurophysiological evaluation of cognitive load during augmented reality interactions in various industrial maintenance and assembly tasks. Sensors, 23(18), 2023

  37. [37]

    Flamingo: a visual language model for few-shot learning

    Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katherine Millican, Malcolm Reynolds, et al. Flamingo: a visual language model for few-shot learning. Advances in Neural Information Processing Systems, 35:23716–23736, 2022

  38. [38]

    OpenFlamingo: An Open-Source Framework for Training Large Autoregressive Vision-Language Models

    Anas Awadalla, Irena Gao, Josh Gardner, Jack Hessel, Yusuf Hanafy, Wanrong Zhu, Kalyani Marathe, Yonatan Bitton, Samir Gadre, Shiori Sagawa, et al. Openflamingo: An open-source framework for training large autoregressive vision-language models. arXiv preprint arXiv:2308.01390, 2023

  39. [39]

    Mimic-it: Multi-modal in-context instruction tuning,

    Bo Li, Yuanhan Zhang, Liangyu Chen, Jinghao Wang, Fanyi Pu, Jingkang Yang, Chunyuan Li, and Ziwei Liu. Mimic-it: Multi-modal in-context instruction tuning. arXiv preprint arXiv:2306.05425, 2023

  40. [40]

    LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention

    Renrui Zhang, Jiaming Han, Chris Liu, Peng Gao, Aojun Zhou, Xiangfei Hu, Shilin Yan, Pan Lu, Hongsheng Li, and Yu Qiao. Llama-adapter: Efficient fine-tuning of language models with zero-init attention. arXiv preprint arXiv:2303.16199, 2023

  41. [41]

    InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning

    Wenliang Dai, Junnan Li, Dongxu Li, Anthony Meng Huat Tiong, Junqi Zhao, Weisheng Wang, Boyang Li, Pascale Fung, and Steven Hoi. Instructblip: Towards general-purpose vision-language models with instruction tuning. arXiv preprint arXiv: 2305.06500, 2023. 11

  42. [42]

    Visual Instruction Tuning

    Haotian Liu, Chunyuan Li, Qingyang Wu, and Yong Jae Lee. Visual instruction tuning. arXiv preprint arXiv: 2304.08485, 2023

  43. [43]

    MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

    Deyao Zhu, Jun Chen, Xiaoqian Shen, Xiang Li, and Mohamed Elhoseiny. Minigpt-4: Enhancing vision-language understanding with advanced large language models. arXiv preprint arXiv:2304.10592, 2023

  44. [44]

    VideoChat: Chat-Centric Video Understanding

    KunChang Li, Yinan He, Yi Wang, Yizhuo Li, Wenhai Wang, Ping Luo, Yali Wang, Limin Wang, and Yu Qiao. Videochat: Chat-centric video understanding. arXiv preprint arXiv:2305.06355, 2024

  45. [45]

    Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding

    Hang Zhang, Xin Li, and Lidong Bing. Video-llama: An instruction-tuned audio-visual language model for video understanding. arXiv preprint arXiv:2306.02858, 2023

  46. [46]

    Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models

    Muhammad Maaz, Hanoona Rasheed, Salman Khan, and Fahad Shahbaz Khan. Video-chatgpt: Towards detailed video understanding via large vision and language models. arXiv preprint arXiv:2306.05424, 2024

  47. [47]

    Valley: Video assistant with large language model enhanced ability.arXiv preprint arXiv:2306.07207,

    Ruipu Luo, Ziwang Zhao, Min Yang, Junwei Dong, Da Li, Pengcheng Lu, Tao Wang, Linmei Hu, Minghui Qiu, and Zhongyu Wei. Valley: Video assistant with large language model enhanced ability.arXiv preprint arXiv:2306.07207, 2023

  48. [48]

    WavChat: A survey of spoken dialogue models.arXiv preprint arXiv:2411.13577,

    Shengpeng Ji, Yifu Chen, Minghui Fang, Jialong Zuo, Jingyu Lu, Hanting Wang, Ziyue Jiang, Long Zhou, Shujie Liu, Xize Cheng, et al. Wavchat: A survey of spoken dialogue models. arXiv preprint arXiv:2411.13577, 2024

  49. [49]

    VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs

    Zesen Cheng, Sicong Leng, Hang Zhang, Yifei Xin, Xin Li, Guanzheng Chen, Yongxin Zhu, Wenqi Zhang, Ziyang Luo, Deli Zhao, and Lidong Bing. Videollama 2: Advancing spatial-temporal modeling and audio understanding in video-llms. arXiv preprint arXiv:2406.07476, 2024

  50. [50]

    PandaGPT: One Model To Instruction-Follow Them All

    Yixuan Su, Tian Lan, Huayang Li, Jialu Xu, Yan Wang, and Deng Cai. Pandagpt: One model to instruction-follow them all. arXiv preprint arXiv:2305.16355, 2023

  51. [51]

    NExT-GPT: Any-to-any multimodal LLM

    Shengqiong Wu, Hao Fei, Leigang Qu, Wei Ji, and Tat-Seng Chua. NExT-GPT: Any-to-any multimodal LLM. In International Conference on Machine Learning, pages 53366–53397, 2024.

    A Related Works

    Task-Specific EEG Decoding Models. Due to the variations in EEG signal formats across different datasets, numerous deep learning models have been proposed to tackle...

    1. This segment of EEG signal can reflect the subject’s behavioral actions. Please determine the type of action based on the provided EEG signal? [Left hand, Right hand]

    2. The given EEG signal is indicative of the subject’s movements. Can you identify the action type from the EEG data? [Left hand, Right hand]

    3. Analyze this EEG signal to discern the subject’s physical actions. What is the action type shown? [Left hand, Right hand]

    4. (similar instructions)

    Instruction Templates for the SEED Dataset

    1. Given this EEG signal, which emotion does it reflect? [positive, negative, or neutral]

    2. Based on this EEG signal, please identify the emotion it represents. [positive, negative, or neutral]

    3. From this EEG signal, can you determine which emotion it corresponds to? [positive, negative, or neutral]

    4. (similar instructions)

    Instruction Templates for the SEED-IV Dataset

    1. Given this EEG signal, which emotion does it reflect? [neutral, sad, fear, happy]

    2. Based on this EEG signal, please identify the emotion it represents. [neutral, sad, fear, happy]

    3. From this EEG signal, can you determine which emotion it corresponds to? [neutral, sad, fear, happy]

    4. (similar instructions)

    Instruction Templates for the TUAB Dataset

    1. This EEG signal may indicate abnormal conditions. Based on this signal, determine if there is an abnormality. Choose one: [Normal, Abnormal]

    2. Analyze this EEG signal to assess whether it reflects an abnormal condition. Please select one: [Normal, Abnormal]

    3. This EEG signal could suggest abnormal brain activity. Determine if the signal is normal or abnormal: [Normal, Abnormal]

    4. (similar instructions)

    Instruction Templates for the TUEV Dataset

    1. This EEG signal reflects epileptic events. Please determine the epileptic state based on this signal

    2. Analyze this EEG signal to classify the epileptic state

    3. This EEG signal may indicate epileptic activity. Based on the signal, identify the epileptic state

    4. (similar instructions)

    Instruction Templates for the TUSL Dataset

    1. This EEG signal reflects a slow event. Based on this signal, please determine the state. Choose one: [bckg, seiz, slow]

    2. Analyze this EEG signal to classify the state it indicates. Select one: [bckg, seiz, slow]

    3. This EEG signal may suggest a slow event. Determine the corresponding state from the options: [bckg, seiz, slow]

    4. (similar instructions)

    Instruction Templates for the SHHS, SleepEDF and HMC Datasets

    1. The EEG signal provides insights into sleep stages. Which sleep phase does it most likely correspond to? Choose one: [Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R]

    2. Sleep phases can be inferred from EEG signals. Given the signal, which phase is it most likely indicating? Pick one: [Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R]

    3. This EEG signal reflects brain activity during sleep. Which sleep stage does it most likely represent? Select one: [Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R]

    4. (similar instructions)

    Instruction Templates for the Workload Dataset

    1. This is an EEG signal. Is this brainwave showing high workload or low workload? [high, low]
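
The templates listed above pair each dataset with several interchangeable phrasings of the same question. As a minimal sketch of how such a pool might be used, the snippet below picks one phrasing at random for a given dataset. The `TEMPLATES` dictionary layout and the `build_instruction` helper are hypothetical names introduced for illustration; the source does not describe how templates are sampled, and only the template strings themselves are taken from the listing.

```python
import random

# Template strings copied verbatim from the listing above.
# The dictionary layout is an assumption made for this sketch.
TEMPLATES = {
    "SEED": [
        "Given this EEG signal, which emotion does it reflect? "
        "[positive, negative, or neutral]",
        "Based on this EEG signal, please identify the emotion it represents. "
        "[positive, negative, or neutral]",
        "From this EEG signal, can you determine which emotion it corresponds to? "
        "[positive, negative, or neutral]",
    ],
    "Workload": [
        "This is an EEG signal. Is this brainwave showing high workload "
        "or low workload? [high, low]",
    ],
}


def build_instruction(dataset: str, rng: random.Random) -> str:
    """Return one randomly chosen instruction template for `dataset`.

    Hypothetical helper: uniform sampling is an assumption, not a
    procedure stated in the source.
    """
    return rng.choice(TEMPLATES[dataset])


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs pick the same template
    print(build_instruction("SEED", rng))
    print(build_instruction("Workload", rng))
```

Passing an explicit `random.Random` instance keeps the choice reproducible and avoids touching the global random state.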
