UniMind: Unleashing the Power of LLMs for Unified Multi-Task Brain Decoding
Pith reviewed 2026-05-19 07:40 UTC · model grok-4.3
The pith
UniMind connects EEG brain signals to large language models for decoding many tasks at once without separate tuning per task.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
UniMind is a general-purpose EEG foundation model for unified multi-task brain decoding that builds on large language models. First, a Neuro-Language Connector distills and transforms spatiotemporal neural patterns from EEG into representations a language model can process. Second, a Task-aware Query Selection module generates dynamic, task-adaptive query tokens so the model can learn task-relevant patterns across diverse tasks. The paper reports an average 12 percent gain over prior multi-task models on ten datasets, along with neuroscientific insights into functional correlations across tasks.
What carries the argument
Neuro-Language Connector that bridges EEG spatiotemporal patterns to language-model inputs, combined with Task-aware Query Selection that injects task awareness through adaptive query tokens for cross-modal alignment.
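As a rough illustration of the mechanism described above, the following Python sketch picks a few query vectors from a shared pool based on a summary of the EEG tokens, then lets each selected query attend over those tokens to produce a fixed number of vectors that a language model could consume. Everything in it is an assumption made for illustration: the function names `select_queries` and `cross_attend`, the mean-pooled summary, the top-k selection rule, and the toy dimensions are hypothetical and are not taken from the paper or its code release.

```python
import math
import random


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def select_queries(eeg_tokens, query_pool, top_k):
    """Hypothetical selection rule: keep the top_k pool queries that are
    most similar to the mean EEG token. No task label is used."""
    dim = len(eeg_tokens[0])
    summary = [sum(tok[d] for tok in eeg_tokens) / len(eeg_tokens) for d in range(dim)]
    scores = [dot(summary, q) for q in query_pool]
    ranked = sorted(range(len(query_pool)), key=lambda i: scores[i], reverse=True)
    return [query_pool[i] for i in ranked[:top_k]]


def cross_attend(queries, eeg_tokens):
    """Scaled dot-product attention: each query returns a weighted
    average of the EEG tokens."""
    dim = len(eeg_tokens[0])
    outputs = []
    for q in queries:
        weights = softmax([dot(q, tok) / math.sqrt(dim) for tok in eeg_tokens])
        outputs.append(
            [sum(w * tok[d] for w, tok in zip(weights, eeg_tokens)) for d in range(dim)]
        )
    return outputs


# Toy sizes and random values, chosen only so the sketch runs.
rng = random.Random(0)
DIM, N_TOKENS, POOL_SIZE, TOP_K = 8, 16, 6, 3
eeg_tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N_TOKENS)]
query_pool = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(POOL_SIZE)]

selected = select_queries(eeg_tokens, query_pool, TOP_K)
llm_inputs = cross_attend(selected, eeg_tokens)
print(len(llm_inputs), len(llm_inputs[0]))  # 3 8
```

The point of the sketch is only the shape of the computation: a variable-length EEG token sequence is reduced to a small, fixed set of vectors, and which queries do the reducing depends on the input rather than on a hard-coded task branch.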
If this is right
- A single model can handle multiple EEG decoding tasks without retraining for each new task.
- Average performance improves by 12 percent across diverse datasets compared with prior multi-task approaches.
- The same architecture yields measurable insights into how neural activity patterns relate across different brain tasks.
- Applications in clinical monitoring and human-machine interaction become feasible with less customization.
Where Pith is reading between the lines
- Direct EEG-to-LLM connections could let future systems draw on the semantic knowledge already inside language models when interpreting brain signals.
- If the connector proves robust, the approach might extend to combining EEG with other signals such as fMRI or eye tracking in one model.
- The reported neural correlations across tasks could be tested by checking whether they predict behavior in new experimental designs.
Load-bearing premise
The Neuro-Language Connector successfully distills and transforms the spatiotemporal neural patterns of EEG data into representations understandable by language models, and the Task-aware Query Selection enables learning of task-relevant patterns across diverse heterogeneous tasks without task-specific tuning.
What would settle it
Running UniMind on a fresh collection of EEG datasets from previously unseen tasks and finding that it still requires task-specific fine-tuning or shows no consistent performance gain over strong baselines would falsify the unified decoding claim.
Original abstract
Decoding human brain activity from electroencephalography (EEG) signals is a central challenge at the intersection of neuroscience and artificial intelligence, enabling diverse applications in mental state assessment, clinical monitoring, and human-machine interaction. Recent efforts have extensively explored EEG-based brain foundation models for generalized brain decoding, employing large-scale training on multiple datasets. However, most of these attempts struggle with generalizability and fail to achieve satisfactory performance without task-specific tuning due to pronounced inherent heterogeneity among decoding tasks. To address these challenges, we present UniMind, a general-purpose EEG foundation model for unified multi-task brain decoding by uniquely unleashing the power of large language models to comprehend complex neural patterns. UniMind offers several advantages. First, we design a Neuro-Language Connector to bridge the modality gap between neural signals and large language models, distilling and transforming the spatiotemporal neural patterns of EEG data into representations understandable by language models. Second, a Task-aware Query Selection module is proposed to inject task-awareness into the cross-modal alignment by dynamically generating task-adaptive query tokens, enabling learning of task-relevant neural patterns across diverse tasks. Extensive experiments across ten datasets demonstrate that UniMind substantially outperforms state-of-the-art multi-task decoding models, with an average gain of 12 percent, while also offering valuable neuroscientific insights into neural functional correlations across tasks. The code is available at https://github.com/kaleidoyao/UniMind.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper presents UniMind, a general-purpose EEG foundation model for unified multi-task brain decoding that integrates large language models. It introduces a Neuro-Language Connector to transform spatiotemporal EEG patterns into LLM-compatible representations and a Task-aware Query Selection module to dynamically generate task-adaptive query tokens for handling task heterogeneity. Experiments across ten external datasets report an average 12% performance gain over state-of-the-art multi-task decoding models, along with neuroscientific insights into neural functional correlations. Code is released at a public GitHub repository.
Significance. If the performance gains and truly unified (no task-specific tuning) nature of the model are substantiated, this could advance generalizable brain decoding by bridging neural signals with LLMs, supporting applications in mental state assessment, clinical monitoring, and human-machine interaction. The multi-dataset evaluation and open code are strengths that aid reproducibility and allow testing of the claimed generalization.
major comments (2)
- [Abstract and Method section on Task-aware Query Selection] The central claim of unified multi-task decoding 'without task-specific tuning' and 'general-purpose' applicability requires that task identity not be supplied at inference. The module is described as 'dynamically generating task-adaptive query tokens' to inject task-awareness; if these tokens derive from task embeddings or identifiers provided at test time (rather than being inferred solely from the EEG input), each evaluation becomes conditioned on task identity. That would make the 12% gain comparable only to task-aware baselines and would preclude zero-shot use on novel tasks, directly undermining the unified claim.
- [Results section (performance claims)] The abstract reports a 12% average gain, but the soundness assessment notes the absence of details on exact baselines, statistical significance tests, error bars, dataset characteristics, and the handling of task heterogeneity. These elements are load-bearing for validating the outperformance and cross-task generalization claims.
minor comments (2)
- [Method] The notation for the Neuro-Language Connector and query tokens could be formalized with explicit equations or pseudocode to improve clarity of the cross-modal alignment process.
- [Figures] Figure captions and axis labels in experimental result plots should explicitly state the metrics, number of runs, and whether error bars represent standard deviation or standard error.
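To make the distinction in the last comment concrete, here is a minimal Python sketch of how standard deviation and standard error differ for the same set of runs. The scores are placeholder values invented for the example, and the helper name `summarize_runs` is hypothetical; neither comes from the paper.

```python
import math
import statistics


def summarize_runs(scores):
    """Return mean, sample standard deviation, and standard error."""
    mean = statistics.mean(scores)
    sd = statistics.stdev(scores)      # sample SD, divides by n - 1
    se = sd / math.sqrt(len(scores))   # SE of the mean shrinks with more runs
    return mean, sd, se


# Placeholder accuracies for five runs; NOT results from the paper.
runs = [0.70, 0.72, 0.68, 0.71, 0.69]
mean, sd, se = summarize_runs(runs)
print(f"{mean:.3f} +/- {sd:.3f} (SD), +/- {se:.3f} (SE), n={len(runs)}")
# 0.700 +/- 0.016 (SD), +/- 0.007 (SE), n=5
```

Because the standard error is smaller than the standard deviation by a factor of the square root of the number of runs, an error bar without a stated type and run count can look more than twice as tight as the underlying spread.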
Simulated Author's Rebuttal
We thank the referee for the detailed and constructive review. The comments help clarify key aspects of our claims regarding unified multi-task decoding. We address each major comment point by point below and indicate the corresponding revisions to the manuscript.
Point-by-point responses
- Referee: [Abstract and Method section on Task-aware Query Selection] The central claim of unified multi-task decoding 'without task-specific tuning' and 'general-purpose' applicability requires that task identity not be supplied at inference. The module is described as 'dynamically generating task-adaptive query tokens' to inject task-awareness; if these tokens derive from task embeddings or identifiers provided at test time (rather than being inferred solely from the EEG input), each evaluation becomes conditioned on task identity. That would make the 12% gain comparable only to task-aware baselines and would preclude zero-shot use on novel tasks, directly undermining the unified claim.
Authors: We appreciate this important point on the distinction between task-aware and truly unified decoding. In the UniMind architecture, the Task-aware Query Selection module operates by learning a dynamic selection process over a shared query pool that is conditioned solely on the input EEG spatiotemporal features via the Neuro-Language Connector; no explicit task identifiers, embeddings, or labels are provided or required at inference time. Task-adaptive behavior emerges from the learned alignment during multi-task training, enabling the model to handle heterogeneity without task-specific tuning or conditioning. This design supports the general-purpose claim and opens the possibility for zero-shot transfer to unseen tasks. To eliminate any ambiguity, we have expanded the Method section with a formal description of the inference procedure, including pseudocode showing that only raw EEG is input at test time, and we have updated the Abstract to explicitly state that task identity is not supplied. revision: yes
- Referee: [Results section (performance claims)] The abstract reports a 12% average gain, but the soundness assessment notes the absence of details on exact baselines, statistical significance tests, error bars, dataset characteristics, and the handling of task heterogeneity. These elements are load-bearing for validating the outperformance and cross-task generalization claims.
Authors: We agree that these details are essential for rigorous validation. The original manuscript included baseline comparisons and dataset descriptions, but we acknowledge they were not presented with sufficient granularity. In the revised Results section we now provide: (i) a complete table listing all baselines with citations and implementation details, (ii) paired t-test results with p-values and confidence intervals for the 12% average improvement, (iii) error bars as standard deviation across 5 random seeds and subject-wise variability, (iv) a supplementary table with dataset characteristics (subject count, channel count, task labels, recording duration, and preprocessing), and (v) an extended analysis subsection explaining how the Neuro-Language Connector and Task-aware Query Selection jointly mitigate task heterogeneity. These additions directly substantiate the reported gains and generalization claims. revision: yes
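The paired comparison mentioned in the response can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis: it pairs scores by dataset (the response does not say what the pairing unit is), uses invented placeholder scores, and hard-codes the two-sided 95% critical value of Student's t for 9 degrees of freedom (2.262) instead of computing a p-value. The helper name `paired_t` is hypothetical.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t with 9 degrees of freedom
# (ten paired observations). Change this if the number of pairs changes.
T_CRIT_95_DF9 = 2.262


def paired_t(model_scores, baseline_scores, t_crit):
    """Paired t statistic and confidence interval for the mean difference."""
    diffs = [m - b for m, b in zip(model_scores, baseline_scores)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    se = statistics.stdev(diffs) / math.sqrt(n)  # standard error of the mean difference
    t_stat = mean_diff / se
    return mean_diff, t_stat, (mean_diff - t_crit * se, mean_diff + t_crit * se)


# Placeholder per-dataset scores for ten datasets; NOT results from the paper.
baseline = [0.60, 0.55, 0.70, 0.65, 0.50, 0.75, 0.62, 0.58, 0.66, 0.71]
model = [0.65, 0.65, 0.72, 0.73, 0.62, 0.76, 0.69, 0.67, 0.70, 0.77]

mean_diff, t_stat, (ci_low, ci_high) = paired_t(model, baseline, T_CRIT_95_DF9)
significant = abs(t_stat) > T_CRIT_95_DF9
print(f"mean diff={mean_diff:.3f}, t={t_stat:.2f}, "
      f"95% CI=({ci_low:.3f}, {ci_high:.3f}), significant={significant}")
```

Pairing by dataset tests whether the gain is consistent across tasks, which is the claim at issue; pairing by seed within one dataset would answer a narrower question about training variance.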
Circularity Check
No significant circularity; claims rest on external dataset experiments
Full rationale
The paper presents UniMind as an architectural framework with a Neuro-Language Connector and Task-aware Query Selection module, validated through performance gains on ten external datasets. No equations or derivations reduce by construction to fitted inputs or self-definitions. Core claims of unified multi-task decoding without task-specific tuning are supported by empirical results rather than internal self-referential loops or load-bearing self-citations. The derivation chain remains self-contained against external benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- Model hyperparameters and training parameters
axioms (1)
- Domain assumption: EEG signals contain decodable spatiotemporal patterns that can be unified across heterogeneous tasks
invented entities (2)
- Neuro-Language Connector: no independent evidence
- Task-aware Query Selection module: no independent evidence
Forward citations
Cited by 3 Pith papers
- NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity
  NeuroFlow is the first unified flow model for bidirectional visual encoding and decoding from neural activity using NeuroVAE and cross-modal flow matching.
- Visualizing the Invisible: Generative Visual Grounding Empowers Universal EEG Understanding in MLLMs
  Generative Visual Grounding creates visual proxy images from EEG to enhance MLLM understanding of brain signals beyond text-only alignment.
- SCOPE: Structured Prototype-Guided Adaptation for EEG Foundation Models with Limited Labels
  SCOPE uses cohort-level external supervision, confidence-aware pseudo-labels, and a lightweight prototype-conditioned adapter (ProAdapter) to adapt frozen EEG foundation models in label-limited settings, reporting con...
Instruction Templates
- This segment of EEG signal can reflect the subject’s behavioral actions. Please determine the type of action based on the provided EEG signal? [Left hand, Right hand]
- The given EEG signal is indicative of the subject’s movements. Can you identify the action type from the EEG data? [Left hand, Right hand]
- Analyze this EEG signal to discern the subject’s physical actions. What is the action type shown? [Left hand, Right hand]
- (similar instructions)
Instruction Templates for the SEED Dataset
- Given this EEG signal, which emotion does it reflect? [positive, negative, or neutral]
- Based on this EEG signal, please identify the emotion it represents. [positive, negative, or neutral]
- From this EEG signal, can you determine which emotion it corresponds to? [positive, negative, or neutral]
- (similar instructions)
Instruction Templates for the SEED-IV Dataset
- Given this EEG signal, which emotion does it reflect? [neutral, sad, fear, happy]
- Based on this EEG signal, please identify the emotion it represents. [neutral, sad, fear, happy]
- From this EEG signal, can you determine which emotion it corresponds to? [neutral, sad, fear, happy]
- (similar instructions)
Instruction Templates for the TUAB Dataset
- This EEG signal may indicate abnormal conditions. Based on this signal, determine if there is an abnormality. Choose one: [Normal, Abnormal]
- Analyze this EEG signal to assess whether it reflects an abnormal condition. Please select one: [Normal, Abnormal]
- This EEG signal could suggest abnormal brain activity. Determine if the signal is normal or abnormal: [Normal, Abnormal]
- (similar instructions)
Instruction Templates for the TUEV Dataset
- This EEG signal reflects epileptic events. Please determine the epileptic state based on this signal
- Analyze this EEG signal to classify the epileptic state
- This EEG signal may indicate epileptic activity. Based on the signal, identify the epileptic state
- (similar instructions)
Instruction Templates for the TUSL Dataset
- This EEG signal reflects a slow event. Based on this signal, please determine the state. Choose one: [bckg, seiz, slow]
- Analyze this EEG signal to classify the state it indicates. Select one: [bckg, seiz, slow]
- This EEG signal may suggest a slow event. Determine the corresponding state from the options: [bckg, seiz, slow]
- (similar instructions)
Instruction Templates for the SHHS, SleepEDF and HMC Dataset
- The EEG signal provides insights into sleep stages. Which sleep phase does it most likely correspond to? Choose one: [Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R]
- Sleep phases can be inferred from EEG signals. Given the signal, which phase is it most likely indicating? Pick one: [Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R]
- This EEG signal reflects brain activity during sleep. Which sleep stage does it most likely represent? Select one: [Sleep stage W, Sleep stage N1, Sleep stage N2, Sleep stage N3, Sleep stage R]
- (similar instructions)
Instruction Templates for the Workload Dataset
- This is an EEG signal. Is this brainwave showing high workload or low workload? [high, low]
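One plausible way such templates could be used is to sample one per training example and attach it to a stand-in for the EEG embeddings. The Python sketch below shows that idea with two of the template sets listed above. It is an assumption about usage, not a description of the authors' pipeline: the `TEMPLATES` dictionary layout, the `build_prompt` helper, and the `<EEG>` placeholder token are all hypothetical.

```python
import random

# A subset of the templates listed above, keyed by dataset name.
TEMPLATES = {
    "SEED": [
        "Given this EEG signal, which emotion does it reflect? [positive, negative, or neutral]",
        "Based on this EEG signal, please identify the emotion it represents. [positive, negative, or neutral]",
        "From this EEG signal, can you determine which emotion it corresponds to? [positive, negative, or neutral]",
    ],
    "Workload": [
        "This is an EEG signal. Is this brainwave showing high workload or low workload? [high, low]",
    ],
}


def build_prompt(dataset, rng, eeg_placeholder="<EEG>"):
    """Sample one template for the dataset and prepend a placeholder that
    marks where EEG-derived tokens would go (hypothetical convention)."""
    template = rng.choice(TEMPLATES[dataset])
    return f"{eeg_placeholder} {template}"


rng = random.Random(0)
prompt = build_prompt("SEED", rng)
print(prompt)
```

Sampling among paraphrases is a common way to keep an instruction-tuned model from overfitting to one exact wording. Note that each template names its own label set, so the prompt text itself carries information about which task is being asked.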