pith. machine review for the scientific record.

arxiv: 2502.16060 · v5 · submitted 2025-02-22 · 💻 cs.LG · cs.AI · eess.SP

Recognition: unknown

Tokenizing Single-Channel EEG with Time-Frequency Motif Learning

classification 💻 cs.LG cs.AI eess.SP
keywords foundation models single-channel time-frequency analysis baselines consistent diverse

Foundation models are reshaping EEG analysis, yet EEG tokenization remains an open challenge. This paper presents TFM-Tokenizer, a novel tokenization framework that learns a vocabulary of time-frequency motifs from single-channel EEG signals and encodes them into discrete tokens. We propose a dual-path architecture with time-frequency masking to capture robust motif representations. The tokenizer is model-agnostic, supporting both lightweight transformers and existing foundation models for downstream tasks. Our study demonstrates three key benefits. Accuracy: experiments on four diverse EEG benchmarks show consistent performance gains across both single- and multi-dataset pretraining settings, with up to $11\%$ improvement in Cohen's Kappa over strong baselines. Generalization: as a plug-and-play component, it consistently boosts the performance of diverse foundation models, including BIOT and LaBraM. Scalability: by operating at the single-channel level rather than relying on the strict 10-20 EEG system, our method has the potential to be device-agnostic. Experiments on ear-EEG sleep staging, which differs from the pretraining data in signal format, channel configuration, recording device, and task, show that our tokenizer outperforms baselines by $14\%$. A comprehensive token analysis reveals strong class-discriminative, frequency-aware, and consistent structure, enabling improved representation quality and interpretability. Code is available at https://github.com/Jathurshan0330/TFM-Tokenizer.
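To make the idea of encoding a single-channel signal into discrete tokens concrete, here is a minimal Python sketch. It is not the paper's dual-path architecture or its masking objective: it only illustrates the general pattern of turning windowed spectra into indices of a codebook by nearest-neighbour lookup. The function names (`spectrogram_patches`, `tokenize`), the window and hop sizes, and the random codebook standing in for a learned motif vocabulary are all assumptions made for illustration.

```python
import numpy as np

def spectrogram_patches(signal, win=64, hop=32):
    """Split a 1-D signal into overlapping windows and return magnitude spectra."""
    n_frames = 1 + (len(signal) - win) // hop
    frames = np.stack([signal[i * hop : i * hop + win] for i in range(n_frames)])
    frames = frames * np.hanning(win)           # taper each window
    return np.abs(np.fft.rfft(frames, axis=1))  # shape: (n_frames, win // 2 + 1)

def tokenize(patches, codebook):
    """Map each patch to the index of its nearest codebook vector (L2 distance)."""
    # Broadcast (n_frames, 1, d) against (1, K, d) to get (n_frames, K) distances.
    dists = np.linalg.norm(patches[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

# Synthetic single-channel signal: a 10 Hz sinusoid plus noise (hypothetical data).
rng = np.random.default_rng(0)
fs = 128
t = np.arange(4 * fs) / fs
signal = np.sin(2 * np.pi * 10 * t) + 0.1 * rng.standard_normal(t.size)

patches = spectrogram_patches(signal)
# Random stand-in for a learned vocabulary of 16 motifs (assumption, not trained).
codebook = rng.standard_normal((16, patches.shape[1]))
tokens = tokenize(patches, codebook)
```

Because each channel is processed independently, a sketch like this makes no assumption about electrode layout, which is the property the abstract ties to device-agnostic use. In a learned tokenizer the codebook would be trained rather than sampled at random.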

This paper has not been read by Pith yet.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Neural Signals Generate Clinical Notes in the Wild

    cs.LG 2026-01 unverdicted novelty 8.0

    CELM is the first EEG-to-language foundation model that generates clinical reports from variable-length EEG recordings using a new dataset of 9,922 reports paired with 11,000 hours of data from 9,048 patients.

  2. Making Conformal Predictors Robust in Healthcare Settings: a Case Study on EEG Classification

    cs.LG 2026-02 unverdicted novelty 4.0

    Personalized calibration for conformal predictors raises coverage by over 20 percentage points on EEG seizure classification while keeping prediction set sizes comparable.