MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
16 Pith papers cite this work. Polarity classification is still indexing.
citation roles: background 3
citation polarities: background 3
citing papers explorer
- Brain-IT-VQA: From Brain Signals to Answers
  Brain-IT-VQA decodes visual question answers from fMRI using a transformer to extract language tokens and introduces the NSD-VQA benchmark with 20 controlled questions per image across 20 categories.
- From Activation to Causality: Discovery of Causal Visual Representations in the Human Brain
  BrainCause recovers known visual localizations and finds new candidate representations by validating causal specificity via counterfactual stimuli and encoding models, showing activation alone produces many false positives.
- A foundation model of vision, audition, and language for in-silico neuroscience
  TRIBE v2 is a multimodal AI model that predicts human brain activity more accurately than linear encoding models and recovers established neuroscientific findings through in-silico testing.
- NeuroFlow: Toward Unified Visual Encoding and Decoding from Neural Activity
  NeuroFlow is the first unified flow model for bidirectional visual encoding and decoding from neural activity using NeuroVAE and cross-modal flow matching.
- Neuroprobe: Evaluating Intracranial Brain Responses to Naturalistic Stimuli
  Neuroprobe is a new suite of decoding tasks on the BrainTreebank iEEG dataset for evaluating multi-modal language processing in the brain during naturalistic movie viewing.
- Channel-Oriented Design for EEG-to-Music Reconstruction
  Introduces a channel-oriented design using per-electrode tokenization, multi-view self-distillation, and structured channel dropout within an encoding-alignment-decoding pipeline to improve EEG-to-music reconstruction over baselines.
- MindAlign: Bridging EEG, Vision, and Language for Zero-Shot Visual Decoding
  A tri-modal contrastive learning method for EEG-based zero-shot visual decoding reports 54.1% top-1 accuracy on the Things-EEG2 200-way benchmark, outperforming prior baselines of 32.4%.
- Bridging Brain and Semantics: A Hierarchical Framework for Semantically Enhanced fMRI-to-Video Reconstruction
  CineNeuron improves fMRI-to-video reconstruction by combining bottom-up semantic enrichment with top-down Mixture-of-Memories integration and outperforms prior methods on benchmarks.
- Interpreting V1 Population Activity via Image-Neural Latent Representation Alignment
  DINA is a dual-tower contrastive model that aligns images with mouse V1 neural activity to enable decoding and shows that low-level visual structure, not semantics or fine details, primarily supports the alignment.
- Meta-learning In-Context Enables Training-Free Cross Subject Brain Decoding
  A meta-optimized in-context learning approach enables training-free cross-subject semantic visual decoding from fMRI by inferring individual neural encoding patterns via hierarchical inference on a few examples.
- A cross-species neural foundation model for end-to-end speech decoding
  A cross-species pretrained neural encoder combined with end-to-end training and audio LLMs reduces word error rate in neural speech decoding from 24.69% to 10.22% while aligning attempted and imagined speech.
- BrainJanus: A Unified Model for Understanding and Generation across Brain, Vision, and Language
  BrainJanus presents a unified autoregressive model with a brain tokenizer that maps between neural activity, vision, and language for encoding and decoding tasks.
- FPED: A Functional-Network Prior-Guided Mixture-of-Experts Framework for Interpretable Brain Decoding
  FPED is a functional-network prior-guided MoE framework for fMRI visual reconstruction that claims competitive performance at 0.68B parameters and biologically meaningful routing interpretability.
- Multi-Level Bidirectional Biomimetic Learning for EEG-Based Visual Decoding
  MB2L achieves 80.5% top-1 and 97.6% top-5 accuracy on zero-shot EEG-to-image retrieval by using biomimetic modules and bidirectional contrastive learning to align neural and visual features.
- Brain-Grasp: Graph-based Saliency Priors for Improved fMRI-based Visual Brain Decoding
  Graph-informed saliency masks derived from fMRI signals are used to condition a single diffusion model, improving object structure and semantic fidelity in visual brain decoding.
- BRAIN: Bias-Mitigation Continual Learning Approach to Vision-Brain Understanding
  BRAIN uses bias-mitigation continual learning with a new de-bias contrastive loss and angular forgetting mitigation to achieve SOTA performance on vision-brain understanding benchmarks despite brain signal inconsistencies across sessions.