Probabilistic Prediction of Neural Dynamics via Autoregressive Flow Matching
Pith reviewed 2026-05-10 16:20 UTC · model grok-4.3
The pith
A flow-matching model that conditions on recent neural history plus sensory input outperforms standard baselines at predicting short-term brain activity.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Training a generative model with autoregressive flow matching to predict the conditional distribution of future neural activity, given recent history and multimodal sensory input, yields more accurate forecasts of parcel-wise blood-oxygenation-level-dependent (BOLD) signals than non-autoregressive variants and linear baselines; ablation studies indicate that past dynamics are the primary contributor to accuracy.
What carries the argument
Autoregressive flow matching, a transport-based generative technique that builds the prediction sequentially across time steps to capture the evolving conditional distribution of neural states.
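As a deliberately minimal illustration of the flow-matching ingredient, the sketch below constructs one training example for conditional flow matching under the common straight-line probability path. This is a generic sketch of the technique, not the paper's implementation; all names and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def cfm_training_pair(x1, cond, rng):
    """One conditional flow-matching training example (linear path).

    x1   : future neural state to predict (data sample)
    cond : conditioning vector (recent history + sensory features)
    Returns the network input (x_t, t, cond) and the regression
    target velocity u = x1 - x0 that the vector field should match.
    """
    x0 = rng.standard_normal(x1.shape)   # base (noise) sample
    t = rng.uniform()                    # random time on the path
    x_t = (1.0 - t) * x0 + t * x1        # point on the straight path
    u = x1 - x0                          # constant target velocity
    return (x_t, t, cond), u

# toy example: 4 parcels, 8-dim conditioning vector
x1 = rng.standard_normal(4)
cond = rng.standard_normal(8)
(x_t, t, c), u = cfm_training_pair(x1, cond, rng)
# sanity check: integrating the target velocity from t to 1 recovers x1
assert np.allclose(x_t + (1.0 - t) * u, x1)
```

In the autoregressive variant, this regression is repeated per time step, with `cond` refreshed to include the states already generated.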
If this is right
- Probabilistic predictions of neural responses become feasible at scale from sensory inputs.
- Improved accuracy and generalization appear in short-term forecasting of cortical activity.
- Access to past neural states emerges as the dominant factor for prediction quality.
- Autoregressive factorization supplies consistent gains in context-rich, short-horizon settings.
- Flow-based generative modeling offers a viable path for short-term forecasting of brain dynamics.
Where Pith is reading between the lines
- Such models could support real-time applications in adaptive neurotechnologies by generating likely future brain states on the fly.
- Extensions to longer prediction horizons or different recording modalities would test whether the temporal conditioning remains effective.
- The emphasis on history-dependent conditioning invites direct comparisons with predictive-coding accounts of cortical function.
- Integration with other generative techniques might clarify when flow matching holds advantages over diffusion or autoregressive transformer alternatives.
Load-bearing premise
Neural activity can be modeled as a temporally evolving conditional process whose future states depend primarily on recent history and concurrent sensory input in a way that is learnable from the available fMRI recordings.
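In symbols, the premise amounts to a short-memory autoregressive factorization of the forecast distribution (notation illustrative, not the paper's: k is the context length, H the horizon, s the sensory input):

```latex
p\bigl(x_{t+1:t+H} \mid x_{t-k+1:t},\, s\bigr)
  \;=\; \prod_{h=1}^{H} p\bigl(x_{t+h} \mid x_{t+h-k:t+h-1},\, s\bigr)
```

Each factor is the conditional distribution the flow-matching model is trained to sample from.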
What would settle it
A direct test would be to evaluate the model on a new set of subjects or stimuli not used in training and check whether its prediction errors remain lower than those of the general linear model baseline; if errors are comparable or higher, the claimed advantage would not hold.
Figures
Original abstract
Forecasting neural activity in response to naturalistic stimuli remains a key challenge for understanding brain dynamics and enabling downstream neurotechnological applications. Here, we introduce a generative forecasting framework for modeling neural dynamics based on autoregressive flow matching (AFM). Building on recent advances in transport-based generative modeling, our approach probabilistically predicts neural responses at scale from multimodal sensory input. Specifically, we learn the conditional distribution of future neural activity given past neural dynamics and concurrent sensory input, explicitly modeling neural activity as a temporally evolving process in which future states depend on recent neural history. We evaluate our framework on the Algonauts project 2025 challenge functional magnetic resonance imaging dataset using subject-specific models. AFM significantly outperforms both a non-autoregressive flow-matching baseline and the official challenge general linear model baseline in predicting short-term parcel-wise blood oxygenation level-dependent (BOLD) activity, demonstrating improved generalization and widespread cortical prediction performance. Ablation analyses show that access to past BOLD dynamics is a dominant driver of performance, while autoregressive factorization yields consistent, modest gains under short-horizon, context-rich conditions. Together, these findings position autoregressive flow-based generative modeling as an effective approach for short-term probabilistic forecasting of neural dynamics with promising applications in closed-loop neurotechnology.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces autoregressive flow matching (AFM) as a generative framework for short-term probabilistic forecasting of parcel-wise BOLD activity. It models the conditional distribution of future neural states given recent history and concurrent multimodal sensory input, trained subject-specifically on the Algonauts 2025 fMRI dataset. AFM is reported to outperform both a non-autoregressive flow-matching baseline and the official GLM challenge baseline, with ablations attributing dominant gains to past-BOLD conditioning and only modest additional benefit from autoregressive factorization under short horizons.
Significance. If the performance gains are robustly attributable to the proposed autoregressive flow-matching construction rather than input differences, the work would provide a scalable transport-based approach to probabilistic neural dynamics modeling with potential utility for closed-loop neurotechnology. The emphasis on explicit temporal evolution and generative sampling distinguishes it from standard encoding models.
Major comments (3)
- [Results / Ablations] Results section (and abstract): the claim that AFM 'significantly outperforms' the non-autoregressive flow-matching baseline and GLM is difficult to interpret without explicit confirmation that both baselines receive identical past-BOLD conditioning. The ablation statement that 'access to past BOLD dynamics is a dominant driver' raises the possibility that reported gains largely reflect this additional temporal input rather than the autoregressive flow-matching mechanism itself; a direct comparison table showing input features for each model is needed.
- [Methods] Methods (model specification): the autoregressive context length is listed among free parameters, yet no sensitivity analysis or justification for the chosen length is provided relative to the short-horizon prediction task. This choice directly affects the 'temporally evolving process' assumption and should be quantified.
- [Evaluation / Experiments] Evaluation: no statistical tests, error bars, or cross-validation details (e.g., subject-wise splits, number of runs) are referenced for the reported outperformance on the Algonauts 2025 dataset, making it impossible to assess whether the modest autoregressive gains are reliable or dataset-specific.
Minor comments (2)
- [Methods] Notation for the conditional flow-matching objective and the autoregressive factorization should be introduced with explicit equations rather than prose description only.
- [Figures] Figure captions for cortical prediction maps should include the exact metric (e.g., Pearson r or MSE) and the number of parcels shown.
Simulated Author's Rebuttal
Thank you for the detailed and constructive review of our manuscript. We appreciate the referee's focus on clarifying experimental controls, model hyperparameters, and evaluation rigor. We address each major comment below and indicate the revisions planned for the next version.
Point-by-point responses
Referee: [Results / Ablations] Results section (and abstract): the claim that AFM 'significantly outperforms' the non-autoregressive flow-matching baseline and GLM is difficult to interpret without explicit confirmation that both baselines receive identical past-BOLD conditioning. The ablation statement that 'access to past BOLD dynamics is a dominant driver' raises the possibility that reported gains largely reflect this additional temporal input rather than the autoregressive flow-matching mechanism itself; a direct comparison table showing input features for each model is needed.
Authors: We thank the referee for identifying this ambiguity. In our setup, the non-autoregressive flow-matching baseline receives identical multimodal sensory inputs and past-BOLD conditioning as the AFM model; the ablation isolating past BOLD was performed by ablating it from the AFM architecture while keeping other factors fixed. The modest gains attributed to autoregressive factorization are therefore measured under matched conditioning. We will add an explicit input-features comparison table in the revised Results and Methods sections to document this for all models (AFM, non-AR FM, and GLM). revision: yes
Referee: [Methods] Methods (model specification): the autoregressive context length is listed among free parameters, yet no sensitivity analysis or justification for the chosen length is provided relative to the short-horizon prediction task. This choice directly affects the 'temporally evolving process' assumption and should be quantified.
Authors: We agree that a sensitivity analysis is needed to support the chosen context length. The value was selected via preliminary validation to balance predictive accuracy and compute for the short-horizon regime. In the revised manuscript we will include a sensitivity plot of performance versus context length together with a brief justification tied to the task horizons. revision: yes
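One way such a context-length sweep could be organized is sketched below on synthetic data (an AR(2) series, not the paper's fMRI pipeline): fit a least-squares one-step predictor per candidate context length and compare errors. Since the generator only looks two steps back, contexts of at least two should help while longer contexts add little.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-in for a BOLD time series: an order-2 autoregression.
T = 2000
x = np.zeros(T)
for t in range(2, T):
    x[t] = 0.6 * x[t - 1] + 0.3 * x[t - 2] + 0.1 * rng.standard_normal()

def forecast_mse(series, context_len):
    """Fit a least-squares linear predictor of series[t] from the previous
    `context_len` samples and return its one-step mean squared error."""
    X = np.stack([series[t - context_len:t]
                  for t in range(context_len, len(series))])
    y = series[context_len:]
    w, *_ = np.linalg.lstsq(X, y, rcond=None)
    return float(np.mean((X @ w - y) ** 2))

errors = {k: forecast_mse(x, k) for k in (1, 2, 4, 8)}
```

Here the error drops sharply from context 1 to context 2 and then plateaus; the revised sensitivity plot would show the analogous curve for AFM on the real data.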
Referee: [Evaluation / Experiments] Evaluation: no statistical tests, error bars, or cross-validation details (e.g., subject-wise splits, number of runs) are referenced for the reported outperformance on the Algonauts 2025 dataset, making it impossible to assess whether the modest autoregressive gains are reliable or dataset-specific.
Authors: We acknowledge the need for these details. Results were obtained via subject-specific models with cross-validation over the dataset runs. The revised manuscript will report error bars (standard error across subjects), paired statistical tests on performance differences, and full cross-validation specifications including subject-wise splits and run counts. revision: yes
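A paired statistical test of the kind promised here can be run without distributional assumptions; the sketch below implements a two-sided paired sign-flip permutation test on per-subject score differences. The scores are hypothetical placeholder values, not results from the paper.

```python
import random
import statistics

def paired_permutation_test(a, b, n_perm=10000, seed=0):
    """Two-sided paired permutation test on per-subject score differences.

    a, b : per-subject performance scores for two models (same subjects).
    Returns the observed mean difference and an approximate p-value
    obtained by randomly flipping the sign of each paired difference.
    """
    rng = random.Random(seed)
    diffs = [ai - bi for ai, bi in zip(a, b)]
    observed = statistics.fmean(diffs)
    count = 0
    for _ in range(n_perm):
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(statistics.fmean(flipped)) >= abs(observed):
            count += 1
    return observed, (count + 1) / (n_perm + 1)

# hypothetical per-subject Pearson r for AFM vs. a baseline (8 subjects)
afm      = [0.31, 0.28, 0.35, 0.30, 0.33, 0.29, 0.32, 0.34]
baseline = [0.27, 0.25, 0.30, 0.28, 0.29, 0.26, 0.28, 0.30]
delta, p = paired_permutation_test(afm, baseline)
```

With every subject improving, the test yields a small p-value; reported alongside per-subject standard errors, it would directly address the referee's reliability concern.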
Circularity Check
No circularity detected in modeling or claims
Full rationale
The paper presents an empirical generative modeling approach (autoregressive flow matching) trained on the Algonauts 2025 fMRI dataset to learn the conditional distribution p(future BOLD | past BOLD + sensory input). Reported performance consists of out-of-sample predictions on held-out data, benchmarked against independent baselines (non-autoregressive flow matching and the official GLM), with explicit ablations quantifying the separate contributions of temporal conditioning versus autoregressive factorization. No equation, result, or central claim reduces to its own inputs by construction, self-definition, or a load-bearing self-citation chain; the derivation is a standard supervised learning pipeline whose outputs are falsifiable against external data and baselines.
Axiom & Free-Parameter Ledger
free parameters (2)
- neural network parameters
- autoregressive context length
axioms (2)
- Domain assumption: Neural activity evolves as a Markovian process conditioned on recent history and concurrent sensory input.
- Domain assumption: Flow matching can approximate the target conditional distribution from finite fMRI samples.