arxiv: 2605.20028 · v1 · pith:HXXRMMRWnew · submitted 2026-05-19 · 💻 cs.LG · physics.ao-ph

Training-Free Bayesian Filtering with Generative Emulators

Pith reviewed 2026-05-20 07:00 UTC · model grok-4.3

classification 💻 cs.LG physics.ao-ph
keywords Bayesian filtering · particle filters · diffusion models · generative emulators · dynamical systems · state estimation · high-dimensional filtering

The pith

Pre-trained diffusion emulators enable training-free optimal particle filters for high-dimensional dynamical systems.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that diffusion models pre-trained as emulators of dynamical systems can be inserted directly into particle filters to perform exact Bayesian state estimation. This sidesteps the usual requirement for extra training or repeated numerical integration of the dynamics at every filter step. The resulting method scales particle filtering to high-dimensional nonlinear and chaotic systems, such as atmospheric models, where classical approaches become impractical.

Core claim

Diffusion-based emulators of dynamical systems can be used to implement, without additional training, an optimal variant of particle filters that has remained largely unexplored due to implementation challenges with classical numerical solvers.

What carries the argument

Diffusion-based generative emulator of the system's transition dynamics, used to draw direct samples of next states inside each iteration of the particle filter.
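
To make the role of the emulator concrete, here is a minimal sketch of one particle filter iteration in which next states are drawn from a sampler rather than integrated numerically. It is the plain bootstrap variant, not the optimal-proposal variant the paper targets, and `emulator_sample` is a hypothetical stand-in (a noisy linear map) for a pre-trained diffusion emulator. The Gaussian full-state observation model and all constants are likewise assumptions.

```python
import numpy as np

def emulator_sample(x, rng):
    """Hypothetical stand-in for a generative emulator: draws x_{t+1} ~ p(. | x_t)."""
    return 0.9 * x + 0.5 * rng.standard_normal(x.shape)

def log_likelihood(y, x, obs_std=0.3):
    """Gaussian log p(y | x) up to a constant; the full state is observed (assumption)."""
    return -0.5 * np.sum((y - x) ** 2, axis=-1) / obs_std**2

def particle_filter_step(particles, y, rng):
    """Propagate with the sampler, weight by the likelihood, then resample."""
    proposed = emulator_sample(particles, rng)
    logw = log_likelihood(y, proposed)
    w = np.exp(logw - logw.max())  # subtract the max for numerical stability
    w /= w.sum()
    idx = rng.choice(len(proposed), size=len(proposed), p=w)  # multinomial resampling
    return proposed[idx], w

rng = np.random.default_rng(0)
particles = rng.standard_normal((512, 4))  # 512 particles, 4-dimensional state
y = np.zeros(4)                            # a made-up observation
particles, weights = particle_filter_step(particles, y, rng)
```

Swapping the stand-in for a real emulator only changes `emulator_sample`; the weighting and resampling logic does not depend on how the transition samples are produced.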

If this is right

  • Scales particle filtering successfully to high-dimensional nonlinear systems.
  • Works on chaotic dynamical systems including atmospheric dynamics.
  • Avoids the need for corrective training or fine-tuning of the emulator.
  • Retains theoretical exactness for nonlinear dynamics and observations.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same plug-in approach might work with other classes of generative models that can sample transitions accurately.
  • It could support real-time filtering tasks where repeated numerical integration is too slow.
  • Accuracy of long-horizon predictions would depend on how well the emulator preserves the system's invariant measures.

Load-bearing premise

The pre-trained diffusion emulator must accurately represent the true transition dynamics of the system so that sampling from it produces an unbiased posterior.

What would settle it

Run the emulator-based filter on a low-dimensional system with an exactly computable posterior and check whether the estimated state distributions match the ground truth within sampling error.
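
One way to set up such a check, sketched under assumptions not taken from the paper: a scalar linear-Gaussian model, for which the Kalman filter gives the exact posterior and the optimal proposal is available in closed form. The analytic sampler below plays the role a diffusion emulator would play, and the parameters `a`, `q`, `r` are illustrative.

```python
import numpy as np

a, q, r = 0.9, 0.5, 0.4   # dynamics coefficient, process noise std, observation noise std
T, N = 50, 20_000         # number of time steps and particles
rng = np.random.default_rng(0)

# Simulate x_{t+1} = a x_t + q eps and y_{t+1} = x_{t+1} + r eta, with x_0 ~ N(0, 1).
x = rng.standard_normal()
ys = []
for _ in range(T):
    x = a * x + q * rng.standard_normal()
    ys.append(x + r * rng.standard_normal())

# Exact reference: scalar Kalman filter.
m, P = 0.0, 1.0
kalman_means, kalman_vars = [], []
for y in ys:
    m_pred, P_pred = a * m, a**2 * P + q**2
    K = P_pred / (P_pred + r**2)
    m, P = m_pred + K * (y - m_pred), (1 - K) * P_pred
    kalman_means.append(m)
    kalman_vars.append(P)

# Particle filter with the optimal proposal p(x_{t+1} | x_t, y_{t+1}).
particles = rng.standard_normal(N)
s2 = 1.0 / (1.0 / q**2 + 1.0 / r**2)  # variance of the optimal proposal
pf_means, pf_vars = [], []
for y in ys:
    # Weight parents by the predictive likelihood p(y_{t+1} | x_t) = N(a x_t, q^2 + r^2).
    logw = -0.5 * (y - a * particles) ** 2 / (q**2 + r**2)
    w = np.exp(logw - logw.max())
    w /= w.sum()
    parents = particles[rng.choice(N, size=N, p=w)]
    # Sample children from the optimal proposal; they are then equally weighted.
    mean = s2 * (a * parents / q**2 + y / r**2)
    particles = mean + np.sqrt(s2) * rng.standard_normal(N)
    pf_means.append(particles.mean())
    pf_vars.append(particles.var())

max_mean_err = np.max(np.abs(np.array(pf_means) - np.array(kalman_means)))
max_var_err = np.max(np.abs(np.array(pf_vars) - np.array(kalman_vars)))
print(max_mean_err, max_var_err)
```

Both discrepancies should shrink roughly as one over the square root of the particle count. A persistent gap after replacing the analytic sampler with a learned emulator would point to emulator bias rather than Monte Carlo noise.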

Figures

Figures reproduced from arXiv: 2605.20028 by François Rozet, Gilles Louppe, Thomas Savary.

Figure 1. Evolution of the average skill as a function of the number of ensemble members. Our method (brown curve on …
Figure 2. Ground truth, coarse 32 × 32 observation, FA-APF ensemble mean and BPF ensemble mean at the last step of a filtering experiment. Examples of trajectories are given in Appendix D. Like FlowDAS, our method can be adapted to the stochastic interpolant framework (Albergo et al., 2025), as detailed in Appendix E. This framework corresponds to a family of generative models that generalize diffusion and flow mat…
Figure 3. Comparison of the 10m U component of wind between the reference ERA5 trajectory (first row), the FA-APF ensemble mean obtained with realistic observations (second row), and the GenCast ensemble mean (third row) after 3, 7, and 15 days. 4.3. Medium-range weather forecasts (GenCast). In the final experiment, we apply our method in a real-world scenario by leveraging the denoiser from GenCast (Price et al., 20…
Figure 4. Evolution of the skill for two surface variables (U component of wind and temperature) across successive assimilation steps. Results are shown for our filtering method under both observation scenarios and for an ensemble of forecasts with the same ensemble size.
Figure 5. Ground truth, 32 × 32 coarse observation, and FA-APF ensemble mean at different time steps during a filtering experiment. Panel labels: Ground truth, Observations, FA-APF; t = 1, 3, 5, 10.
Figure 6. Ground truth, 16 × 16 coarse observation, and FA-APF ensemble mean at different time steps during a filtering experiment.
Figure 7. Skill for temperature, geopotential, V component of wind and specific humidity at three different pressure levels (100, 250 and 850 hPa). For experiments with sparse temperature observations (blue and green curves), the skill reaches a plateau after a certain number of time steps for all variables (even those that are not observed), well below the one of GenCast’s forecasts.
Figure 8. Spread-to-skill ratio for temperature, geopotential, V component of wind and specific humidity at three different pressure levels (100, 250 and 850 hPa). For experiments with sparse temperature observations (blue and green curves), the ratio is close to 1, indicating that ensembles are well calibrated.
Figure 9. Comparison of the geopotential at 500 hPa between the reference ERA5 trajectory (first row), the FA-APF ensemble mean with realistic observations (second row), and the GenCast ensemble mean (third row) after 3, 7, and 15 days. The ensemble mean of FA-APF remains qualitatively close to the ground truth, even under difficult observation conditions.
Figure 10. Comparison of surface temperature between the reference ERA5 trajectory (first row), the FA-APF ensemble mean with realistic observations (second row), and the GenCast ensemble mean (third row) after 3, 7, and 15 days. The ensemble mean of FA-APF remains qualitatively close to the ground truth, even under difficult observation conditions.
Figure 11. Comparison between the distributions of conditional samples (red curve, generated using the optimal proposal) and unconditional samples (blue curve, generated with GenCast without conditioning) at an observed grid point (in North America). Observations (black dotted lines) are more likely in the distribution of conditional samples. Horizontal axis: 10m U wind component [m/s].
Figure 12. Comparison between the distributions of conditional samples (red curve, generated using the optimal proposal) and unconditional samples (blue curve, generated with GenCast without conditioning) at an unobserved grid point (in North Africa). Observations (black dotted lines) are more likely in the distribution of conditional samples.
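
The captions refer to a skill and a spread-to-skill ratio without defining them. A minimal sketch of one common convention follows, and it is an assumption that the paper uses the same one: skill is the RMSE of the ensemble mean, spread is the square root of the mean ensemble variance, and the ratio carries a finite-ensemble correction factor sqrt((n + 1) / n). The function name and the synthetic data are illustrative.

```python
import numpy as np

def spread_skill_ratio(ensemble, truth):
    """ensemble: array of shape (members, ...); truth: array of shape (...)."""
    n = ensemble.shape[0]
    # Skill: RMSE of the ensemble mean against the truth.
    skill = np.sqrt(np.mean((ensemble.mean(axis=0) - truth) ** 2))
    # Spread: square root of the average unbiased ensemble variance.
    spread = np.sqrt(np.mean(ensemble.var(axis=0, ddof=1)))
    # Finite-ensemble correction so a calibrated ensemble gives a ratio near 1.
    return np.sqrt((n + 1) / n) * spread / skill

# Synthetic calibrated case: truth and members share the same distribution.
rng = np.random.default_rng(0)
n_members, n_points = 16, 20_000
center = rng.standard_normal(n_points)
truth = center + rng.standard_normal(n_points)
ensemble = center + rng.standard_normal((n_members, n_points))
ratio = spread_skill_ratio(ensemble, truth)
print(ratio)
```

Ratios well below 1 indicate an under-dispersive (overconfident) ensemble; ratios well above 1 indicate an over-dispersive one.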
Abstract

Bayesian filtering is a well-known problem that aims to estimate plausible states of a dynamical system from observations. Among existing approaches to solve this problem, particle filters are theoretically exact for non-linear dynamics and observations, but suffer from poor scalability in high dimensions. In this work, we show that diffusion-based emulators of dynamical systems can be used to implement, without additional training, an optimal variant of particle filters that has remained largely unexplored due to implementation challenges with classical numerical solvers. Experiments on nonlinear chaotic systems, including atmospheric dynamics, demonstrate that the proposed approach successfully scales particle filtering to high-dimensional settings.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The paper claims that pre-trained diffusion-based generative emulators of dynamical systems can be used without further training to implement an optimal particle filter for Bayesian filtering. This is positioned as overcoming implementation challenges of classical numerical solvers for optimal proposals, with experiments demonstrating successful scaling to high-dimensional nonlinear chaotic systems including atmospheric dynamics.

Significance. If the central claim holds with the required exactness, the work would be significant for enabling scalable, theoretically grounded filtering in high-dimensional chaotic systems such as those in data assimilation and geosciences. The training-free reuse of existing emulators is a clear strength, as is the focus on optimal proposals that have been underexplored due to solver difficulties.

major comments (2)
  1. [Experiments] Experiments section: the reported success on nonlinear chaotic systems and atmospheric dynamics provides no quantitative metrics (e.g., RMSE, ESS, or log-likelihood), error bars, or direct comparisons against a ground-truth simulator, preventing verification that the emulator sampling produces an unbiased posterior.
  2. [§3 (Method)] §3 (Method): the derivation of the training-free optimal particle filter assumes the diffusion emulator exactly reproduces the transition kernel p(x_{t+1}|x_t) or optimal proposal so that importance weights remain valid; however, the variational training objective and finite-step reverse SDE introduce residual approximation error that can be amplified by Lyapunov instability in chaotic regimes, and no bias analysis or corrective bound is supplied.
minor comments (1)
  1. [Abstract] Abstract: the phrase 'successfully scales particle filtering' would be strengthened by naming the specific systems and at least one performance measure.
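
For reference, two of the diagnostics named in the first major comment have standard definitions that are cheap to compute from filter output. The sketch below uses hypothetical helper names and is not tied to the paper's code: effective sample size from unnormalized log-weights, and RMSE of a state estimate.

```python
import numpy as np

def effective_sample_size(log_weights):
    """ESS = 1 / sum(w_i^2) for normalized weights w_i; ranges from 1 to N."""
    logw = np.asarray(log_weights, dtype=float)
    w = np.exp(logw - logw.max())  # stabilize before normalizing
    w /= w.sum()
    return 1.0 / np.sum(w**2)

def rmse(estimate, truth):
    """Root-mean-square error between an estimate and the reference state."""
    estimate, truth = np.asarray(estimate, dtype=float), np.asarray(truth, dtype=float)
    return np.sqrt(np.mean((estimate - truth) ** 2))

uniform_ess = effective_sample_size(np.zeros(100))               # equal weights
collapsed_ess = effective_sample_size([0.0, -1000.0, -1000.0])   # one dominant particle
error = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

An ESS close to the particle count indicates well-balanced weights, while an ESS near 1 signals the weight collapse that makes particle filters hard to scale in high dimensions.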

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for their constructive and insightful comments. We address each major comment point by point below, indicating where revisions will be incorporated.

point-by-point responses
  1. Referee: [Experiments] Experiments section: the reported success on nonlinear chaotic systems and atmospheric dynamics provides no quantitative metrics (e.g., RMSE, ESS, or log-likelihood), error bars, or direct comparisons against a ground-truth simulator, preventing verification that the emulator sampling produces an unbiased posterior.

    Authors: We agree that the experiments would be strengthened by quantitative metrics and comparisons. In the revised manuscript we will report RMSE, effective sample size (ESS), and log-likelihood values with error bars from repeated runs. We will also add direct comparisons against particle filtering performed with the ground-truth simulator to assess posterior bias. revision: yes

  2. Referee: [§3 (Method)] §3 (Method): the derivation of the training-free optimal particle filter assumes the diffusion emulator exactly reproduces the transition kernel p(x_{t+1}|x_t) or optimal proposal so that importance weights remain valid; however, the variational training objective and finite-step reverse SDE introduce residual approximation error that can be amplified by Lyapunov instability in chaotic regimes, and no bias analysis or corrective bound is supplied.

    Authors: The referee correctly notes that our derivation assumes the emulator provides a sufficiently accurate approximation to the transition kernel. We will revise §3 to explicitly state this assumption, discuss the sources of residual error from variational training and finite-step discretization, and analyze how such errors may be amplified under chaotic dynamics. A preliminary bias discussion and weight-validity bound will be added; however, a fully rigorous corrective bound that accounts for arbitrary Lyapunov instability remains an open theoretical question. revision: partial

standing simulated objections not resolved
  • A complete, rigorous bias bound that holds uniformly under Lyapunov instability in chaotic regimes.

Circularity Check

0 steps flagged

No significant circularity; relies on independent pre-trained emulators

full rationale

The derivation chain uses pre-trained diffusion emulators as an external component to implement particle filtering without additional training. No steps reduce by construction to fitted parameters, self-definitions, or load-bearing self-citations; the central claim depends on the separate accuracy of those emulators rather than deriving the posterior from the paper's own outputs. This is a standard non-circular finding for a method that composes existing generative models with classical filtering.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the assumption that a pre-trained diffusion emulator can serve as an unbiased transition model inside the particle filter. No free parameters or new entities are introduced in the abstract; the main unstated premise is that the emulator's learned distribution matches the true dynamics closely enough for filtering.

axioms (1)
  • domain assumption: The diffusion-based emulator accurately captures the system's transition dynamics without bias for the purposes of particle filtering.
    Invoked implicitly when stating that the emulator can be used directly to implement the filter without additional training.

pith-pipeline@v0.9.0 · 5623 in / 1224 out tokens · 34542 ms · 2026-05-20T07:00:52.388939+00:00 · methodology

Reference graph

Works this paper leans on

18 extracted references · 18 canonical work pages

  1. [1]

    URL https://www.sciencedirect.com/science/article/pii/S0045782524007023

    doi: https://doi.org/10.1016/j.cma.2024.117447. URL https://www.sciencedirect.com/science/article/pii/S0045782524007023. Barros, S., Dent, D., Isaksen, L., Robinson, G., Mozdzynski, G., and Wollenweber, F. The IFS model: A parallel production weather code. Parallel Computing, 21(10):1621–1638, 1995. ISSN 0167-8191. doi: https://doi.org/10.1016/0167-81...

  2. [2]

    ISBN 0471007102

    ed edition, 1995. ISBN 0471007102. URL http://gso.gbv.de/DB=2.1/CMD?ACT=SRCHA&SRT=YOP&IKT=1016&TRM=ppn+164761632&sourceid=fbw_bibsonomy. Brousseau, P., Vogt, V., Arbogast, E., Martet, M., Thomas, G., and Berre, L. The operational 3DEnVar data assimilation scheme for the Météo-France convective scale model AROME-France. EGUsphere [preprint], 2025. ...

  3. [3]

    doi: https://doi.org/10.1016/j.physd.2011.06

  4. [4]

    Chen, S., Jia, Y., Qu, Q., Sun, H., and Fessler, J

    URL https://www.sciencedirect.com/science/article/pii/S016727891100145X. Chen, S., Jia, Y., Qu, Q., Sun, H., and Fessler, J. A. FlowDAS: A Stochastic Interpolant-based Framework for Data Assimilation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems,

  5. [5]

    Chen, Y., Goldstein, M., Hua, M., Albergo, M

    URL https://openreview.net/forum?id=1nWqhiulqD. Chen, Y., Goldstein, M., Hua, M., Albergo, M. S., Boffi, N. M., and Vanden-Eijnden, E. Probabilistic Forecasting with Stochastic Interpolants and Föllmer Processes. In Salakhutdinov, R., Kolter, Z., Heller, K., Weller, A., Oliver, N., Scarlett, J., and Berkenkamp, F. (eds.), Proceedings of the 41st Int...

  6. [6]

    Chorin, A

    URL https://proceedings.mlr.press/v235/chen24n.html. Chorin, A. J. Numerical Solution of the Navier-Stokes Equations. Mathematics of Computation, 22(104):745–762, 1968. ISSN 00255718, 10886842. URL http://www.jstor.org/stable/2004575. Chung, H., Kim, J., Mccann, M. T., Klasky, M. L., and Ye, J...

  7. [7]

    Huang, L., Gianinazzi, L., Yu, Y., Dueben, P

    Curran Associates, Inc., 2020. Huang, L., Gianinazzi, L., Yu, Y., Dueben, P. D., and Hoefler, T. DiffDA: a Diffusion model for weather-scale Data Assimilation. In Proceedings of the 41st International Conference on Machine Learning, ICML’24. JMLR.org, 2024. Hunt, B. R., Kostelich, E. J., and Szunyogh, I. Efficient data assimilation for spatiotemporal ch...

  8. [8]

    Le Dimet, F.-X

    URL https://www.climatechange.ai/papers/iclr2025/36. Le Dimet, F.-X. and Talagrand, O. Variational algorithms for analysis and assimilation of meteorological observations: theoretical aspects. Tellus A, 38A(2):97–110, 1986. doi: https://doi.org/10.1111/j.1600-0870.1986.tb00459.x. URL https://onlinelibrary.wiley.com/doi/abs/10.1111/j.1600-0870.1986.tb0...

  9. [9]

    Leutbecher, M

    URL https://proceedings.mlr.press/v202/lemos23a.html. Leutbecher, M. Ensemble size: How suboptimal is less than infinity? Quarterly Journal of the Royal Meteorological Society, 145(S1):107–128, 2019. doi: https://doi.org/10.1002/qj.3387. URL https://rmets.onlinelibrary.wiley.com/doi/abs/10.1002/qj.3387. Lipman, Y., Chen, R. T. Q., Ben-Hamu, H., Nicke...

  10. [10]

    URL https://rmets.onlinelibrary.wiley

    doi: https://doi.org/10.1002/qj.49711247414. URL https://rmets.onlinelibrary.wiley.com/doi/abs/10.1002/qj.49711247414. Lorenz, E. N. Deterministic Nonperiodic Flow. Journal of Atmospheric Sciences, 20(2):130–141, 1963. doi: 10.1175/1520-0469(1963)020<0130:DNF>2.0.CO;2. URL https://journals.ametsoc.org/view/journals/atsc/20/2/1520-0469_1963_020_0130_dn...

  11. [11]

    Rozet, F., Ohana, R., McCabe, M., Louppe, G., Lanusse, F., and Ho, S

    URL https://openreview.net/forum?id=7v88Fh6iSM. Rozet, F., Ohana, R., McCabe, M., Louppe, G., Lanusse, F., and Ho, S. Lost in Latent Space: An Empirical Study of Latent Diffusion Models for Physics Emulation. In The Thirty-ninth Annual Conference on Neural Information Processing Systems, 2025. URL https://openreview.net/forum?id=xoNrbfbekM. ...

  12. [12]

    Sharief, S., Zeghal, J., Barco, G

    URL https://www.climatechange.ai/papers/neurips2025/39. Sharief, S., Zeghal, J., Barco, G. M., Lemos, P., Hezaveh, Y., and Perreault-Levasseur, L. MIRA: A Score for Conditional Distribution Accuracy and Model Comparison, 2026. URL https://arxiv.org/abs/2605.02014. Si, P. and Chen, P. Latent-EnSF: A Latent Ensemble Score Filter for High-Dimensional ...

  13. [13]

    Slivinski, L

    URL https://openreview.net/forum?id=urcEYsZOBz. Slivinski, L. and Snyder, C. Exploring Practical Estimates of the Ensemble Size Necessary for Particle Filters. Monthly Weather Review, 144(3):861–875, 2016. doi: 10.1175/MWR-D-14-00303.1. URL https://journals.ametsoc.org/view/journals/mwre/144/3/mwr-d-14-00303.1.xml. Snyder, C., Bengtsson, T., Bickel, ...

  14. [14]

    URL https://journals.ametsoc.org/view/journals/mwre/136/12/2008mwr2529.1.xml

    doi: 10.1175/2008MWR2529.1. URL https://journals.ametsoc.org/view/journals/mwre/136/12/2008mwr2529.1.xml. Snyder, C., Bengtsson, T., and Morzfeld, M. Performance Bounds for Particle Filters Using the Optimal Proposal. Monthly Weather Review, 143(11):4750–4761, 2015. doi: 10.1175/MWR-D-15-0144.1. URL https://journals.ametsoc.org/view/journals/mwre/14...

  15. [15]

    Transue, T., Chen, B., Takao, S., and Wang, B

    URL https://proceedings.mlr.press/v70/tompson17a.html. Transue, T., Chen, B., Takao, S., and Wang, B. Flow Matching for Efficient and Scalable Data Assimilation, 2025. URL https://arxiv.org/abs/2508.13313. van der Vorst, H. A. Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems. SIAM Journal on Sc...

  16. [16]

    ISBN 9783319251370

    Springer, Heidelberg, 2015. ISBN 9783319251370. doi: 10.1007/978-3-319-25138-7. URL https://centaur.reading.ac.uk/50238/. van Leeuwen, P. J., Künsch, H. R., Nerger, L., Potthast, R., and Reich, S. Particle filters for high-dimensional geoscience applications: A review. Quarterly Journal of the Royal Meteorological Society, 145(723):2335–2365, 2019. d...

  17. [17]

    Zheng, H., Chu, W., Wang, A., Kovachki, N

    URL https://openreview.net/forum?id=Loek7hfb46P. Zheng, H., Chu, W., Wang, A., Kovachki, N. B., Baptista, R., and Yue, Y. Ensemble Kalman Diffusion Guidance: A Derivative-free Method for Inverse Problems. Transactions on Machine Learning Research, 2025. ISSN 2835-

  18. [18]

    A. Tweedie’s formulas

    URL https://openreview.net/forum?id=XPEEsKneKs. A. Tweedie’s formulas. Theorem A.1. Assuming that $p_t(x_t \mid x) = \mathcal{N}(x_t \mid \alpha_t x, \Sigma_t)$, the first and second moments of $p_t(x \mid x_t)$ are linked to the score function $\nabla_{x_t} \log p_t(x_t)$ used in Equation (11) through $\mathbb{E}[x \mid x_t] = \alpha_t^{-1} \left[ x_t + \Sigma_t \nabla_{x_t} \log p_t(x_t) \right]$, (33) V[x | x ...
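
The first-moment identity of Theorem A.1 can be sanity-checked in a scalar Gaussian case where everything is available in closed form. The sketch below assumes a prior x ~ N(0, s2), which is not part of the theorem, and illustrative values for `alpha`, `sigma2` and `s2`. It compares the Tweedie expression with the posterior mean from standard Gaussian conditioning.

```python
import numpy as np

alpha, sigma2, s2 = 0.8, 0.3, 2.0  # scale, noise variance, prior variance (assumptions)
x_t = np.linspace(-3.0, 3.0, 7)    # a few noisy states at which to evaluate the identity

# With x ~ N(0, s2) and x_t | x ~ N(alpha * x, sigma2), the marginal is
# p_t(x_t) = N(0, alpha^2 * s2 + sigma2), so its score is linear in x_t.
marginal_var = alpha**2 * s2 + sigma2
score = -x_t / marginal_var

# Tweedie's first-moment formula: E[x | x_t] = (x_t + sigma2 * score) / alpha.
tweedie_mean = (x_t + sigma2 * score) / alpha

# Reference: posterior mean from standard Gaussian conditioning.
posterior_mean = alpha * s2 * x_t / marginal_var

max_gap = np.max(np.abs(tweedie_mean - posterior_mean))
print(max_gap)
```

The two expressions agree up to floating-point error, which is what the identity predicts whenever the score of the marginal is exact.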