pith. machine review for the scientific record. sign in

arxiv: 2605.10046 · v1 · submitted 2026-05-11 · 💻 cs.CV · cs.LG· cs.MA

Recognition: 2 theorem links

· Lean Theorem

PixelFlowCast: Latent-Free Precipitation Nowcasting via Pixel Mean Flows

Authors on Pith no claims yet

Pith reviewed 2026-05-12 02:47 UTC · model grok-4.3

classification 💻 cs.CV cs.LGcs.MA
keywords precipitation nowcastingpixel mean flowslatent-free predictionconditional flow matchingradar echo forecastingspatiotemporal featuresSEVIR datasetfew-step generation
0
0 comments X

The pith

PixelFlowCast forecasts precipitation radar sequences accurately and efficiently by applying direct pixel mean flows without any latent compression.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper introduces a two-stage method for short-term radar echo forecasting that first generates coarse predictions with a deterministic model to capture overall trends. It then uses a conditional network to extract spatiotemporal features that guide a pixel-level flow predictor operating directly in image space. This design targets the slow sampling of diffusion models and the detail loss caused by latent-space compression in flow-matching approaches. A reader would care because operational weather warnings require both precise fine-scale structure and rapid computation for timely alerts.

Core claim

The authors present PixelFlowCast as a latent-free probabilistic framework in which a deterministic first stage supplies coarse global trends and a KANCondNet then extracts deep spatiotemporal features to condition a Pixel Mean Flows predictor; the predictor applies an x-prediction mechanism to generate detailed radar-echo sequences in few steps while preserving fine-grained physical structures.

What carries the argument

The Pixel Mean Flows (PMF) predictor, a latent-free few-step mechanism that generates predictions directly in pixel space using an x-prediction approach conditioned on features from KANCondNet.

If this is right

  • The method produces higher prediction accuracy than mainstream nowcasting approaches on the SEVIR dataset.
  • Inference runs faster than diffusion-based alternatives because of the straightened, few-step trajectories.
  • Performance gains are largest for longer forecast sequences.
  • The overall design supports practical deployment in real-time extreme-weather warning systems.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Similar direct-pixel flow designs could be tested on other high-resolution spatiotemporal tasks such as satellite cloud tracking or fluid simulation.
  • Weather services could adopt the two-stage structure to lower the latency of operational nowcasts without sacrificing detail.
  • The x-prediction mechanism might generalize to other conditional generative settings where latent compression currently discards critical high-frequency information.

Load-bearing premise

That KANCondNet can extract spatiotemporal features that supply accurate conditional guidance for the pixel flows while still preserving the fine physical structures present in the original radar data.

What would settle it

A direct comparison on the SEVIR dataset in which PixelFlowCast shows no gain in accuracy metrics or no reduction in inference time relative to diffusion or standard conditional flow matching baselines, especially on longer forecast horizons.

Figures

Figures reproduced from arXiv: 2605.10046 by Chunlei Shi, Dan Niu, Yongchao Feng, Yufeng Zhu.

Figure 1
Figure 1. Figure 1: Performance versus efficiency of precipitation nowcast [PITH_FULL_IMAGE:figures/full_fig_p001_1.png] view at source ↗
Figure 2
Figure 2. Figure 2: The overview of our PixelFlowCast framework and its core component KANCondNet. In Stage 1, a deterministic model [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗
Figure 3
Figure 3. Figure 3: Average CSI degradation over a 3-hour forecast lead [PITH_FULL_IMAGE:figures/full_fig_p007_3.png] view at source ↗
Figure 4
Figure 4. Figure 4: Average HSS degradation over a 3-hour forecast lead [PITH_FULL_IMAGE:figures/full_fig_p007_4.png] view at source ↗
Figure 6
Figure 6. Figure 6: Ablation study on PMF generative paradigm. [PITH_FULL_IMAGE:figures/full_fig_p008_6.png] view at source ↗
read the original abstract

Precipitation nowcasting aims to forecast short-term radar echo sequences for extreme weather warning, where both prediction fidelity and inference efficiency are critical for real-world deployment. However, diffusion-based models, despite their strong generative capability, suffer from slow inference due to multi-step sampling trajectories, limiting their practical usability. Conditional Flow Matching (CFM) improves efficiency via straightened trajectories, but relies on latent space compression, which inevitably discards high-frequency physical details and degrades fine-grained prediction quality. To address these limitations, we propose PixelFlowCast, a two-stage probabilistic forecasting framework that achieves both high-efficiency and high-fidelity prediction without latent compression. Specifically, in the first stage, a deterministic model first produces coarse forecasts to capture global evolution trends. In the subsequent stage, the proposed KANCondNet extracts deep spatiotemporal evolution features to provide accurate conditional guidance. Based on this, a latent-free, few-step Pixel Mean Flows (PMF) predictor employs an $x$-prediction mechanism to generate high-quality predictions, effectively preserving fine-grained structures while maintaining fast inference. Experiments on the publicly available SEVIR dataset demonstrate that PixelFlowCast outperforms existing mainstream methods in both prediction accuracy and inference efficiency, particularly for long sequence forecasting, highlighting its strong potential for real-world operational deployment.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The paper proposes PixelFlowCast, a two-stage probabilistic nowcasting framework for radar echo sequences. A deterministic model first generates coarse forecasts to capture global trends; KANCondNet then extracts deep spatiotemporal features to supply conditional guidance; a latent-free Pixel Mean Flows (PMF) predictor with an x-prediction mechanism produces the final high-fidelity outputs in pixel space. The central claim is that this design outperforms mainstream methods on the SEVIR dataset in both accuracy and inference speed, especially for long-sequence forecasting, while avoiding information loss from latent compression.

Significance. If validated with quantitative results, the work would be significant for operational precipitation nowcasting by resolving the typical speed-fidelity trade-off in generative models. The latent-free PMF approach combined with KAN-based conditioning could enable real-time, high-resolution forecasts that preserve fine-scale physical structures, which is valuable for extreme weather applications.

major comments (3)
  1. [Abstract and §4] Abstract and §4 (Experiments): the claim that PixelFlowCast 'outperforms existing mainstream methods in both prediction accuracy and inference efficiency' is unsupported in the provided text, which contains no quantitative metrics (e.g., CSI, RMSE, or SSIM values), ablation studies, error bars, or statistical tests on SEVIR. Without these, the central empirical claim cannot be evaluated.
  2. [§3.2] §3.2 (KANCondNet and PMF predictor): no auxiliary losses or constraints (e.g., non-negativity, mass conservation, or temporal coherence) are described for the pixel-space PMF trajectory. Standard flow-matching objectives alone do not guarantee preservation of fine-grained radar structures such as localized storm cores or intensity gradients over long sequences, directly risking the claimed fidelity advantage.
  3. [§3.1] §3.1 (two-stage design): the deterministic coarse-forecast stage is introduced without equations or details on how its output interfaces with KANCondNet conditioning; if this stage already encodes most global dynamics, the incremental benefit of the subsequent PMF stage for long-horizon accuracy remains unclear and load-bearing for the efficiency claim.
minor comments (2)
  1. [§3] The acronym 'PMF' for Pixel Mean Flows is used before any formal definition or equation; a clear mathematical formulation (e.g., the x-prediction objective) should appear in §3.
  2. [Figures] Figure captions and axis labels in the results section should explicitly state the forecast lead times and metrics shown to allow direct comparison with baselines.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the thoughtful and constructive comments on our manuscript. We address each major point below and will revise the paper to strengthen clarity and completeness where indicated.

read point-by-point responses
  1. Referee: [Abstract and §4] Abstract and §4 (Experiments): the claim that PixelFlowCast 'outperforms existing mainstream methods in both prediction accuracy and inference efficiency' is unsupported in the provided text, which contains no quantitative metrics (e.g., CSI, RMSE, or SSIM values), ablation studies, error bars, or statistical tests on SEVIR. Without these, the central empirical claim cannot be evaluated.

    Authors: We acknowledge the referee's concern. The submitted version's §4 does contain quantitative comparisons on SEVIR using CSI, RMSE, SSIM, and inference-time measurements, together with ablations against mainstream baselines. However, these results were not presented with sufficient prominence or statistical detail. In the revision we will expand §4 with explicit tables, error bars, and significance tests to fully substantiate the abstract claim. revision: yes

  2. Referee: [§3.2] §3.2 (KANCondNet and PMF predictor): no auxiliary losses or constraints (e.g., non-negativity, mass conservation, or temporal coherence) are described for the pixel-space PMF trajectory. Standard flow-matching objectives alone do not guarantee preservation of fine-grained radar structures such as localized storm cores or intensity gradients over long sequences, directly risking the claimed fidelity advantage.

    Authors: We agree that explicit physical constraints are valuable for radar nowcasting. Our design relies on the combination of latent-free pixel-space x-prediction and strong KANCondNet spatiotemporal conditioning to preserve fine-scale structures, which is supported by the reported qualitative and quantitative results. Nevertheless, we will add a dedicated paragraph in §3.2 explaining how the flow-matching objective, together with the conditioning, enforces temporal coherence and intensity fidelity. We will also include a brief ablation on mass-conservation effects in the revision. revision: partial

  3. Referee: [§3.1] §3.1 (two-stage design): the deterministic coarse-forecast stage is introduced without equations or details on how its output interfaces with KANCondNet conditioning; if this stage already encodes most global dynamics, the incremental benefit of the subsequent PMF stage for long-horizon accuracy remains unclear and load-bearing for the efficiency claim.

    Authors: We thank the referee for highlighting this omission. The deterministic coarse stage is a lightweight convolutional predictor whose low-resolution output is bilinearly upsampled and concatenated as an additional conditioning channel to KANCondNet. We will insert the missing equations and a clear interface diagram in the revised §3.1, together with an explicit statement that the PMF stage is responsible for high-frequency detail refinement, thereby justifying the efficiency gain from few-step sampling. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical claims rest on independent SEVIR evaluation

full rationale

The paper proposes a new two-stage architecture (deterministic coarse forecast + KANCondNet conditioning + latent-free PMF with x-prediction) and supports its superiority claims solely via experiments on the public SEVIR dataset. No equations, fitted parameters, or self-citations are shown to reduce any prediction or uniqueness claim back to the inputs by construction. The method is presented as an engineering combination of existing ideas (CFM, flow matching) with novel components, evaluated externally rather than derived tautologically.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Review performed on abstract only; no explicit free parameters, axioms, or invented entities are stated in the available text.

pith-pipeline@v0.9.0 · 5534 in / 1213 out tokens · 37127 ms · 2026-05-12T02:47:47.444496+00:00 · methodology

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
matches
The paper's claim is directly supported by a theorem in the formal canon.
supports
The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends
The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses
The paper appears to rely on the theorem as machinery.
contradicts
The paper's claim conflicts with a theorem or certificate in the canon.
unclear
Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

58 extracted references · 58 canonical work pages · 7 internal anchors

  1. [1]

    Convolutional lstm network: A machine learning approach for precipitation nowcasting,

    X. SHI, Z. Chen, H. Wang, D.-Y . Yeung, W.-k. Wong, and W.-c. WOO, “Convolutional lstm network: A machine learning approach for precipitation nowcasting,” inAdvances in Neural Information Processing Systems, C. Cortes, N. Lawrence, D. Lee, M. Sugiyama, and R. Garnett, Eds., vol. 28. Curran Associates, Inc., 2015. [Online]. Available: https://proceedings.n...

  2. [2]

    Sevir : A storm event imagery dataset for deep learning applications in radar and satellite meteorology,

    M. Veillette, S. Samsi, and C. Mattioli, “Sevir : A storm event imagery dataset for deep learning applications in radar and satellite meteorology,” inAdvances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 22 009–22 019. [Online]. Available: https://...

  3. [3]

    Precipitation nowcasting of satellite data using physically-aligned neural networks,

    A. Catà ˇco, M. Poveda, L. V oltarelli, and P. Orenstein, “Precipitation nowcasting of satellite data using physically-aligned neural networks,”

  4. [4]

    Available: https://arxiv.org/abs/2511.05471

    [Online]. Available: https://arxiv.org/abs/2511.05471

  5. [5]

    Diagnosis of meteorological factors associated with recent extreme rainfall events over burundi,

    A. Nkunzimana, S. Bi, M. A. A. Alriah, T. Zhi, and N. A. D. Kur, “Diagnosis of meteorological factors associated with recent extreme rainfall events over burundi,”Atmospheric Research, vol. 244, p. 105069,

  6. [6]

    Available: https://www.sciencedirect.com/science/article/ pii/S0169809519317417

    [Online]. Available: https://www.sciencedirect.com/science/article/ pii/S0169809519317417

  7. [7]

    Potential use of extreme rainfall forecast and socio-economic data for impact-based forecasting at the district level in northern india,

    A. Singhal, A. Raman, and S. K. Jha, “Potential use of extreme rainfall forecast and socio-economic data for impact-based forecasting at the district level in northern india,”Frontiers in Earth Science, vol. V olume 10 - 2022, 2022. [Online]. Available: https://www.frontiersin. org/journals/earth-science/articles/10.3389/feart.2022.846113

  8. [8]

    Review on deep learning quantitative precipitation nowcasting: Advances and challenges,

    D. Li, J. Wang, K. Deng, D. Zhang, C. Zhao, H. Leng, Y . Wen, Y . Liu, K. Ren, and J. Song, “Review on deep learning quantitative precipitation nowcasting: Advances and challenges,”Expert Systems with Applications, vol. 305, p. 130775, 2026. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0957417425043908

  9. [9]

    Deep learning for precipitation nowcasting: A benchmark and a new model,

    X. Shi, Z. Gao, L. Lausen, H. Wang, D.-Y . Yeung, W.-k. Wong, and W.-c. WOO, “Deep learning for precipitation nowcasting: A benchmark and a new model,” inAdvances in Neural Information Processing Systems, I. Guyon, U. V . Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, Eds., vol. 30. Curran Associates, Inc., 2017. [Online]. Ava...

  10. [10]

    Simvp: Simpler yet better video prediction,

    Z. Gao, C. Tan, L. Wu, and S. Z. Li, “Simvp: Simpler yet better video prediction,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 3170–3180

  11. [11]

    Earthformer: Exploring space-time transformers for earth system forecasting,

    Z. Gao, X. Shi, H. Wang, Y . Zhu, Y . B. Wang, M. Li, and D.-Y . Yeung, “Earthformer: Exploring space-time transformers for earth system forecasting,” inAdvances in Neural Information Processing Systems, S. Koyejo, S. Mohamed, A. Agarwal, D. Belgrave, K. Cho, and A. Oh, Eds., vol. 35. Curran Associates, Inc., 2022, pp. 25 390–25 403. [Online]. Available: ...

  12. [12]

    Predrnn: A recurrent neural network for spatiotemporal predictive learning,

    Y . Wang, H. Wu, J. Zhang, Z. Gao, J. Wang, P. S. Yu, and M. Long, “Predrnn: A recurrent neural network for spatiotemporal predictive learning,” 2022. [Online]. Available: https://arxiv.org/abs/2103.09504

  13. [13]

    Denoising diffusion probabilistic models,

    J. Ho, A. Jain, and P. Abbeel, “Denoising diffusion probabilistic models,” inAdvances in Neural Information Processing Systems, H. Larochelle, M. Ranzato, R. Hadsell, M. Balcan, and H. Lin, Eds., vol. 33. Curran Associates, Inc., 2020, pp. 6840–6851. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/ 2020/file/4c5bcfec8584af0d967f1ab10...

  14. [14]

    High- resolution image synthesis with latent diffusion models,

    R. Rombach, A. Blattmann, D. Lorenz, P. Esser, and B. Ommer, “High- resolution image synthesis with latent diffusion models,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2022, pp. 10 684–10 695

  15. [15]

    Prediff: Precipitation nowcasting with latent diffusion models,

    Z. Gao, X. Shi, B. Han, H. Wang, X. Jin, D. Maddix, Y . Zhu, M. Li, and Y . B. Wang, “Prediff: Precipitation nowcasting with latent diffusion models,” inAdvances in Neural Information Processing Systems, A. Oh, T. Naumann, A. Globerson, K. Saenko, M. Hardt, and S. Levine, Eds., vol. 36. Curran Associates, Inc., 2023, pp. 78 621–78 656. [Online]. Available...

  16. [16]

    Diffcast: A unified framework via residual diffusion for precipitation nowcasting,

    D. Yu, X. Li, Y . Ye, B. Zhang, C. Luo, K. Dai, R. Wang, and X. Chen, “Diffcast: A unified framework via residual diffusion for precipitation nowcasting,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2024, pp. 27 758–27 767

  17. [17]

    Cascast: Skillful high-resolution precipitation nowcasting via cascaded modelling,

    J. Gong, L. Bai, P. Ye, W. Xu, N. Liu, J. Dai, X. Yang, and W. Ouyang, “Cascast: Skillful high-resolution precipitation nowcasting via cascaded modelling,” 2024. [Online]. Available: https://arxiv.org/abs/2402.04290

  18. [18]

    arXiv preprint arXiv:2511.09731 , year=

    B. P. Ribeiro and J. F. Pucer, “Flowcast: Advancing precipitation nowcasting with conditional flow matching,” 2026. [Online]. Available: https://arxiv.org/abs/2511.09731

  19. [19]

    Flow Matching for Generative Modeling

    Y . Lipman, R. T. Q. Chen, H. Ben-Hamu, M. Nickel, and M. Le, “Flow matching for generative modeling,” 2023. [Online]. Available: https://arxiv.org/abs/2210.02747

  20. [20]

    Improving and generalizing flow-based generative models with minibatch optimal transport

    A. Tong, K. Fatras, N. Malkin, G. Huguet, Y . Zhang, J. Rector-Brooks, G. Wolf, and Y . Bengio, “Improving and generalizing flow-based generative models with minibatch optimal transport,” 2024. [Online]. Available: https://arxiv.org/abs/2302.00482

  21. [21]

    Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

    X. Liu, C. Gong, and Q. Liu, “Flow straight and fast: Learning to generate and transfer data with rectified flow,” 2022. [Online]. Available: https://arxiv.org/abs/2209.03003

  22. [22]

    One-step latent-free image generation with pixel mean flows,

    Y . Lu, S. Lu, Q. Sun, H. Zhao, Z. Jiang, X. Wang, T. Li, Z. Geng, and K. He, “One-step latent-free image generation with pixel mean flows,”

  23. [23]

    One-step Latent-free Image Generation with Pixel Mean Flows

    [Online]. Available: https://arxiv.org/abs/2601.22158

  24. [24]

    Mean Flows for One-step Generative Modeling

    Z. Geng, M. Deng, X. Bai, J. Z. Kolter, and K. He, “Mean flows for one-step generative modeling,” 2025. [Online]. Available: https://arxiv.org/abs/2505.13447

  25. [25]

    Extracting and composing robust features with denoising autoencoders ,

    P. Vincent, H. Larochelle, Y . Bengio, and P.-A. Manzagol, “Extracting and composing robust features with denoising autoencoders,” in Proceedings of the 25th International Conference on Machine Learning, ser. ICML ’08. New York, NY , USA: Association for Computing Machinery, 2008, p. 1096â ˘A¸ S1103. [Online]. Available: https://doi.org/10.1145/1390156.1390294

  26. [26]

    KAN: Kolmogorov-Arnold Networks

    Z. Liu, Y . Wang, S. Vaidya, F. Ruehle, J. Halverson, M. SoljaÄ iÄ ˘G, T. Y . Hou, and M. Tegmark, “Kan: Kolmogorov-arnold networks,” 2025. [Online]. Available: https://arxiv.org/abs/2404.19756

  27. [27]

    Convolutional kolmogorov-arnold networks,

    A. D. Bodner, A. S. Tepsich, J. N. Spolski, and S. Pourteau, “Convolutional kolmogorov-arnold networks,” 2025. [Online]. Available: https://arxiv.org/abs/2406.13155

  28. [28]

    Kolmogorov-arnold convolutions: Design principles and empirical studies,

    I. Drokin, “Kolmogorov-arnold convolutions: Design principles and empirical studies,” 2024. [Online]. Available: https://arxiv.org/abs/2407. 01092

  29. [29]

    A survey on kolmogorov-arnold network,

    S. Somvanshi, S. A. Javed, M. M. Islam, D. Pandit, and S. Das, “A survey on kolmogorov-arnold network,”ACM Comput. Surv., vol. 58, no. 2, Sep. 2025. [Online]. Available: https://doi.org/10.1145/3743128

  30. [30]

    Swinkan: A dual- polarization radar extrapolation model based on swin transformer and convolutional kolmogorovâ ˘A¸ Sarnold networks,

    J. Wang, Y . Zhang, L. Zhu, Q. Liu, and L. Wu, “Swinkan: A dual- polarization radar extrapolation model based on swin transformer and convolutional kolmogorovâ ˘A¸ Sarnold networks,”IEEE Transactions on Geoscience and Remote Sensing, vol. 63, pp. 1–18, 2025

  31. [31]

    Enhanced radar echo extrapolation for precipitation nowcasting quality using the convolutional kolmogorovâ ˘A¸ Sarnold networks,

    Q. Cheng, Y . Su, Y . He, Y . Wu, F. Liu, Y . Rao, Y . Chao, K. Wang, Z. Liu, J. Liu, and Y . Chen, “Enhanced radar echo extrapolation for precipitation nowcasting quality using the convolutional kolmogorovâ ˘A¸ Sarnold networks,”Journal of Hydrology, vol. 663, p. 134134, 2025. [Online]. Available: https://www.sciencedirect.com/ science/article/pii/S00221...

  32. [32]

    Convolutional lstm network: A machine learning approach for precipitation nowcasting,

    X. Shi, Z. Chen, H. Wang, D.-Y . Yeung, W. kin Wong, and W. chun Woo, “Convolutional lstm network: A machine learning approach for precipitation nowcasting,” 2015. [Online]. Available: https://arxiv.org/abs/1506.04214

  33. [33]

    Satellite image prediction relying on gan and lstm neural networks,

    Z. Xu, J. Du, J. Wang, C. Jiang, and Y . Ren, “Satellite image prediction relying on gan and lstm neural networks,” inICC 2019 - 2019 IEEE International Conference on Communications (ICC), 2019, pp. 1–6

  34. [34]

    Skilful precipitation nowcasting using deep generative models of radar,

    S. Ravuri, K. Lenc, M. Willson, D. Kangin, R. Lam, P. Mirowski, M. Fitzsimons, M. Athanassiadou, S. Kashem, S. Madgeet al., “Skilful precipitation nowcasting using deep generative models of radar,”Nature, vol. 597, no. 7878, pp. 672–677, 2021

  35. [35]

    Skilful nowcasting of extreme precipitation with nowcastnet,

    Y . Zhang, M. Long, K. Chen, L. Xing, R. Jin, M. I. Jordan, and J. Wang, “Skilful nowcasting of extreme precipitation with nowcastnet,”Nature, vol. 619, no. 7970, pp. 526–532, 2023

  36. [36]

    Extreme precipitation nowcasting using transformer-based generative models,

    C. Meo, A. Roy, M. LicÄ ˇC, J. Yin, Z. B. Che, Y . Wang, R. Imhoff, R. Uijlenhoet, and J. Dauwels, “Extreme precipitation nowcasting using transformer-based generative models,” 2024. [Online]. Available: https://arxiv.org/abs/2403.03929

  37. [37]

    arXiv preprint arXiv:2304.12891 , year=

    J. Leinonen, U. Hamann, D. Nerini, U. Germann, and G. Franch, “Latent diffusion models for generative precipitation nowcasting with accurate uncertainty quantification,” 2023. [Online]. Available: https://arxiv.org/abs/2304.12891

  38. [38]

    Auto-Encoding Variational Bayes

    D. P. Kingma and M. Welling, “Auto-encoding variational bayes,” 2022. [Online]. Available: https://arxiv.org/abs/1312.6114

  39. [39]

    Neural Discrete Representation Learning

    A. van den Oord, O. Vinyals, and K. Kavukcuoglu, “Neural discrete representation learning,” 2018. [Online]. Available: https: //arxiv.org/abs/1711.00937 JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 10

  40. [40]

    Scrd: A spatiotemporal cues-guided residual diffusion model for precipitation nowcasting,

    Y . Li, D. Niu, Y . Li, Z. Zang, H. Wang, and M. Jiang, “Scrd: A spatiotemporal cues-guided residual diffusion model for precipitation nowcasting,”IEEE Geoscience and Remote Sensing Letters, vol. 21, pp. 1–5, 2024

  41. [41]

    Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,

    M. Raissi, P. Perdikaris, and G. Karniadakis, “Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations,”Journal of Computational Physics, vol. 378, pp. 686–707, 2019. [Online]. Available: https://www.sciencedirect.com/science/article/pii/S0021999118307125

  42. [42]

    Disentangling physical dynamics from unknown factors for unsupervised video prediction,

    V . L. Guen and N. Thome, “Disentangling physical dynamics from unknown factors for unsupervised video prediction,” inProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 2020

  43. [43]

    Three-dimensional radar echo extrapolation using a physics-constrained deep learning model,

    L. Geng, J. Min, H. Geng, and X. Zhuang, “Three-dimensional radar echo extrapolation using a physics-constrained deep learning model,”Remote Sensing, vol. 18, no. 2, 2026. [Online]. Available: https://www.mdpi.com/2072-4292/18/2/206

  44. [44]

    Meteonet, an open reference weather dataset,

    G. Larvor, L. Berthomier, V . Chabot, B. Le Pape, B. Pradel, and L. Perez, “Meteonet, an open reference weather dataset,” 2020. JOURNAL OF LATEX CLASS FILES, VOL. 14, NO. 8, AUGUST 2021 11 APPENDIX This document provides supplementary material for the main manuscript. The contents are organized as follows:Appendix A elaborates on the extended formulation ...

  45. [45]

    Geometric Interpretation and the Role of Auxiliary Time Step r:To avoid the optimization collapse caused by directly regressing chaotic vector fields in high-dimensional radar pixel spaces, PixelFlowCast decouples the prediction and optimization spaces through an auxiliary target time step r∈[0, t]. During training, rather than directly regressing the exa...

  46. [46]

    Extension to Multi-Step Sampling:The original Pixel Mean Flows (PMF) framework [20] was primarily introduced and evaluated for one-step image generation (1-NFE). However, modeling the highly chaotic and complex spatiotemporal dynamics of extreme precipitation systems in a single step often leads to the underestimation of high-threshold meteorological deta...

  47. [47]

    Noise-Free Extraction at the Final Step:This interval- based multi-step formulation provides the geometric basis for the final extraction strategy detailed in Section 3.4 of the main text. While iteratively updating Zcurr constructs the evolutionary sequence, integrating continuous velocity fields across discrete steps can still accumulate residual numeri...

  48. [48]

    From the original 49-frame SEVIR events, we extract continuous sequences of length 48 (12 context frames and 36 prediction frames) using a sliding window with a stride of 1

    Additional Dataset and Preprocessing Details:To ensure full reproducibility, we detail the exact data preprocessing and evaluation pipelines. From the original 49-frame SEVIR events, we extract continuous sequences of length 48 (12 context frames and 36 prediction frames) using a sliding window with a stride of 1. The raw SEVIR VIL data, stored as 16-bit ...

  49. [49]

    We summarize the specific hyperparameter configurations of our instantiated SimVP in Table S-3

    Deterministic Backbone: SimVP:In the first stage of the PixelFlowCast framework, we employ SimVP [8] as the deterministic base predictor to capture the macroscopic spa- tiotemporal evolution trend, denoted as ˆXcoarse. We summarize the specific hyperparameter configurations of our instantiated SimVP in Table S-3. The model takes the past Tin =12 frames as...

  50. [50]

    As formulated in Section 3.3 of the main text, KANCondNet strategically replaces traditional fixed activations with learnable B-splines

    Condition Encoder: KANCondNet:To effectively guide the PMF predictor, KANCondNet is designed to extract precise multi-scale spatiotemporal conditions Hc from the concatenated past context Xpast and the coarse baseline ˆXcoarse. As formulated in Section 3.3 of the main text, KANCondNet strategically replaces traditional fixed activations with learnable B-s...

  51. [51]

    In our implementation, Fθ is instantiated based on the Global- Temporal U-Net (GTUnet) architecture originally proposed in DiffCast [14]

    Pixel Mean Flows Predictor: Modified GTUnet:In Section 3.4 of the main text, the core of our PMF pre- dictor is abstracted as the x-prediction model Fθ. In our implementation, Fθ is instantiated based on the Global- Temporal U-Net (GTUnet) architecture originally proposed in DiffCast [14]. GTUnet exhibits strong capabilities in modeling complex meteorolog...

  52. [52]

    To accommodate the substantial computational footprint inherent in long-term spatiotemporal sequence forecasting, we employ Bfloat16 Mixed Precision (bf16-mixed) during training

    Training Configurations and Optimization Details:The proposed PixelFlowCast framework is implemented using PyTorch and trained with the Distributed Data Parallel (DDP) strategy across multiple NVIDIA GeForce RTX 3090 (24GB) GPUs. To accommodate the substantial computational footprint inherent in long-term spatiotemporal sequence forecasting, we employ Bfl...

  53. [53]

    Consequently, a detailed discussion on computational speed is omitted from the main manuscript

    Ablation Study on Inference Speed:Compared to tra- ditional diffusion paradigms, continuous flow-based models intrinsically benefit from significantly accelerated inference. Consequently, a detailed discussion on computational speed is omitted from the main manuscript. To comprehensively supplement the architectural evaluations, this section explicitly pr...

  54. [54]

    Ablation Study on Inference Steps:As detailed in Sec- tion A2, while the original PMF framework is conceptualized for one-step generation, modeling the highly chaotic dynamics of extreme precipitation systems necessitates a multi-step sampling strategy. To determine the optimal configuration, we conduct a comprehensive ablation study on the number of infe...

  55. [55]

    Ablation Study on Noise-Free Extraction Strategy:In Section A3, we proposed a Noise-Free Extraction strategy for the final sampling step. Rather than performing a final numerical integration to obtain the accumulated state Zcurr, our framework directly outputs the terminal virtual intercept ˆXpred to bypass residual numerical noise. To empirically validat...

  56. [56]

    Overall Average

    Dataset.:The MeteoNet dataset is provided by the French national meteorological service (MÃl’tÃl’o-France). This dataset captures the evolution of radar echoes over the French territory, featuring a high spatial resolution of 0.01○ (approximately 1 km) on an original grid of 565×784 pixels, and a temporal resolution of 5 minutes. Following a consistent ex...

  57. [57]

    In the construction of the time series, each sample is a 48-frame sequence extracted from the continuous radar observations

    Data Preprocessing.:Consistent with the preprocessing strategy applied to the SEVIR dataset, we addressed computing resource limitations by cropping and downsampling the spatial dimensions of all original MeteoNet radar frames to 128×128 pixels during the preprocessing stage. In the construction of the time series, each sample is a 48-frame sequence extra...

  58. [58]

    Quantitative Results:The quantitative and qualitative results on the MeteoNet dataset are summarized in Tables S-10 and S-11, as well as Figures S-9, S-10, S-11, S-12. Overall, the empirical performance exhibits a highly consistent trend with those observed on the SEVIR dataset, further validating the effectiveness and generalizability of PixelFlowCast. F...