Recognition: unknown
Frequency-Aware Flow Matching for High-Quality Image Generation
Pith reviewed 2026-05-10 10:54 UTC · model grok-4.3
The pith
Flow matching generates sharper images when low- and high-frequency components receive separate time-dependent weighting and dedicated branches.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Flow matching models learn to reverse a corruption process that adds Gaussian noise, but the noise affects frequency components unevenly: low-frequency content emerges early in the reverse process and high-frequency content only later. FreqFlow adds frequency-aware conditioning through time-dependent adaptive weighting and a two-branch architecture: a frequency branch that processes low- and high-frequency components separately to capture global structure and refine details, and a spatial branch that synthesizes images in the latent domain under the guidance of the frequency branch. The claim is that this conditioning lets both large-scale coherence and fine-grained detail be modeled effectively at each step of the reverse process.
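What the time-dependent weighting could look like in code, as a minimal sketch: the hard radial FFT split, the linear interpolant, the linear weighting schedule, and the `model(x, t)` velocity-network signature below are all illustrative assumptions, not the paper's recipe.

```python
# Illustrative time-dependent, frequency-weighted flow-matching loss (a sketch,
# not FreqFlow's actual objective).
import torch
import torch.fft


def freq_split(x: torch.Tensor, cutoff: float = 0.25):
    """Split a (B, C, H, W) tensor into low/high-frequency parts with a radial FFT mask."""
    _, _, H, W = x.shape
    fy = torch.fft.fftfreq(H, device=x.device).view(H, 1)
    fx = torch.fft.fftfreq(W, device=x.device).view(1, W)
    low_mask = (torch.sqrt(fy**2 + fx**2) <= cutoff).to(x.dtype)
    low = torch.fft.ifft2(torch.fft.fft2(x) * low_mask).real
    return low, x - low  # low + high reconstructs x by construction


def freq_weighted_fm_loss(model, data: torch.Tensor) -> torch.Tensor:
    t = torch.rand(data.shape[0], 1, 1, 1, device=data.device)  # one time per sample
    noise = torch.randn_like(data)
    x_t = (1 - t) * noise + t * data   # linear interpolant: t=0 noise, t=1 data
    target = data - noise              # velocity target for this path
    err_low, err_high = freq_split(model(x_t, t.flatten()) - target)
    # Hypothetical schedule: weight the low-frequency error more at small t, when
    # global structure is being decided, and the high-frequency error more at
    # large t, when details are being refined. The paper's schedule may differ.
    return ((1 - t) * err_low**2 + t * err_high**2).mean()
```

The specific schedule matters less than where the time dependence enters: any pair of weights that trades low- against high-frequency error over t instantiates the same idea.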
What carries the argument
Time-dependent adaptive weighting applied to a two-branch frequency-spatial architecture that separates explicit low- and high-frequency processing from latent-domain spatial synthesis.
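Read as a network skeleton, that wiring might look roughly like the following; the block types, widths, and fusion by concatenation are assumptions made for the example, not the paper's design.

```python
# Skeleton of a two-branch frequency-spatial network (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrequencyBranch(nn.Module):
    """Processes low- and high-frequency components with separate sub-networks."""
    def __init__(self, channels: int = 4, width: int = 64):
        super().__init__()
        self.low_net = nn.Sequential(nn.Conv2d(channels, width, 3, padding=1), nn.GELU())
        self.high_net = nn.Sequential(nn.Conv2d(channels, width, 3, padding=1), nn.GELU())

    def forward(self, low: torch.Tensor, high: torch.Tensor) -> torch.Tensor:
        return self.low_net(low) + self.high_net(high)  # fused frequency features


class SpatialBranch(nn.Module):
    """Synthesizes in the latent domain, guided by the frequency branch's output."""
    def __init__(self, channels: int = 4, width: int = 64):
        super().__init__()
        self.body = nn.Conv2d(channels + width, width, 3, padding=1)
        self.head = nn.Conv2d(width, channels, 3, padding=1)

    def forward(self, x_t: torch.Tensor, freq_feat: torch.Tensor) -> torch.Tensor:
        h = F.gelu(self.body(torch.cat([x_t, freq_feat], dim=1)))
        return self.head(h)  # predicted velocity in latent space


x_t = torch.randn(2, 4, 32, 32)
# Placeholder split; a real low/high decomposition (e.g. an FFT split) goes here.
low, high = x_t, torch.zeros_like(x_t)
v = SpatialBranch()(x_t, FrequencyBranch()(low, high))
assert v.shape == x_t.shape  # the spatial branch returns a latent-shaped prediction
```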
If this is right
- Low-frequency conditioning reinforces global structure in the generated images.
- High-frequency conditioning enhances texture fidelity and detail sharpness.
- Both large-scale coherence and fine-grained details are modeled more effectively than in standard flow matching.
- The approach yields state-of-the-art FID performance on class-conditional ImageNet-256 generation.
Where Pith is reading between the lines
- The same frequency imbalance likely appears in other noise-based generative models, so the weighting and branching idea may transfer beyond flow matching.
- Explicit timing of frequency emphasis could allow the generation process to reach acceptable quality in fewer steps.
- The method may show larger gains on higher-resolution images where fine-detail fidelity matters more.
Load-bearing premise
Separating frequency processing into its own branch and weighting it adaptively over time will improve both global structure and fine details without causing branch interference or training instability.
What would settle it
Training the same two-branch architecture without the time-dependent frequency weighting and observing whether its FID on class-conditional ImageNet-256 falls back to the baseline flow-matching level or retains the improvement.
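Concretely, the control condition keeps everything fixed and only neutralizes the time dependence, so any FID gap on ImageNet-256 can be pinned on the weighting itself. The sketch below reuses the illustrative `freq_split` and `model(x, t)` conventions from earlier on this page; FID is computed by the usual external evaluation pipeline, not shown here.

```python
# Control objective: identical to the frequency-weighted loss except that the
# time-dependent weights are replaced by constants (illustrative sketch).
import torch


def uniform_weight_fm_loss(model, data: torch.Tensor, freq_split) -> torch.Tensor:
    t = torch.rand(data.shape[0], 1, 1, 1, device=data.device)
    noise = torch.randn_like(data)
    x_t = (1 - t) * noise + t * data
    target = data - noise
    err_low, err_high = freq_split(model(x_t, t.flatten()) - target)
    return (0.5 * err_low**2 + 0.5 * err_high**2).mean()  # constant, t-independent weights
```

If this variant matches the weighted model's FID, the weighting is not load-bearing; if the gap tracks the weighting, the premise above holds.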
original abstract
Flow matching models have emerged as a powerful framework for realistic image generation by learning to reverse a corruption process that progressively adds Gaussian noise. However, because noise is injected in the latent domain, its impact on different frequency components is non-uniform. As a result, during inference, flow matching models tend to generate low-frequency components (global structure) in the early stages, while high-frequency components (fine details) emerge only later in the reverse process. Building on this insight, we propose Frequency-Aware Flow Matching (FreqFlow), a novel approach that explicitly incorporates frequency-aware conditioning into the flow matching framework via time-dependent adaptive weighting. We introduce a two-branch architecture: (1) a frequency branch that separately processes low- and high-frequency components to capture global structure and refine textures and edges, and (2) a spatial branch that synthesizes images in the latent domain, guided by the frequency branch's output. By explicitly integrating frequency information into the generation process, FreqFlow ensures that both large-scale coherence and fine-grained details are effectively modeled: low-frequency conditioning reinforces global structure, while high-frequency conditioning enhances texture fidelity and detail sharpness. On the class-conditional ImageNet-256 generation benchmark, our method achieves state-of-the-art performance with an FID of 1.38, surpassing the prior diffusion model DiT and flow matching model SiT by 0.79 and 0.58 FID, respectively. Code is available at https://github.com/OliverRensu/FreqFlow.
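The abstract's premise, that latent-domain Gaussian noise affects frequency bands unevenly along the corruption path, can be checked numerically in a few lines. The sketch below uses a synthetic sample with a 1/f amplitude spectrum as a stand-in for real latents and a hard band split at a 0.25 cutoff; both choices are assumptions for the illustration, not the paper's setup.

```python
# Band-wise signal-to-noise ratio along the interpolant x_t = (1 - t) * noise + t * data.
import torch
import torch.fft


def band_power(x: torch.Tensor, low: bool, cutoff: float = 0.25) -> torch.Tensor:
    """Mean spectral power of x inside (low=True) or outside (low=False) the cutoff radius."""
    H, W = x.shape
    fy = torch.fft.fftfreq(H).view(H, 1)
    fx = torch.fft.fftfreq(W).view(1, W)
    mask = torch.sqrt(fy**2 + fx**2) <= cutoff
    if not low:
        mask = ~mask
    return (torch.fft.fft2(x).abs() ** 2)[mask].mean()


torch.manual_seed(0)
H = W = 64
fy = torch.fft.fftfreq(H).view(H, 1)
fx = torch.fft.fftfreq(W).view(1, W)
radius = torch.sqrt(fy**2 + fx**2).clamp(min=1.0 / H)
# Synthetic "data" with an approximate 1/f amplitude spectrum, normalized to unit std.
data = torch.fft.ifft2(torch.randn(H, W, dtype=torch.complex64) / radius).real
data = data / data.std()
noise = torch.randn(H, W)

for t in (0.1, 0.3, 0.5, 0.7, 0.9):
    sig, noi = t * data, (1 - t) * noise
    snr_low = (band_power(sig, low=True) / band_power(noi, low=True)).item()
    snr_high = (band_power(sig, low=False) / band_power(noi, low=False)).item()
    print(f"t={t:.1f}  low-band SNR={snr_low:8.2f}  high-band SNR={snr_high:8.2f}")
```

In this toy setting the low band crosses unit SNR at a much smaller t than the high band, which is exactly the ordering the abstract attributes to flow matching: global structure is resolvable early in the reverse process, fine detail only later.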
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Frequency-Aware Flow Matching (FreqFlow) as an extension to standard flow matching for image generation. It adds time-dependent adaptive weighting to incorporate frequency-aware conditioning and proposes a two-branch architecture consisting of a frequency branch that processes low- and high-frequency components separately and a spatial branch that synthesizes the image in the latent domain. The central empirical claim is state-of-the-art performance on class-conditional ImageNet-256 generation, with an FID of 1.38 that improves over DiT by 0.79 and over SiT by 0.58.
Significance. If the reported FID improvement is reproducible and attributable to the proposed components, the work would constitute a useful architectural refinement for flow-matching models by explicitly handling the non-uniform frequency impact of the corruption process. The public release of code is a clear strength that supports verification and follow-on research.
major comments (1)
- [Results / Experiments] The central claim that the 1.38 FID is attributable to the time-dependent adaptive weighting and the two-branch architecture is not yet supported by evidence that isolates those components. The manuscript should include ablations that remove each addition in turn (e.g., the frequency branch or the adaptive weighting) and report the resulting FID on the same ImageNet-256 benchmark.
minor comments (2)
- [Abstract] The abstract states that 'low-frequency conditioning reinforces global structure, while high-frequency conditioning enhances texture fidelity' but does not reference a specific figure or equation that illustrates this separation; adding such a pointer would improve clarity.
- [Method] Notation for the frequency decomposition (low- vs. high-frequency components) and how it is applied consistently to both training targets and conditioning should be defined explicitly in the method section to avoid ambiguity during re-implementation.
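An explicit definition of the sort the comment asks for could be as simple as one shared mask reused for every low/high quantity, plus a check that the split is lossless. In the sketch below the cutoff value and the radial mask shape are assumptions for the example, not the paper's notation.

```python
# One mask, defined once, reused for both training targets and conditioning (illustrative).
import torch
import torch.fft


def make_low_mask(H: int, W: int, cutoff: float = 0.25) -> torch.Tensor:
    fy = torch.fft.fftfreq(H).view(H, 1)
    fx = torch.fft.fftfreq(W).view(1, W)
    return (torch.sqrt(fy**2 + fx**2) <= cutoff).float()


def split_with(mask: torch.Tensor, x: torch.Tensor):
    low = torch.fft.ifft2(torch.fft.fft2(x) * mask).real
    return low, x - low


mask = make_low_mask(32, 32)            # single shared definition of "low" vs "high"
x = torch.randn(4, 4, 32, 32)           # dummy latent batch (B, C, H, W)
low, high = split_with(mask, x)
assert torch.allclose(low + high, x, atol=1e-5)  # the decomposition is lossless
# The same `mask` would then parameterize every low/high quantity in the method:
# the frequency-branch targets as well as the conditioning fed to the spatial branch.
```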
Simulated Author's Rebuttal
We thank the referee for the constructive feedback and the recommendation for minor revision. We address the major comment point-by-point below.
point-by-point responses
Referee: [Results / Experiments] The central claim that the 1.38 FID is attributable to the time-dependent adaptive weighting and the two-branch architecture is not yet supported by evidence that isolates those components. The manuscript should include ablations that remove each addition in turn (e.g., the frequency branch or the adaptive weighting) and report the resulting FID on the same ImageNet-256 benchmark.
Authors: We agree that the manuscript would benefit from explicit ablations isolating the contributions of the time-dependent adaptive weighting and the two-branch architecture. In the revised version, we will add these experiments on the ImageNet-256 benchmark, including a baseline without the frequency branch and a variant without the adaptive weighting, and report the corresponding FID scores to quantify the performance degradation.
revision: yes
Circularity Check
No significant circularity; empirical architectural proposal
full rationale
The paper presents FreqFlow as an architectural extension to existing flow-matching frameworks, adding time-dependent adaptive weighting and a two-branch frequency-spatial network. Its central claim is an empirical FID result (1.38) on the standard class-conditional ImageNet-256 benchmark, with code released for direct reproduction. No derivation chain, first-principles prediction, or fitted quantity is shown to reduce by construction to its own inputs; the method is described as an explicit addition whose components can be implemented consistently with flow-matching ODEs. Self-citations, if present, are not load-bearing for the performance claim, which rests on external benchmark comparison rather than internal redefinition.
Axiom & Free-Parameter Ledger
free parameters (1)
- time-dependent adaptive weighting
axioms (1)
- domain assumption: Noise injection in the latent domain has a non-uniform impact on different frequency components