arxiv: 2604.27147 · v3 · pith:3GM4RFGAnew · submitted 2026-04-29 · 💻 cs.LG · cs.AI

How to Guide Your Flow: Few-Step Alignment via Flow Map Reward Guidance

Pith reviewed 2026-07-01 08:15 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords flow matching, reward guidance, optimal control, few-step sampling, text-to-image generation, generative modeling, alignment

The pith

Reformulating guidance as a deterministic optimal control problem shows that the flow map enables single-trajectory reward guidance with as few as three network evaluations.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper casts guidance as a deterministic optimal control problem and derives a hierarchy of algorithms from it. Within this hierarchy the flow map appears directly in the optimal solution. The authors build Flow Map Reward Guidance (FMRG) on this fact: a training-free method that uses one trajectory and the flow map for both integration and reward steering. At text-to-image scale the method is reported to match or exceed existing baselines on inverse problems and reward tasks while using as few as three network function evaluations (NFEs). A reader would care because it could remove the need for the expensive multi-particle or many-step schemes that currently limit guided generation.

Core claim

By reformulating guidance as a deterministic optimal control problem, the authors derive a hierarchy of algorithms in which the flow map arises naturally in the optimal solution. They propose Flow Map Reward Guidance (FMRG), a training-free single-trajectory framework that uses the flow map to both integrate and guide the flow, matching or surpassing baselines across inverse problems and reward-guided generation with as few as 3 NFEs at text-to-image scale.
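The alternation described above (a reward-gradient step followed by a flow map step, on a single trajectory) can be illustrated on a toy problem. The following is a minimal sketch, not the authors' implementation: it assumes a hypothetical linear base flow dx/dt = A·x whose flow map is known in closed form, a quadratic reward, and an illustrative step size. The names `flow_map`, `guided_sample`, `A`, and `step_size` are all assumptions made for this example.

```python
import numpy as np

A = -0.5  # hypothetical drift coefficient of the toy base flow dx/dt = A * x


def flow_map(x, s, t):
    """Exact flow map X_{s,t}(x) of the toy linear ODE (closed form)."""
    return np.exp(A * (t - s)) * x


def reward(x, target):
    """Toy quadratic reward, maximal at x == target."""
    return -0.5 * np.sum((x - target) ** 2)


def reward_grad(x, target):
    """Gradient of the toy reward with respect to x."""
    return -(x - target)


def guided_sample(x0, target, n_steps=3, step_size=0.5):
    """Single-trajectory guidance: gradient step, then flow map step."""
    ts = np.linspace(0.0, 1.0, n_steps + 1)
    x = np.array(x0, dtype=float)
    for s, t in zip(ts[:-1], ts[1:]):
        # Look ahead to the terminal point with the flow map.
        x1 = flow_map(x, s, 1.0)
        # Chain rule: grad_x r(X_{s,1}(x)) = J^T grad r(x1).
        # For this toy flow the Jacobian J is exp(A * (1 - s)) times identity.
        jac_scale = np.exp(A * (1.0 - s))
        x = x + step_size * jac_scale * reward_grad(x1, target)
        # Advance the base dynamics exactly from time s to time t.
        x = flow_map(x, s, t)
    return x


x0 = np.array([2.0, -1.0])
target = np.array([1.0, 1.0])
guided = guided_sample(x0, target)
unguided = flow_map(x0, 0.0, 1.0)
print("unguided reward:", reward(unguided, target))
print("guided reward:  ", reward(guided, target))
```

In a real model the flow map and its Jacobian-vector products would come from a learned network, so each loop iteration would cost network evaluations; the toy only shows the control flow of the idea.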

What carries the argument

Flow Map Reward Guidance (FMRG), a single-trajectory method that applies the flow map for both flow integration and reward guidance.

Load-bearing premise

The flow map obtained from the optimal control formulation can be used directly for guidance without approximations that lose accuracy at very low step counts.
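To see why step count matters for this premise, it can help to compare an exact flow map with few-step Euler integration on an ODE where both are available. The sketch below is an illustration under stated assumptions, not a reproduction of anything in the paper: the linear ODE dx/dt = A·x, the coefficient `A`, and the helper names are all hypothetical. It also does not model the approximation error of a learned (distilled) flow map, which is the quantity the premise is actually about.

```python
import numpy as np

A = -2.0  # hypothetical coefficient of the toy ODE dx/dt = A * x


def exact_flow_map(x, s, t):
    """Closed-form solution of the toy ODE from time s to time t."""
    return np.exp(A * (t - s)) * x


def euler_integrate(x, s, t, n_steps):
    """Forward Euler integration of the toy ODE with n_steps equal steps."""
    h = (t - s) / n_steps
    for _ in range(n_steps):
        x = x + h * A * x
    return x


def terminal_error(n_steps, x0=1.0):
    """Absolute error of Euler at t = 1 relative to the exact flow map."""
    return abs(euler_integrate(x0, 0.0, 1.0, n_steps) - exact_flow_map(x0, 0.0, 1.0))


for n in (1, 3, 30):
    print(f"{n:>2} Euler steps: terminal error = {terminal_error(n):.4f}")
```

The exact map has no discretization error at any step count, whereas the Euler error shrinks only as the number of steps grows; whether a learned flow map retains this advantage at three evaluations is the empirical question the premise raises.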

What would settle it

Run FMRG and current multi-particle baselines on the same text-to-image reward benchmark with exactly three steps and check whether FMRG reward scores fall below the baselines.

Figures

Figures reproduced from arXiv: 2604.27147 by Jerry Y. Huang, Justin Lin, Kartik Nair, Nicholas M. Boffi, Sheel Shah.

Figure 1: We introduce Flow Map Reward Guidance (FMRG), a training-free, single-trajectory framework for inference-time alignment of flow-based models. FMRG achieves state-of-the-art performance across diverse rewards (including aesthetic enhancement, compositionality, latent-space inverse problems, style transfer, and VLM rewards) with up to a 70× speedup over prior work.
Figure 2: Overview. FMRG guides a single generative trajectory by alternating flow map steps, which integrate the base dynamics exactly, with gradient steps that steer toward high reward. This optimization-centric perspective contrasts with methods that explicitly target sampling the exponential reward tilt $\tilde{\rho} \propto e^{r} \rho$, which typically require many particles with resampling (e.g., SMC) and are often based on diffusio…
Figure 3: Hierarchy of approximations. The exact optimal control requires the controlled flow map $X^{u^*}_{t,1}$. Our approach leverages the uncontrolled flow map $X_{t,1}$, while DPS further approximates $X_{t,1}$ with a single Euler step. The proof is given in Appendix D.1. For the linear interpolant, the posterior mean $\hat{x}_1$ coincides with a single Euler step of the probability flow, while the exact flow map $X_{t,1}$ corresponds t…
Figure 5: Terminal distribution. Greedy guidance produces a narrower distribution than reward tilting or the distribution produced by exactly solving the optimal control problem (5). Early stopping can be used to effectively mitigate this mode collapse, and when applied at $t_{\mathrm{stop}} = 0.3$ recovers variance comparable to the reward tilt. The proof is given in Appendix C.5. Inspecting (15), greedy guidance achieves the hi…
Figure 8: Gradient options. (Left) The flow map Jacobian $\nabla X_{t,1}(x)^{\top}$ projects the reward gradient $\nabla r$ onto $T_{x_1}\mathcal{M}$, keeping the trajectory on-manifold (blue, FMRG-J), while the Euclidean gradient follows $\nabla r$ off-manifold (purple, FMRG-E). (Right) FMRG-E achieves higher reward (r++) but produces artifacts because it can leave the data manifold, often leading to reward hacking; FMRG-J stays on-manifold and more robustly pre…
Figure 10: Latent-space inverse problems. (Left) FMRG obtains SoTA performance on super-resolution, motion deblurring, and inpainting at remarkably low NFEs. (Right) LPIPS vs. FID trade-off on AFHQ. FMRG-E achieves notably better performance in the low NFE regime. Full results in Appendix E. model rewards for text-to-image generation. For all experiments, we use a flow map distilled via Lagrangian distillation [20] …
Figure 11: Style guidance: hierarchy of methods. Given a style reference (left), we compare unguided FLUX, Jacobian-based methods (FMRG-J, DPS), and Euclidean-based methods (FMRG-E, FlowChef). FMRG-J captures the target style most faithfully while preserving semantic content. DPS fails to incorporate the style, while FlowChef produces artifacts, consistent with our derived approximation hierarchy (
Figure 12: Reward-guided aesthetic enhancement. FMRG produces visually compelling aesthetic enhancements with as few as 6 NFEs. Additional comparisons in Appendix E.5. 5.3 Reward-guided generation We evaluate FMRG on human preference rewards for text-to-image generation. Following Eyring et al. [41], we use a linear combination of human preference and text-image alignment reward models, including ImageReward [54], H…
Figure 14: GenEval accuracy vs. NFE. FMRG-J dominates the Pareto frontier across all NFE budgets, matching FMTT (0.77) at NFE 20 with a 70× reduction in compute. models beyond the human preference ensembles used for GenEval. 5.5 Analysis of design choices We discuss two key design choices whose empirical behavior is consistent with our theoretical analysis. Full ablations are provided in Appendices E.3 and E.5. Earl…
Figure 15: VLM reward guidance. Unguided FLUX generations (top) fail to follow complex compositional prompts. FMRG (bottom) steers generation toward prompt-faithful outputs. far from the manifold. Empirically, for the $\ell_2$ reconstruction loss, whose optima lie close to the data manifold, the Euclidean gradient already produces approximately on-manifold updates without requiring the Jacobian projection; accordingly, FM…
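The distinction the captions draw between the Jacobian-based direction (FMRG-J) and the Euclidean direction (FMRG-E) can be made concrete with a toy map. The sketch below is an assumption-laden illustration, not the paper's setup: the "flow map" is a hypothetical radial projection onto the unit circle (standing in for a map whose outputs lie on a data manifold), the reward is a quadratic pull toward an off-manifold target, and all function names are invented for this example.

```python
import numpy as np


def toy_flow_map(x):
    """Hypothetical stand-in for X_{t,1}: radial projection onto the unit circle."""
    return x / np.linalg.norm(x)


def toy_flow_map_jacobian(x):
    """Jacobian of x -> x / ||x||, equal to (I - n n^T) / ||x|| with n = x / ||x||."""
    norm = np.linalg.norm(x)
    n = x / norm
    return (np.eye(x.size) - np.outer(n, n)) / norm


def reward_grad(x1, target):
    """Gradient of the toy reward -0.5 * ||x1 - target||^2."""
    return -(x1 - target)


def euclidean_direction(x, target):
    """Reward gradient evaluated at the look-ahead point, used as-is."""
    return reward_grad(toy_flow_map(x), target)


def jacobian_direction(x, target):
    """Reward gradient pulled back through the map: J^T grad r."""
    return toy_flow_map_jacobian(x).T @ reward_grad(toy_flow_map(x), target)


x = np.array([1.5, 0.0])
target = np.array([2.0, 2.0])  # deliberately off the unit circle
normal = toy_flow_map(x)       # unit normal to the circle at the look-ahead point
g_e = euclidean_direction(x, target)
g_j = jacobian_direction(x, target)
print("normal component, Euclidean:", g_e @ normal)
print("normal component, Jacobian: ", g_j @ normal)
```

In this toy the Jacobian-transposed direction has no component normal to the circle, while the Euclidean direction does. With a learned network one would typically obtain the J^T product through automatic differentiation (a vector-Jacobian product) rather than forming the Jacobian explicitly.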
Original abstract

In generative modeling, we often wish to produce samples that maximize a user-specified reward such as aesthetic quality or alignment with human preferences, a problem known as guidance. Despite their widespread use, existing guidance methods either require expensive multi-particle, many-step schemes or rely on poorly understood approximations. We reformulate guidance as a deterministic optimal control problem, yielding a hierarchy of algorithms that subsumes existing approaches at the coarsest level. We show that the flow map, an object of significant recent interest for its role in fast inference, arises naturally in the optimal solution. Based on this observation, we propose Flow Map Reward Guidance (FMRG): a training-free, single-trajectory framework that uses the flow map to both integrate and guide the flow. At text-to-image scale, FMRG matches or surpasses baselines across inverse problems and reward-guided generation with as few as 3 NFEs, giving at least an order-of-magnitude speedup in comparison to prior state of the art. Code is available at https://github.com/jrrhuang/fmrg.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The paper reformulates reward guidance for flow-based generative models as a deterministic optimal control problem. This yields a hierarchy of algorithms in which the flow map arises exactly in the optimal solution. The authors introduce Flow Map Reward Guidance (FMRG), a training-free single-trajectory method that uses the flow map for both integration and guidance, and report that it matches or exceeds baselines on inverse problems and reward-guided text-to-image generation using as few as 3 NFEs.

Significance. If the derivation holds, the work supplies a principled, approximation-free route to few-step guidance that subsumes prior methods at the coarsest level and directly exploits the flow map. The empirical claims of order-of-magnitude speedup at text-to-image scale, supported by appropriate baselines and ablations, would be a notable advance for efficient alignment of flow models.

minor comments (3)
  1. §3 (optimal control formulation): the transition from the continuous-time control problem to the discrete hierarchy of algorithms would benefit from an explicit statement of the discretization scheme and any truncation error bounds.
  2. Figure 4 and Table 2: the 3-NFE results are presented without error bars or multiple random seeds; adding these would strengthen the claim that performance is stable at very low NFEs.
  3. Related work section: the positioning relative to recent flow-map inference papers (e.g., those using the flow map for fast sampling) could be expanded to clarify the novelty of the guidance application.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive summary, recognition of the work's potential significance, and recommendation of minor revision. No major comments were listed in the report.

Circularity Check

0 steps flagged

No significant circularity identified

The paper's central derivation begins with an explicit reformulation of guidance as a deterministic optimal control problem. From this starting point the flow map is shown to arise directly in the optimal solution under the stated assumptions, producing a hierarchy of algorithms. This chain is independent of the downstream empirical metrics (reward values, inverse-problem performance) and does not reduce any claimed prediction to a fitted parameter or self-citation by construction. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the provided derivation. The approach therefore remains self-contained against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the reformulation of guidance as deterministic optimal control and the assumption that the flow map can be leveraged directly for guidance; no new free parameters or invented entities are introduced in the abstract.

axioms (1)
  • Domain assumption: Guidance in generative flows can be exactly reformulated as a deterministic optimal control problem whose solution involves the flow map.
    This is the foundational step stated in the abstract that enables the hierarchy of algorithms and the FMRG method.

pith-pipeline@v0.9.1-grok · 5748 in / 1360 out tokens · 39517 ms · 2026-07-01T08:15:13.213897+00:00 · methodology

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Diffusion-Based Posterior Sampling: A Feynman-Kac Analysis of Bias and Stability

    cs.LG · 2026-05 · unverdicted · novelty 8.0

    Diffusion posterior samplers produce biased outputs that can be expressed as an Ornstein-Uhlenbeck path expectation via a surrogate Gaussian path and Feynman-Kac representation, with STSL flattening the spatially vary...

  2. Flow Map Denoisers: Traversing the Distortion-Perception Plane for Inverse Problems

    cs.LG · 2026-06 · conditional · novelty 7.0

    Flow map denoisers use a lookahead parameter t to span the distortion-perception frontier, proven optimal for Gaussian targets and effective for natural images and inverse problems.

Reference graph

Works this paper leans on

69 extracted references · 47 canonical work pages · cited by 2 Pith papers · 19 internal anchors

  1. [1]

    Flow Matching for Generative Modeling

    Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. arXiv preprint arXiv:2210.02747, 2022. (pages 2, 3, and 10)

  2. [2]

    Stochastic Interpolants: A Unifying Framework for Flows and Diffusions

    Michael S Albergo, Nicholas M Boffi, and Eric Vanden-Eijnden. Stochastic interpolants: A unifying framework for flows and diffusions. arXiv preprint arXiv:2303.08797, 2023. (pages 2, 3, 10, 29, and 36)

  3. [3]

    Score-Based Generative Modeling through Stochastic Differential Equations

    Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. Score-Based Generative Modeling through Stochastic Differential Equations, February 2021. arXiv:2011.13456 [cs, stat]. (pages 2 and 10)

  4. [4]

    High-Resolution Image Synthesis with Latent Diffusion Models

    Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. High-Resolution Image Synthesis with Latent Diffusion Models, April 2022. arXiv:2112.10752 [cs]. (page 2)

  5. [5]

    Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models

    Andreas Blattmann, Robin Rombach, Huan Ling, Tim Dockhorn, Seung Wook Kim, Sanja Fidler, and Karsten Kreis. Align Your Latents: High-Resolution Video Synthesis With Latent Diffusion Models. pages 22563–22575, 2023. (page 2)

  6. [6]

    Watson, David Juergens, Nathaniel R

    Joseph L. Watson, David Juergens, Nathaniel R. Bennett, Brian L. Trippe, Jason Yim, Helen E. Eisenach, Woody Ahern, Andrew J. Borst, Robert J. Ragotte, Lukas F. Milles, Basile I. M. Wicky, Nikita Hanikel, Samuel J. Pellock, Alexis Courbet, William Sheffler, Jue Wang, Preetham Venkatesh, Isaac Sappington, Susana Vázquez Torres, Anna Lauko, Valentin De...

  7. [7]

    Kevin Clark, Paul Vicol, Kevin Swersky, and David J. Fleet. Directly Fine-Tuning Diffusion Models on Differentiable Rewards, June 2024. arXiv:2309.17400 [cs]. (pages 2 and 10)

  8. [8]

    A Survey on Diffusion Models for Inverse Problems

    Giannis Daras, Hyungjin Chung, Chieh-Hsin Lai, Yuki Mitsufuji, Jong Chul Ye, Peyman Milanfar, Alexandros G. Dimakis, and Mauricio Delbracio. A Survey on Diffusion Models for Inverse Problems, September 2024. arXiv:2410.00083 [cs]. (page 2)

  9. [9]

    Practical and Asymptotically Exact Conditional Sampling in Diffusion Models

    Luhuan Wu, Brian L. Trippe, Christian A. Naesseth, David M. Blei, and John P. Cunningham. Practical and Asymptotically Exact Conditional Sampling in Diffusion Models, June 2023. arXiv:2306.17775 [cs, q-bio, stat]. (pages 2, 4, and 10)

  10. [10]

    A General Framework for Inference-time Scaling and Steering of Diffusion Models

    Raghav Singhal, Zachary Horvitz, Ryan Teehan, Mengye Ren, Zhou Yu, Kathleen McKeown, and Rajesh Ranganath. A General Framework for Inference-time Scaling and Steering of Diffusion Models, July 2025. arXiv:2501.06848 [cs]. (pages 2 and 54)

  12. [12]

    Carles Domingo-Enrich, Michal Drozdzal, Brian Karrer, and Ricky T. Q. Chen. Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control, January

  13. [13]
  14. [14]

    Test-time scaling of diffusions with flow maps

    Amirmojtaba Sabour, Michael S. Albergo, Carles Domingo-Enrich, Nicholas M. Boffi, Sanja Fidler, Karsten Kreis, and Eric Vanden-Eijnden. Test-time scaling of diffusions with flow maps, November 2025. arXiv:2511.22688 [cs]. (pages 2 and 10)

  15. [15]

    Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review

    Masatoshi Uehara, Yulai Zhao, Chenyu Wang, Xiner Li, Aviv Regev, Sergey Levine, and Tommaso Biancalani. Inference-Time Alignment in Diffusion Models with Reward-Guided Generation: Tutorial and Review, January 2025. arXiv:2501.09685 [cs]. (pages 2, 4, and 10)

  16. [16]

    Steering diffusion models with quadratic rewards: a fine-grained analysis

    Ankur Moitra, Andrej Risteski, and Dhruv Rohatgi. Steering diffusion models with quadratic rewards: a fine-grained analysis, February 2026. arXiv:2602.16570 [cs]. (pages 2 and 6)

  17. [17]

    Sequential Monte Carlo samplers

    Pierre Del Moral, Arnaud Doucet, and Ajay Jasra. Sequential Monte Carlo samplers. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 68(3):411–436, 2006. (page 2)

  18. [18]

    Diffusion Posterior Sampling for General Noisy Inverse Problems

    Hyungjin Chung, Jeongsol Kim, Michael T. Mccann, Marc L. Klasky, and Jong Chul Ye. Diffusion Posterior Sampling for General Noisy Inverse Problems, May 2024. arXiv:2209.14687 [stat]. (pages 2, 6, 9, 10, 12, and 35)

  19. [19]

    Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control

    Masatoshi Uehara, Yulai Zhao, Kevin Black, Ehsan Hajiramezanali, Gabriele Scalia, Nathaniel Lee Diamant, Alex M. Tseng, Tommaso Biancalani, and Sergey Levine. Fine-Tuning of Continuous-Time Diffusion Models as Entropy-Regularized Control, February 2024. arXiv:2402.15194 [cs, stat]. (pages 2, 3, 4, and 10)

  20. [20]

    Scaling Rectified Flow Transformers for High-Resolution Image Synthesis

    Patrick Esser, Sumith Kulal, Andreas Blattmann, Rahim Entezari, Jonas Müller, Harry Saini, Yam Levi, Dominik Lorenz, Axel Sauer, Frederic Boesel, Dustin Podell, Tim Dockhorn, Zion English, Kyle Lacey, Alex Goodwin, Yannik Marek, and Robin Rombach. Scaling Rectified Flow Transformers for High-Resolution Image Synthesis, March 2024. arXiv:2403.03206 [cs]....

  21. [21]

    FLUX.1 Kontext: Flow Matching for In-Context Image Generation and Editing in Latent Space

    Black Forest Labs, Stephen Batifol, Andreas Blattmann, Frederic Boesel, Saksham Consul, Cyril Diagne, Tim Dockhorn, Jack English, Zion English, Patrick Esser, Sumith Kulal, Kyle Lacey, Yam Levi, Cheng Li, Dominik Lorenz, Jonas Müller, Dustin Podell, Robin Rombach, Harry Saini, Axel Sauer, and Luke Smith. FLUX.1 Kontext: Flow Matching for In-Context Imag...

  22. [22]

    How to build a consistency model: Learning flow maps via self-distillation

    Nicholas M. Boffi, Michael S. Albergo, and Eric Vanden-Eijnden. How to build a consistency model: Learning flow maps via self-distillation, May 2025. (pages 2, 3, 10, 11, 21, and 39)

  23. [23]

    Flow map matching with stochastic interpolants: A mathematical framework for consistency models

    Nicholas M. Boffi, Michael S. Albergo, and Eric Vanden-Eijnden. Flow map matching with stochastic interpolants: A mathematical framework for consistency models, June 2025. arXiv:2406.07507 [cs]. (pages 2, 3, 10, and 21)

  24. [24]

    Consistency Models

    Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. Consistency Models, May 2023. arXiv:2303.01469 [cs, stat]. (pages 2, 4, and 10)

  25. [25]

    Mean Flows for One-step Generative Modeling

    Zhengyang Geng, Mingyang Deng, Xingjian Bai, J. Zico Kolter, and Kaiming He. Mean Flows for One-step Generative Modeling, May 2025. arXiv:2505.13447 [cs]. (pages 2, 4, 10, and 21)

  26. [26]

    Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

    Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao, Naoki Murata, Yuhta Takida, Toshimitsu Uesaka, Yutong He, Yuki Mitsufuji, and Stefano Ermon. Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion, March 2024. arXiv:2310.02279 [cs, stat]. (pages 2, 4, and 10)

  27. [27]

    Peter Holderrieth, Uriel Singer, Tommi Jaakkola, Ricky T. Q. Chen, Yaron Lipman, and Brian Karrer. GLASS Flows: Transition Sampling for Alignment of Flow and Diffusion Models, September 2025. arXiv:2509.25170 [cs]. (page 3)

  28. [28]

    Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review

    Masatoshi Uehara, Yulai Zhao, Tommaso Biancalani, and Sergey Levine. Understanding Reinforcement Learning-Based Fine-Tuning of Diffusion Models: A Tutorial and Review, July 2024. arXiv:2407.13734 [cs]. (pages 3 and 10)

  29. [30]

    Consistency Models Made Easy

    Zhengyang Geng, Ashwini Pokle, William Luo, Justin Lin, and J. Zico Kolter. Consistency Models Made Easy, October 2024. arXiv:2406.14548 [cs]. (pages 4 and 10)

  30. [31]

    Bidirectional Consistency Models

    Liangchen Li and Jiajun He. Bidirectional Consistency Models, September 2024. arXiv:2403.18035 [cs]. (page 4)

  31. [32]

    Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models

    Cheng Lu and Yang Song. Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models, October 2024. arXiv:2410.11081 [cs] version: 1. (page 4)

  32. [33]

    Improved Mean Flows: On the Challenges of Fastforward Generative Models

    Zhengyang Geng, Yiyang Lu, Zongze Wu, Eli Shechtman, J. Zico Kolter, and Kaiming He. Improved Mean Flows: On the Challenges of Fastforward Generative Models, December 2025. arXiv:2512.02012 [cs]. (pages 4 and 10)

  33. [34]

    One Step Diffusion via Shortcut Models

    Kevin Frans, Danijar Hafner, Sergey Levine, and Pieter Abbeel. One Step Diffusion via Shortcut Models, October 2024. arXiv:2410.12557 [cs]. (pages 4 and 10)

  34. [35]

    Terminal Velocity Matching

    Linqi Zhou, Mathias Parger, Ayaan Haque, and Jiaming Song. Terminal Velocity Matching, November 2025. arXiv:2511.19797 [cs]. (pages 4 and 10)

  36. [37]

    A Taxonomy of Loss Functions for Stochastic Optimal Control

    Carles Domingo-Enrich. A Taxonomy of Loss Functions for Stochastic Optimal Control, October 2024. arXiv:2410.00345 [cs]. (page 4)

  37. [38]

    Variational and optimal control representations of conditioned and driven processes

    Raphaël Chetrite and Hugo Touchette. Variational and optimal control representations of conditioned and driven processes. Journal of Statistical Mechanics: Theory and Experiment, 2015(12):P12001, December 2015. (page 4)

  38. [39]

    Deterministic and Stochastic Optimal Control

    Wendell H. Fleming and Raymond W. Rishel. Deterministic and Stochastic Optimal Control, volume 1 of Applications of Mathematics. Springer-Verlag, New York, NY, 1975. (pages 4, 24, and 25)

  39. [40]

    Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations

    Martino Bardi and Italo Capuzzo-Dolcetta. Optimal Control and Viscosity Solutions of Hamilton-Jacobi-Bellman Equations. Birkhäuser, Boston, MA, 1997. (pages 5 and 31)

  40. [41]

    FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems

    Jeongsol Kim, Bryan Sangwoo Kim, and Jong Chul Ye. FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems, March 2025. arXiv:2503.08136 [cs]. (pages 6, 9, 10, 12, and 36)

  41. [42]

    FlowChef: Steering Rectified Flow Models for Controlled Generation

    Maitreya Patel, Song Wen, Dimitris N. Metaxas, and Yezhou Yang. FlowChef: Steering Rectified Flow Models for Controlled Generation. 2025. (pages 6, 9, 10, 12, and 37)

  42. [43]

    Manifold Preserving Guided Diffusion

    Yutong He, Naoki Murata, Chieh-Hsin Lai, Yuhta Takida, Toshimitsu Uesaka, Dongjun Kim, Wei-Hsiang Liao, Yuki Mitsufuji, J. Zico Kolter, Ruslan Salakhutdinov, and Stefano Ermon. Manifold Preserving Guided Diffusion, November 2023. arXiv:2311.16424 [cs]. (pages 6, 9, 10, 13, 38, and 52)

  43. [44]

    ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization

    Luca Eyring, Shyamgopal Karthik, Karsten Roth, Alexey Dosovitskiy, and Zeynep Akata. ReNO: Enhancing One-step Text-to-Image Models through Reward-based Noise Optimization, October 2024. arXiv:2406.04312 [cs]. (pages 6, 9, 10, 13, 14, 39, and 54)

  44. [45]

    D-Flow: Differentiating through Flows for Controlled Generation

    Heli Ben-Hamu, Omri Puny, Itai Gat, Brian Karrer, Uriel Singer, and Yaron Lipman. D-Flow: Differentiating through Flows for Controlled Generation, July 2024. arXiv:2402.14017 [cs]. (pages 6, 9, 10, and 39)

  45. [46]

    On the construction and comparison of difference schemes

    Gilbert Strang. On the construction and comparison of difference schemes. SIAM Journal on Numerical Analysis, 5(3):506–517, 1968. (page 8)

  46. [47]

    RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control

    Litu Rout, Yujia Chen, Nataniel Ruiz, Abhishek Kumar, Constantine Caramanis, Sanjay Shakkottai, and Wen-Sheng Chu. RB-Modulation: Training-Free Personalization of Diffusion Models using Stochastic Optimal Control, May 2024. arXiv:2405.17401 [cs]. (pages 9 and 10)

  47. [48]

    Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow

    Xingchao Liu, Chengyue Gong, and Qiang Liu. Flow Straight and Fast: Learning to Generate and Transfer Data with Rectified Flow, September 2022. arXiv:2209.03003 [cs]. (page 10)

  48. [49]

    Multistep Consistency Models

    Jonathan Heek, Emiel Hoogeboom, and Tim Salimans. Multistep Consistency Models, November 2024. arXiv:2403.06807 [cs]. (page 10)

  49. [50]

    Align Your Flow: Scaling Continuous-Time Flow Map Distillation

    Amirmojtaba Sabour, Sanja Fidler, and Karsten Kreis. Align Your Flow: Scaling Continuous-Time Flow Map Distillation, June 2025. arXiv:2506.14603 [cs]. (pages 10 and 21)

  50. [51]

    DriftLite: Lightweight Drift Control for Inference-Time Scaling of Diffusion Models

    Yinuo Ren, Wenhao Gao, Lexing Ying, Grant M. Rotskoff, and Jiequn Han. DriftLite: Lightweight Drift Control for Inference-Time Scaling of Diffusion Models, September 2025. arXiv:2509.21655 [cs]. (page 10)

  51. [52]

    On the Guidance of Flow Matching

    Ruiqi Feng, Chenglei Yu, Wenhao Deng, Peiyan Hu, and Tailin Wu. On the Guidance of Flow Matching. In Proceedings of the 42nd International Conference on Machine Learning (ICML), volume 267 of Proceedings of Machine Learning Research, pages 16993–17029. PMLR, 2025. URL https://proceedings.mlr.press/v267/feng25s.html. (page 10)

  52. [53]

    Training Diffusion Models with Reinforcement Learning

    Kevin Black, Michael Janner, Yilun Du, Ilya Kostrikov, and Sergey Levine. Training Diffusion Models with Reinforcement Learning, January 2024. arXiv:2305.13301 [cs]. (page 10)

  53. [54]

    DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models

    Ying Fan, Olivia Watkins, Yuqing Du, Hao Liu, Moonkyung Ryu, Craig Boutilier, Pieter Abbeel, Mohammad Ghavamzadeh, Kangwook Lee, and Kimin Lee. DPOK: Reinforcement Learning for Fine-tuning Text-to-Image Diffusion Models, November 2023. arXiv:2305.16381 [cs]. (page 10)

  54. [55]

    Diamond Maps: Efficient Reward Alignment via Stochastic Flow Maps

    Peter Holderrieth, Douglas Chen, Luca Eyring, Ishin Shah, Giri Anantharaman, Yutong He, Zeynep Akata, Tommi Jaakkola, Nicholas Matthew Boffi, and Max Simchowitz. Diamond maps: Efficient reward alignment via stochastic flow maps, February 2026. arXiv:2602.05993 [cs]. (page 10)

  55. [56]

    Meta Flow Maps enable scalable reward alignment

    Peter Potaptchik, Adhi Saravanan, Abbas Mammadov, Alvaro Prat, Michael S. Albergo, and Yee Whye Teh. Meta flow maps enable scalable reward alignment, January 2026. arXiv:2601.14430 [cs]. (page 10)

  56. [57]

    Variational flow maps: Make some noise for one-step conditional generation

    Abbas Mammadov, So Takao, Bohan Chen, Ricardo Baptista, Morteza Mardani, Yee Whye Teh, and Julius Berner. Variational flow maps: Make some noise for one-step conditional generation, March 2026. arXiv:2603.07276 [cs]. (page 11)

  57. [58]

    FLUX.1 [dev]: A 12 billion parameter rectified flow transformer

    Black Forest Labs. FLUX.1 [dev]: A 12 billion parameter rectified flow transformer. https://huggingface.co/black-forest-labs/FLUX.1-dev, 2024. Model weights available on Hugging Face. (pages 11, 14, and 39)

  58. [59]

    ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation

    Jiazheng Xu, Xiao Liu, Yuchen Wu, Yuxuan Tong, Qinkai Li, Ming Ding, Jie Tang, and Yuxiao Dong. ImageReward: Learning and Evaluating Human Preferences for Text-to-Image Generation, December. arXiv:2304.05977 [cs]. (page 13)

  60. [61]

    Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis

    Xiaoshi Wu, Yiming Hao, Keqiang Sun, Yixiong Chen, Feng Zhu, Rui Zhao, and Hongsheng Li. Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis, 2023. arXiv:2306.09341. (page 13)

  61. [62]

    Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation

    Yuval Kirstain, Adam Polyak, Uriel Singer, Shahbuland Matiana, Joe Penna, and Omer Levy. Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation. In Advances in Neural Information Processing Systems, 2023. (page 13)

  62. [63]

    GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment

    Dhruba Ghosh, Hannaneh Hajishirzi, and Ludwig Schmidt. GenEval: An Object-Focused Framework for Evaluating Text-to-Image Alignment, 2023. arXiv:2310.11513. (pages 13 and 54)

  63. [64]

    Skywork-VL reward: An effective reward model for multimodal understanding and reasoning

    Xiaokun Wang, Peiyu Wang, Jiangbo Pei, Wei Shen, Yi Peng, Yunzhuo Hao, Weijie Qiu, Ai Jian, Tianyidan Xie, Xuchen Song, Yang Liu, and Yahui Zhou. Skywork-VL reward: An effective reward model for multimodal understanding and reasoning. arXiv preprint arXiv:2505.07263, 2025. (pages 14 and 58)

  64. [65]

    J. L. Doob. Conditional Brownian motion and the boundary limits of harmonic functions. Bulletin de la Société Mathématique de France, 85:431–458, 1957. (page 22)

  65. [66]

    Stargan v2: Diverse image synthesis for multiple domains

    Yunjey Choi, Youngjung Uh, Jaejun Yoo, and Jung-Woo Ha. Stargan v2: Diverse image synthesis for multiple domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8188–8197, 2020. (page 40)

  66. [67]

    A style-based generator architecture for generative adversarial networks

    Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4401–4410, 2019. (page 40)

    A Background on flow maps

    In this section, we provide some brief further background on flow maps. For complete details, ...

    • The semigroup property: for all $(s, t, u) \in [0,1]^3$ and for all $x \in \mathbb{R}^d$, $X_{s,u}(x) = X_{t,u}(X_{s,t}(x))$. (21)

    • The Lagrangian equation: for all $(s, t) \in [0,1]^2$ and for all $x \in \mathbb{R}^d$, $\partial_t X_{s,t}(x) = b_t(X_{s,t}(x))$. (22)

    • The Eulerian equation: for all $(s, t) \in [0,1]^2$ and for all $x \in \mathbb{R}^d$, $\partial_s X_{s,t}(x) + \nabla X_{s,t}(x)\, b_s(x) = 0$. (23)

    Following recent work on accelerated sampling [20, 23, 47], we parameterize the flow map as $X_{s,t}(x) = x + (t - s)\, v_{s,t}(x)$, (24) where $v : [0,1]^2 \times \mathbb{R}^d \to \mathbb{R}^d$ is a learned velocity function. On the diagonal $s = t$, the Lagrangian equation implies $v_{t,t}(x) = b_t(x)$. (25) ...
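The semigroup property, the Lagrangian equation, and the Eulerian equation named above can be checked numerically on a flow whose map is known in closed form. The sketch below assumes the standard forms of these identities, $X_{s,u} = X_{t,u} \circ X_{s,t}$, $\partial_t X_{s,t}(x) = b_t(X_{s,t}(x))$, and $\partial_s X_{s,t}(x) + \nabla X_{s,t}(x)\, b_s(x) = 0$, together with the parameterization $X_{s,t}(x) = x + (t - s)\, v_{s,t}(x)$. The scalar linear velocity field `b`, the coefficient `A`, and the finite-difference step `eps` are hypothetical choices for illustration only.

```python
import numpy as np

A = 0.7  # hypothetical coefficient; toy velocity field b_t(x) = A * x


def b(t, x):
    """Toy time-independent velocity field."""
    return A * x


def X(s, t, x):
    """Closed-form flow map of dx/dt = b_t(x) from time s to time t."""
    return np.exp(A * (t - s)) * x


def v(s, t, x):
    """Average velocity in the parameterization X_{s,t}(x) = x + (t - s) * v_{s,t}(x)."""
    if t == s:
        return b(s, x)  # diagonal limit
    return (X(s, t, x) - x) / (t - s)


s, t, u, x, eps = 0.2, 0.5, 0.9, 1.3, 1e-6

# Semigroup: composing two maps equals the direct map.
semigroup_gap = X(s, u, x) - X(t, u, X(s, t, x))

# Lagrangian: derivative in the end time follows the velocity at the endpoint.
dX_dt = (X(s, t + eps, x) - X(s, t - eps, x)) / (2 * eps)
lagrangian_gap = dX_dt - b(t, X(s, t, x))

# Eulerian: derivative in the start time plus spatial derivative times velocity vanishes.
dX_ds = (X(s + eps, t, x) - X(s - eps, t, x)) / (2 * eps)
dX_dx = (X(s, t, x + eps) - X(s, t, x - eps)) / (2 * eps)
eulerian_gap = dX_ds + dX_dx * b(s, x)

print("semigroup gap: ", semigroup_gap)
print("Lagrangian gap:", lagrangian_gap)
print("Eulerian gap:  ", eulerian_gap)
```

All three gaps should be zero up to floating-point and finite-difference error for this toy flow; for a learned flow map, the same residuals could serve as diagnostics of how well the network satisfies each identity.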