Curvature-Adaptive Consistency Flow Matching: Autonomous Trajectory Optimization via Reinforcement Learning

Bohan Li; Guhan Chen; Jingyi Ma; Songtao Tian; Zixiong Yu

arxiv: 2606.22394 · v1 · pith:LVGR3O7Tnew · submitted 2026-06-21 · 💻 cs.CV

Curvature-Adaptive Consistency Flow Matching: Autonomous Trajectory Optimization via Reinforcement Learning

Songtao Tian , Guhan Chen , Bohan Li , Jingyi Ma , Zixiong Yu This is my paper

Pith reviewed 2026-06-26 10:37 UTC · model grok-4.3

classification 💻 cs.CV

keywords consistency distillationreinforcement learningdiffusion modelsflow matchingfew-step generationtrajectory optimizationadaptive curriculum

0 comments

The pith

CACFM uses a reinforcement learning agent to automatically prioritize boundary stages during consistency distillation for higher-fidelity few-step generation on large models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper identifies that consistency distillation has a U-shaped difficulty profile unlike standard iterative sampling, placing the main optimization bottlenecks at the initialization and final refinement stages rather than the middle. It treats the distillation process as a dynamic decision problem solved by a lightweight RL agent that probes Probability Flow ODE trajectories to build an efficiency-oriented curriculum on its own. This is paired with a new Flow Distribution Matching Distillation objective. The result is state-of-the-art performance on models such as FLUX and SDXL, with reduced structural deformities and better retention of high-frequency details in extreme few-step regimes.

Core claim

Consistency distillation exhibits a distinctly different difficulty profile from standard iterative generation, with primary optimization bottlenecks at the boundary stages. CACFM formulates distillation as a dynamic decision process and employs a lightweight Reinforcement Learning agent to actively probe Probability Flow ODE trajectories, automatically constructing an efficiency-oriented curriculum that prioritizes critical regions without manual scheduling. Integrated with a novel Flow Distribution Matching Distillation objective, the method achieves new state-of-the-art results on large-scale models such as FLUX and SDXL while mitigating structural deformities and preserving high-frequenc

What carries the argument

Curvature-Adaptive Consistency Flow Matching (CACFM), which uses a lightweight reinforcement learning agent to adaptively select sampling points along Probability Flow ODE trajectories according to evolving optimization difficulty.

If this is right

Static sampling priors become suboptimal once the learning requirements change during distillation training.
Manual curriculum design is no longer required to focus effort on the hardest regions.
New state-of-the-art visual quality is reached on FLUX and SDXL in few-step regimes.
Structural deformities decrease and high-frequency details are retained better than prior consistency methods.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same RL-driven trajectory selection could be tested on other distillation techniques that also show non-uniform difficulty profiles.
Boundary-focused adaptation may reduce the total number of training steps needed to reach a target quality level.
Applying the agent to even larger or differently architected generative models would test whether the boundary-bottleneck pattern holds more generally.

Load-bearing premise

The primary optimization bottlenecks in consistency distillation reside at the boundary stages rather than the intermediate steps, and a lightweight RL agent can automatically identify and prioritize those regions without manual scheduling.

What would settle it

Direct measurement showing that per-stage optimization difficulty is uniform or higher in the middle of trajectories, or that a fixed static sampling schedule matches the RL agent's performance on the same models and tasks.

Figures

Figures reproduced from arXiv: 2606.22394 by Bohan Li, Guhan Chen, Jingyi Ma, Songtao Tian, Zixiong Yu.

**Figure 1.** Figure 1: 1024×1024 samples produced by our 4-step generator distilled from FLUX.1-dev. Models (PCMs [59]). However, existing methods exhibit a distinct static nature in their sampling strategies: they predominantly rely on pre-defined discretization schemes or heuristic priors, such as basic uniform sampling or directly adopting the Logit-Normal distribution [8] widely utilized in Rectified Flow training. However, … view at source ↗

**Figure 2.** Figure 2: Comparing sampling distributions along the Flow Matching trajectory. (a) The Logit-Normal prior assumes that intermediate steps are the most important. (b) The U-shaped distribution reveals that boundary stages (initialization and final refinement) constitute the primary bottlenecks, and our agent learns this pattern (see [PITH_FULL_IMAGE:figures/full_fig_p003_2.png] view at source ↗

**Figure 3.** Figure 3: Training paradigm of CACFM. the process into distinct semantic phases of generation: Initialization, Structural Formation, Texture Filling, and Final Refinement. – State Space S (Relative Difficulty Profile): To make informed decisions, the agent must perceive the relative difficulty across stages. We define the state st as the ordinal ranking of the consistency losses {L(1) , . . . ,L (M)} associated with… view at source ↗

**Figure 4.** Figure 4: Qualitative comparisons with LoRAbased approaches on FLUX architecture [PITH_FULL_IMAGE:figures/full_fig_p010_4.png] view at source ↗

**Figure 5.** Figure 5: Qualitative comparisons with LoRA-based approaches on SDXL architecture. 2) Human-Centric Evaluation: Tables 3 and 4 report metrics assessing perceptual quality and text alignment. – Consistent Performance: CACFM achieves top-tier performance (consistently ranking 1st or 2nd) across the vast majority of configurations. – Robustness: Our approach maintains stable improvements across increasing steps. Crucia… view at source ↗

**Figure 6.** Figure 6: Analysis of the Learned Policy. Left: The agent’s selection frequency (Red) perfectly aligns with the U-shaped optimization difficulty profile (Blue Bars), validating its geometry-awareness. Right: The selection probability heatmap shows an emergent curriculum, shifting focus from initialization to refinement (Phase 0 to 3) over time [PITH_FULL_IMAGE:figures/full_fig_p013_6.png] view at source ↗

**Figure 7.** Figure 7: Convergence of max Q-value differences across training iterations. The x-axis represents training epochs after window smoothing, while the y-axis shows the Q-value differences. The monotonic decrease of the ℓ2-norm demonstrates asymptotic stability. Value Function Convergence. The maximum Q-value per state demonstrates asymptotic stability, as evidenced by the convergence of the squared ℓ2-norm of temporal… view at source ↗

read the original abstract

Consistency distillation has significantly accelerated the inference of diffusion models. In this work, we reveal an intriguing asymmetry: while Logit-Normal sampling priors are highly efficacious for standard iterative generation, consistency distillation exhibits a distinctly different difficulty profile (e.g., U-shaped). We identify that the primary optimization bottlenecks reside at the boundary stages (initialization or final refinement) rather than the intermediate steps. To address the limitations of static sampling in accommodating evolving learning requirements, we propose Curvature-Adaptive Consistency Flow Matching (CACFM). By formulating distillation as a dynamic decision process, CACFM employs a lightweight Reinforcement Learning agent to actively probe Probability Flow ODE trajectories, automatically constructing an efficiency-oriented curriculum that prioritizes critical regions without manual scheduling. Integrated with a novel Flow Distribution Matching Distillation (DMD) objective, our approach achieves new state-of-the-art results on large-scale models such as FLUX and SDXL. It effectively mitigates structural deformities and preserves high-frequency details in extreme few-step regimes, achieving unprecedented visual fidelity.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper frames consistency distillation as an RL problem over PF-ODE curvature to build an adaptive curriculum, but the abstract supplies no evidence for the claimed U-shaped difficulty profile or the SOTA results on FLUX and SDXL.

read the letter

The punchline is that this work treats the choice of sampling points during consistency distillation as a reinforcement learning decision process that prioritizes boundary stages of the trajectory. It adds a Flow Distribution Matching Distillation objective and reports large gains on big models.

What is new is the explicit use of a lightweight RL agent to probe trajectories and construct the curriculum on the fly instead of relying on fixed schedules like Logit-Normal. The DMD objective is also presented as a fresh component.

The paper does a reasonable job noting that standard sampling priors may not match the needs of distillation and that difficulty may concentrate at the start and end rather than the middle.

The soft spots are more serious. The entire argument rests on an unshown U-shaped difficulty curve and the claim that the RL policy automatically discovers the right regions. No equations, trajectory plots, ablations, or quantitative tables appear in the abstract to support either point. Without those, the reported improvements on FLUX and SDXL cannot be attributed to the proposed mechanism rather than other factors.

This is for people already working on few-step diffusion and flow matching. A reader in that niche might pick up the RL framing as an idea to explore, but the missing evidence means the paper does not yet deserve a serious referee.

Referee Report

3 major / 1 minor

Summary. The paper claims that consistency distillation of diffusion models exhibits a U-shaped difficulty profile (distinct from Logit-Normal sampling), with primary optimization bottlenecks at boundary stages rather than intermediate steps. It proposes Curvature-Adaptive Consistency Flow Matching (CACFM), which formulates distillation as a dynamic decision process solved by a lightweight RL agent that probes PF-ODE trajectories to autonomously build an efficiency-oriented curriculum. This is integrated with a novel Flow Distribution Matching Distillation (DMD) objective, yielding claimed state-of-the-art results on large-scale models such as FLUX and SDXL by mitigating structural deformities and preserving high-frequency details in extreme few-step regimes.

Significance. If the U-shaped profile and the RL agent's ability to discover boundary prioritization without manual intervention are empirically verified, the method could offer an automated alternative to hand-crafted sampling schedules in consistency distillation, with potential benefits for scaling few-step generation in large models. The work highlights an asymmetry in learning dynamics that, if substantiated, would be a useful observation for the field. However, the absence of any supporting analysis or results in the provided manuscript limits the ability to assess whether these gains are attributable to the proposed mechanism rather than other factors.

major comments (3)

[Abstract] Abstract: The central assertion that 'consistency distillation exhibits a distinctly different difficulty profile (e.g., U-shaped)' with 'primary optimization bottlenecks reside at the boundary stages' is presented without any referenced analysis, equations, trajectory visualizations, or quantitative evidence demonstrating this profile for the models or sampling priors discussed. This assumption directly motivates the CACFM formulation and DMD objective.
[Abstract] Abstract: The claim that a 'lightweight Reinforcement Learning agent' can 'actively probe Probability Flow ODE trajectories, automatically constructing an efficiency-oriented curriculum that prioritizes critical regions without manual scheduling' lacks any specification of the underlying MDP (state/action spaces, reward function, or policy), making it impossible to determine whether the agent actually concentrates on the claimed boundary regions or if the performance gains reduce to the RL component fitting to the data.
[Abstract] Abstract: The statements that the approach 'achieves new state-of-the-art results on large-scale models such as FLUX and SDXL' and 'achieving unprecedented visual fidelity' are unsupported by any metrics, baselines, ablation studies, step counts, or experimental details. Without these, the attribution of improvements to the curvature-adaptive RL curriculum cannot be evaluated.

minor comments (1)

[Abstract] Abstract: The acronym 'DMD' (Flow Distribution Matching Distillation) is introduced without expansion or relation to prior distribution-matching distillation methods.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the detailed and constructive comments. We address each major point below and indicate where revisions to the manuscript will be made to improve clarity and support for the claims.

read point-by-point responses

Referee: [Abstract] Abstract: The central assertion that 'consistency distillation exhibits a distinctly different difficulty profile (e.g., U-shaped)' with 'primary optimization bottlenecks reside at the boundary stages' is presented without any referenced analysis, equations, trajectory visualizations, or quantitative evidence demonstrating this profile for the models or sampling priors discussed. This assumption directly motivates the CACFM formulation and DMD objective.

Authors: We agree that the abstract introduces the U-shaped difficulty profile without explicit references. The full manuscript contains supporting analysis in Section 3, including PF-ODE trajectory visualizations and quantitative comparisons to Logit-Normal sampling that demonstrate the boundary-stage bottlenecks. We will revise the abstract to cite the relevant section and figure. revision: yes
Referee: [Abstract] Abstract: The claim that a 'lightweight Reinforcement Learning agent' can 'actively probe Probability Flow ODE trajectories, automatically constructing an efficiency-oriented curriculum that prioritizes critical regions without manual scheduling' lacks any specification of the underlying MDP (state/action spaces, reward function, or policy), making it impossible to determine whether the agent actually concentrates on the claimed boundary regions or if the performance gains reduce to the RL component fitting to the data.

Authors: The methods section defines the MDP with state space derived from local curvature estimates along trajectories, discrete actions for timestep selection, and a reward based on reduction in the DMD objective. We acknowledge the abstract is underspecified and will revise it to include a concise description of these MDP elements or a direct reference to the subsection. revision: yes
Referee: [Abstract] Abstract: The statements that the approach 'achieves new state-of-the-art results on large-scale models such as FLUX and SDXL' and 'achieving unprecedented visual fidelity' are unsupported by any metrics, baselines, ablation studies, step counts, or experimental details. Without these, the attribution of improvements to the curvature-adaptive RL curriculum cannot be evaluated.

Authors: The experiments section reports FID and CLIP scores on FLUX and SDXL for 1-4 step generation against baselines including standard consistency models and DMD, along with ablations isolating the RL curriculum. We will revise the abstract to incorporate key quantitative results and step counts supporting the SOTA claim. revision: yes

Circularity Check

0 steps flagged

No circularity: derivation relies on external empirical observation and RL optimization, not self-definition or fitted inputs

full rationale

The provided abstract and description present an empirical observation (U-shaped difficulty profile in consistency distillation) as motivation, followed by a proposed RL-based adaptive sampling method and a new DMD objective. No equations, self-citations, or parameter-fitting steps are shown that would reduce the claimed performance gains or curriculum construction to tautological inputs by construction. The central claims rest on external benchmarks (FLUX, SDXL) and an autonomous RL agent, which are falsifiable outside the paper's fitted values. This is the common case of a self-contained proposal without load-bearing circular steps.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Only the abstract is available, so the ledger is necessarily incomplete and based solely on stated claims; no free parameters, axioms, or invented entities can be verified or enumerated from the provided text.

axioms (1)

domain assumption Logit-Normal sampling priors are highly efficacious for standard iterative generation while consistency distillation exhibits a U-shaped difficulty profile with bottlenecks at boundary stages.
Stated directly in the abstract as an observed fact motivating the method.

invented entities (1)

Curvature-Adaptive Consistency Flow Matching (CACFM) with Flow Distribution Matching Distillation (DMD) objective no independent evidence
purpose: To formulate distillation as a dynamic RL decision process that automatically builds an efficiency-oriented curriculum.
Introduced in the abstract as the core proposed technique.

pith-pipeline@v0.9.1-grok · 5714 in / 1370 out tokens · 30862 ms · 2026-06-26T10:37:12.922729+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

72 extracted references · 11 canonical work pages

[1]

Albergo, M.S., Vanden-Eijnden, E.: Building normalizing flows with stochastic interpolants.In:The Eleventh InternationalConference onLearning Representations (2023),https://openreview.net/forum?id=li7qeBbCR1t

2023
[2]

Random Forests,

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine learning47(2), 235–256 (2002),https://doi.org/10.1023/A: 1013689704352

work page doi:10.1023/a: 2002
[3]

In: Advances in Neural Information Processing Systems

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Nee- lakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems. vol. 33, pp. 1877–1901 (2020),https://arxiv.org/abs/2005.14165

Pith/arXiv arXiv 1901
[4]

In: Conference on Computer Vision and Pattern Recognition

Changpinyo, S., Sharma, P., Ding, N., Soricut, R.: Conceptual 12m: Pushing web- scale image-text pre-training to recognize long-tail visual concepts. In: Conference on Computer Vision and Pattern Recognition. pp. 3558–3568 (2021),https:// ieeexplore.ieee.org/document/9578388

arXiv 2021
[5]

In: The Thirty-eighth Annual Conference on Neural Informa- tion Processing Systems (2024),https://openreview.net/forum?id=ni3Ud2BV3G

Chen, G., Li, Y., Lin, Q.: On the impacts of the random initialization in the neural tangent kernel theory. In: The Thirty-eighth Annual Conference on Neural Informa- tion Processing Systems (2024),https://openreview.net/forum?id=ni3Ud2BV3G

2024
[6]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)

Chen, J., Xue, S., Zhao, Y., Yu, J., Paul, S., Chen, J., Cai, H., Han, S., Xie, E.: Sana-sprint: One-step diffusion with continuous-time consistency distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 16185–16195 (October 2025),https://doi.org/10.1109/ICCV51701. 2025.01502

work page doi:10.1109/iccv51701 2025
[7]

In: International Conference on Learning Representations (2021),https://openreview

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale. In: International Conference on Learning Representations (2021),https://openreview. net/forum?id=YicbFdNTTy

2021
[8]

In: Forty- first International Conference on Machine Learning (2024),https://openreview

Esser, P., Kulal, S., Blattmann, A., Entezari, R., Müller, J., Saini, H., Levi, Y., Lorenz, D., Sauer, A., Boesel, F., Podell, D., Dockhorn, T., English, Z., Rombach, R.: Scaling rectified flow transformers for high-resolution image synthesis. In: Forty- first International Conference on Machine Learning (2024),https://openreview. net/forum?id=FPnUhsQJ5B

2024
[9]

Fan, Y., Lee, K.: Optimizing ddpm sampling with shortcut fine-tuning (2024), https://arxiv.org/abs/2301.13362

arXiv 2024
[10]

Fan, Y., Watkins, O., Du, Y., Liu, H., Ryu, M., Boutilier, C., Abbeel, P., Ghavamzadeh, M., Lee, K., Lee, K.: Dpok: Reinforcement learning for fine-tuning text-to-image diffusion models (2023),https://arxiv.org/abs/2305.16381

arXiv 2023
[11]

Generative adversarial networks.Commun

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. Communications of the ACM63(11), 139–144 (2020),https://doi.org/10.1145/3422622

work page doi:10.1145/3422622 2020
[12]

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition (2015),https://arxiv.org/abs/1512.03385

Pith/arXiv arXiv 2015
[13]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings

He, Y., Zhang, C., Chen, F., Cao, J.: Cinematte: Background matting for virtual production and beyond. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings. pp. 8725–8735 (June 2026), https://arxiv.org/abs/2605.18328

Pith/arXiv arXiv 2026
[14]

In: Mul- 16 S

He, Y., Zhang, W., Deng, J., Cong, Y.: Prior-knowledge-free video frame inter- polation with bidirectional regularized implicit neural representations. In: Mul- 16 S. Tian et al. tiMedia Modeling. pp. 112–126. Springer Nature Switzerland, Cham (2024), https://link.springer.com/chapter/10.1007/978-3-031-53311-2_9

work page doi:10.1007/978-3-031-53311-2_9 2024
[15]

Advances in Neural Information Processing Systems30(2017), https://dl.acm.org/doi/10

Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in Neural Information Processing Systems30(2017), https://dl.acm.org/doi/10. 5555/3295222.3295408

arXiv 2017
[16]

Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network (2015),https://arxiv.org/abs/1503.02531

Pith/arXiv arXiv 2015
[17]

Advances in neural information processing systems33, 6840–6851 (2020),https://arxiv.org/ abs/2006.11239

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Advances in neural information processing systems33, 6840–6851 (2020),https://arxiv.org/ abs/2006.11239

Pith/arXiv arXiv 2020
[18]

Ho, J., Salimans, T.: Classifier-free diffusion guidance (2022),https://arxiv.org/ abs/2207.12598

Pith/arXiv arXiv 2022
[20]

Advances in Neural Information Processing Systems35, 26565–26577 (2022),https://dl.acm.org/doi/10.5555/3600270.3602196

Karras, T., Aittala, M., Aila, T., Laine, S.: Elucidating the design space of diffusion- based generative models. Advances in Neural Information Processing Systems35, 26565–26577 (2022),https://dl.acm.org/doi/10.5555/3600270.3602196

work page doi:10.5555/3600270.3602196 2022
[21]

In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=ymjI8feDTD

Kim, D., Lai, C.H., Liao, W.H., Murata, N., Takida, Y., Uesaka, T., He, Y., Mitsufuji, Y., Ermon, S.: Consistency trajectory models: Learning probability flow ODE trajectory of diffusion. In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=ymjI8feDTD

2024
[22]

In: Thirty- seventh Conference on Neural Information Processing Systems (2023), https: //openreview.net/forum?id=G5RwHpBUv0

Kirstain, Y., Polyak, A., Singer, U., Matiana, S., Penna, J., Levy, O.: Pick-a- pic: An open dataset of user preferences for text-to-image generation. In: Thirty- seventh Conference on Neural Information Processing Systems (2023), https: //openreview.net/forum?id=G5RwHpBUv0

2023
[23]

Labs, B.F.: Flux.https://github.com/black-forest-labs/flux(2024)

2024
[24]

Lai, J., Yu, Z., Tian, S., Lin, Q.: Generalization ability of wide residual networks (2023),https://arxiv.org/abs/2305.18506

arXiv 2023
[25]

Journal of Machine Learning Research25(82), 1–47 (2024),http://jmlr.org/papers/v25/ 23-0866.html

Li, Y., Yu, Z., Chen, G., Lin, Q.: On the eigenvalue decay rates of a class of neural-network related kernel functions defined on general domains. Journal of Machine Learning Research25(82), 1–47 (2024),http://jmlr.org/papers/v25/ 23-0866.html

2024
[26]

Li, Y., Yang, C., Dong, J., Yao, Z., Xu, H., Dong, Z., Zeng, H., An, Z., Tian, Y.: Ammkd: Adaptive multimodal multi-teacher distillation for lightweight vision- language models (2025),https://arxiv.org/abs/2509.00039

arXiv 2025
[27]

2025 , url =

Li, Y., Yang, C., Zeng, H., Dong, Z., An, Z., Xu, Y., Tian, Y., Wu, H.: Frequency- aligned knowledge distillation for lightweight spatiotemporal forecasting. In: Pro- ceedings of the IEEE/CVF International Conference on Computer Vision. pp. 7262–7272 (2025),https://doi.org/10.1109/ICCV51701.2025.00682

work page doi:10.1109/iccv51701.2025.00682 2025
[28]

In: AAAI

Li, Z., Hu, Y., Chen, Z., Huang, Q., Qiu, G., Fu, Z., Liu, M.: Retrack: Evidence- driven dual-stream directional anchor calibration network for composed video retrieval. In: AAAI. vol. 40, pp. 23373–23381 (2026),https://dl.acm.org/doi/ abs/10.1145/3796712

work page doi:10.1145/3796712 2026
[29]

Liang, G., Wang, Z., Hu, J., Zhou, H., Xue, Z., Zhang, J., Xu, D., Yu, Q.: Render- in-the-loop: Vector graphics generation via visual self-feedback (2026), https: //arxiv.org/abs/2604.20730

Pith/arXiv arXiv 2026
[30]

Liang, G., Wang, Z., Wang, C., Hu, J., Zhou, H., Liu, J., Zhang, J., Xu, D., Yu, Q.: Vanim: Rendering-aware sparse state modeling for structure-preserving vector animation (2026),https://arxiv.org/abs/2605.01517 Curvature-Adaptive Consistency Flow Matching 17

Pith/arXiv arXiv 2026
[31]

In: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Lin, J., Zhang, J., Jin, G., Song, W., Liu, T., Lu, G.: 3d plant root skeleton detection and extraction. In: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 3011–3017. IEEE (2025),https://arxiv.org/abs/2508. 08094

2025
[32]

Lin, S., Wang, A., Yang, X.: Sdxl-lightning: Progressive adversarial diffusion distil- lation (2024),https://arxiv.org/abs/2402.13929

Pith/arXiv arXiv 2024
[33]

In: The Eleventh International Conference on Learning Representations (2023),https://openreview.net/forum?id=PqvMRDCJT9t

Lipman, Y., Chen, R.T.Q., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: The Eleventh International Conference on Learning Representations (2023),https://openreview.net/forum?id=PqvMRDCJT9t

2023
[34]

Liu, X., Gong, C., Liu, Q.: Flow straight and fast: Learning to generate and transfer data with rectified flow (2022),https://arxiv.org/abs/2209.03003

Pith/arXiv arXiv 2022
[35]

In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum? id=1k4yZbbDqX

Liu, X., Zhang, X., Ma, J., Peng, J., qiang liu: Instaflow: One step is enough for high-quality diffusion-based text-to-image generation. In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum? id=1k4yZbbDqX

2024
[36]

In: Advances in Neural Information Processing Systems (2022),https://openreview.net/forum? id=2uAaGwlP_V

Lu, C., Zhou, Y., Bao, F., Chen, J., Li, C., Zhu, J.: DPM-solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. In: Advances in Neural Information Processing Systems (2022),https://openreview.net/forum? id=2uAaGwlP_V

2022
[37]

2025 , url =

Lu, Y., Ren, Y., Xia, X., Lin, S., Wang, X., Xiao, X., Ma, A.J., Xie, X., Lai, J.H.: Adversarial distribution matching for diffusion distillation towards efficient image and video synthesis. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 16818–16829 (October 2025),https://doi.org/ 10.1109/ICCV51701.2025.01562

work page doi:10.1109/iccv51701.2025.01562 2025
[38]

Luo, S., Tan, Y., Huang, L., Li, J., Zhao, H.: Latent consistency models: Synthesizing high-resolution images with few-step inference (2023),https://arxiv.org/abs/ 2310.04378

Pith/arXiv arXiv 2023
[39]

Mai, J., Liao, B., Zhao, Z., Zeng, Y., Li, H., Civera, J., Wu, T., Zhou, Y., Liu, P.: Neural predictor-corrector: Solving homotopy problems with reinforcement learning (2026),https://arxiv.org/abs/2602.03086

arXiv 2026
[40]

org/abs/2404.03673

Oertell, O., Chang, J.D., Zhang, Y., Brantley, K., Sun, W.: Rl for consistency models: Faster reward guided text-to-image generation (2024),https://arxiv. org/abs/2404.03673

arXiv 2024
[41]

In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=di52zR8xgf

Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Müller, J., Penna, J., Rombach, R.: SDXL: Improving latent diffusion models for high-resolution image synthesis. In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=di52zR8xgf

2024
[42]

In: Proceedings of the 38th International Conference on Machine Learning

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 8748–8763...

2021
[43]

Rao, J., Liu, X., Deng, H., Lin, Z., Yu, Z., Wei, J., Meng, X., Zhang, M.: Dy- namic sampling that adapts: Self-aware iterative data persistent optimization for mathematical reasoning (2026),https://arxiv.org/abs/2505.16176

Pith/arXiv arXiv 2026
[44]

TechRxiv2025(1220) (2025),https: //www.techrxiv.org/doi/abs/10.36227/techrxiv.176620610.03288677/v1 18 S

Rao, J., Liu, X., Yan, H., Shen, J., Mo, H., Dong, Y., Yan, Z., Wang, Z., Lin, Z., Meng, X., Yu, Z., Deng, L., Wei, J., Wang, Y., Zhang, M.: A data-centric perspective on the lifecycle of large language models. TechRxiv2025(1220) (2025),https: //www.techrxiv.org/doi/abs/10.36227/techrxiv.176620610.03288677/v1 18 S. Tian et al

work page doi:10.36227/techrxiv.176620610.03288677/v1 2025
[45]

In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024), https://openreview.net/forum?id=O5XbOoi0x3

Ren, Y., Xia, X., Lu, Y., Zhang, J., Wu, J., Xie, P., WANG, X., Xiao, X.: Hyper- SD: Trajectory segmented consistency model for efficient image synthesis. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024), https://openreview.net/forum?id=O5XbOoi0x3

2024
[46]

In: International Conference on Learning Representations (2022),https://openreview

Salimans, T., Ho, J.: Progressive distillation for fast sampling of diffusion models. In: International Conference on Learning Representations (2022),https://openreview. net/forum?id=TIdIXIpzhoI

2022
[47]

Sauer, A., Lorenz, D., Blattmann, A., Rombach, R.: Adversarial diffusion distillation (2023),https://arxiv.org/abs/2311.17042

arXiv 2023
[48]

In: Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2022),https://openreview.net/forum?id=M3Y74vmsMcY

Schuhmann, C., Beaumont, R., Vencu, R., Gordon, C., Wightman, R., Cherti, M., Coombes, T., Katta, A., Mullis, C., Wortsman, M., et al.: LAION-5b: An open large-scale dataset for training next generation image-text models. In: Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2022),https://openreview.net/forum?...

2022
[49]

Shao, Z., Wang, P., Zhu, Q., Xu, R., Song, J., Bi, X., Zhang, H., Zhang, M., Li, Y.K., Wu, Y., Guo, D.: Deepseekmath: Pushing the limits of mathematical reasoning in open language models (2024),https://arxiv.org/abs/2402.03300

Pith/arXiv arXiv 2024
[50]

In: An- nual Meeting of the Association for Computational Linguistics (2018), https: //aclanthology.org/P18-1238/

Sharma, P., Ding, N., Goodman, S., Soricut, R.: Conceptual captions: A cleaned, hypernymed, image alt-text dataset for automatic image captioning. In: An- nual Meeting of the Association for Computational Linguistics (2018), https: //aclanthology.org/P18-1238/

2018
[51]

In: Proceedings of the 32nd International Conference on Machine Learning

Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsuper- vised learning using nonequilibrium thermodynamics. In: Proceedings of the 32nd International Conference on Machine Learning. Proceedings of Machine Learn- ing Research, vol. 37, pp. 2256–2265. PMLR, Lille, France (07–09 Jul 2015), https://proceedings.mlr.press/v37/sohl-dickste...

2015
[52]

In: The Twelfth International Conference on Learning Representations (2024),https: //openreview.net/forum?id=WNzy9bRDvG

Song, Y., Dhariwal, P.: Improved techniques for training consistency models. In: The Twelfth International Conference on Learning Representations (2024),https: //openreview.net/forum?id=WNzy9bRDvG

2024
[53]

In: Proceedings of the 40th International Conference on Machine Learning

Song, Y., Dhariwal, P., Chen, M., Sutskever, I.: Consistency models. In: Proceedings of the 40th International Conference on Machine Learning. pp. 32211–32252 (2023), https://arxiv.org/abs/2303.01469

Pith/arXiv arXiv 2023
[54]

Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., Poole, B.: Score- basedgenerativemodelingthroughstochasticdifferentialequations.In:International Conference on Learning Representations (2021),https://arxiv.org/abs/2011. 13456

2021
[55]

Machine learning3(1), 9–44 (1988),https://doi.org/10.1023/A:1022633531479

Sutton, R.S.: Learning to predict by the methods of temporal differences. Machine learning3(1), 9–44 (1988),https://doi.org/10.1023/A:1022633531479

work page doi:10.1023/a:1022633531479 1988
[56]

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need (2023),https://arxiv.org/abs/1706. 03762

2023
[57]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Wallace, B., Dang, M., Rafailov, R., Zhou, L., Lou, A., Purushwalkam, S., Ermon, S., Xiong, C., Joty, S., Naik, N.: Diffusion model alignment using direct preference optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 8228–8238 (June 2024),https://arxiv. org/abs/2311.12908

arXiv 2024
[58]

In: Proceedings of the AAAI Conference on Artificial Intelligence

Wang, C., Guo, Z., Duan, Y., Li, H., Chen, N., Tang, X., Hu, Y.: Target-driven distillation: Consistency distillation with target timestep selection and decoupled guidance. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 39, pp. 7619–7627 (2025),https://arxiv.org/abs/2409.01347 Curvature-Adaptive Consistency Flow Matching 19

arXiv 2025
[59]

Advances in neural information processing systems37, 83951–84009 (2024),https://arxiv.org/abs/ 2405.18407

Wang, F.Y., Huang, Z., Bergman, A., Shen, D., Gao, P., Lingelbach, M., Sun, K., Bian, W., Song, G., Liu, Y., et al.: Phased consistency models. Advances in neural information processing systems37, 83951–84009 (2024),https://arxiv.org/abs/ 2405.18407

arXiv 2024
[60]

doi: 10.1007/BF00992698

Watkins, C.J.C.H., Dayan, P.: Q-learning. Machine Learning8(3–4), 279–292 (1992), https://doi.org/10.1007/BF00992698

work page doi:10.1007/bf00992698 1992
[61]

Wu, W., Li, Y., Chen, G., Wang, L., Chen, H.: Tool-augmented policy optimization: Synergizing reasoning and adaptive tool use with reinforcement learning (2026), https://arxiv.org/abs/2510.07038

arXiv 2026
[62]

Wu, X., Hao, Y., Sun, K., Chen, Y., Zhu, F., Zhao, R., Li, H.: Human preference score v2: A solid benchmark for evaluating human preferences of text-to-image synthesis (2023),https://arxiv.org/abs/2306.09341

Pith/arXiv arXiv 2023
[63]

In: Proceedings of the AAAI Conference on Artificial Intelligence

Xiao, C., Dou, J., Lin, Z., Ke, Z., Hou, L.: From points to coalitions: Hierarchical contrastive shapley values for prioritizing data samples. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 40, pp. 15995–16003 (2026),https: //ojs.aaai.org/index.php/AAAI/article/view/38633

2026
[64]

Xu, J., Liu, X., Wu, Y., Tong, Y., Li, Q., Ding, M., Tang, J., Dong, Y.: Imagereward: Learning and evaluating human preferences for text-to-image generation (2023), https://arxiv.org/abs/2304.05977

arXiv 2023
[65]

In: Advances in Neural Information Processing Systems

Yao,J.,Li,C.,Xiao,C.:Swiftsampler:Efficientlearningofsamplerby10parameters. In: Advances in Neural Information Processing Systems. vol. 37, pp. 59030–59053. Curran Associates, Inc. (2024),https://arxiv.org/abs/2410.05578

arXiv 2024
[66]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Yin, T., Gharbi, M., Zhang, R., Shechtman, E., Durand, F., Freeman, W.T., Park, T.: One-step diffusion with distribution matching distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6613–6623 (2024),https://arxiv.org/abs/2311.18828

arXiv 2024
[67]

In: 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Yu, B., Liu, D., Shi, H., Chang, G., Wei, J., Sun, L., Tian, S., Bu, L.: Sam- wav2lip++: Enhancing behavioral realism in synthetic agents through audio-driven speech and action refinement. In: 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC). pp. 2999–3006 (2024),https://ieeexplore.ieee. org/abstract/document/10832087

arXiv 2024
[68]

Yu, Z., Chen, G., Lai, J., Li, B., Tian, S.: Branch scaling manifests as implicit ar- chitectural regularization for improving generalization in overparameterized resnets (2026),https://arxiv.org/abs/2403.04545

Pith/arXiv arXiv 2026
[69]

Yu, Z., Rao, J., Chen, G., Tian, S., Li, B., Wei, J., Zhang, M., Meng, X.: Mathagent: Adversarial evolution of constraint graphs for mathematical reasoning data synthesis (2026),https://arxiv.org/abs/2604.11188

Pith/arXiv arXiv 2026
[70]

In: The Thirteenth International Conference on Learning Representations (2025),https://openreview.net/forum?id=VEJzjAvaIy

Yu, Z., Tian, S., Chen, G.: Divergence of neural tangent kernel in classification problems. In: The Thirteenth International Conference on Learning Representations (2025),https://openreview.net/forum?id=VEJzjAvaIy

2025
[71]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Zhang, J., Lu, G.: Vision-language embodiment for monocular depth estimation. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 29479–29489 (2025),https://arxiv.org/abs/2503.16535

arXiv 2025
[72]

Zhao, Z., Yang, H., Liao, B., Zeng, Y., Yan, S., Gu, Y., Liu, P., Zhou, Y., Li, H., Civera, J.: Advances in global solvers for 3d vision (2026),https://arxiv.org/ abs/2602.14662

arXiv 2026
[73]

Tian et al

Zheng, J., Hu, M., Fan, Z., Wang, C., Ding, C., Tao, D., Cham, T.J.: Trajectory consistency distillation: Improved latent consistency distillation by semi-linear consistency function with trajectory mapping (2024),https://arxiv.org/abs/ 2402.19159 20 S. Tian et al. Curvature-Adaptive Consistency Flow Matching: Autonomous Trajectory Optimization via Reinfo...

arXiv 2024

[1] [1]

Albergo, M.S., Vanden-Eijnden, E.: Building normalizing flows with stochastic interpolants.In:The Eleventh InternationalConference onLearning Representations (2023),https://openreview.net/forum?id=li7qeBbCR1t

2023

[2] [2]

Random Forests,

Auer, P., Cesa-Bianchi, N., Fischer, P.: Finite-time analysis of the multiarmed bandit problem. Machine learning47(2), 235–256 (2002),https://doi.org/10.1023/A: 1013689704352

work page doi:10.1023/a: 2002

[3] [3]

In: Advances in Neural Information Processing Systems

Brown, T., Mann, B., Ryder, N., Subbiah, M., Kaplan, J.D., Dhariwal, P., Nee- lakantan, A., Shyam, P., Sastry, G., Askell, A., et al.: Language models are few-shot learners. In: Advances in Neural Information Processing Systems. vol. 33, pp. 1877–1901 (2020),https://arxiv.org/abs/2005.14165

Pith/arXiv arXiv 1901

[4] [4]

In: Conference on Computer Vision and Pattern Recognition

Changpinyo, S., Sharma, P., Ding, N., Soricut, R.: Conceptual 12m: Pushing web- scale image-text pre-training to recognize long-tail visual concepts. In: Conference on Computer Vision and Pattern Recognition. pp. 3558–3568 (2021),https:// ieeexplore.ieee.org/document/9578388

arXiv 2021

[5] [5]

In: The Thirty-eighth Annual Conference on Neural Informa- tion Processing Systems (2024),https://openreview.net/forum?id=ni3Ud2BV3G

Chen, G., Li, Y., Lin, Q.: On the impacts of the random initialization in the neural tangent kernel theory. In: The Thirty-eighth Annual Conference on Neural Informa- tion Processing Systems (2024),https://openreview.net/forum?id=ni3Ud2BV3G

2024

[6] [6]

In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)

Chen, J., Xue, S., Zhao, Y., Yu, J., Paul, S., Chen, J., Cai, H., Han, S., Xie, E.: Sana-sprint: One-step diffusion with continuous-time consistency distillation. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 16185–16195 (October 2025),https://doi.org/10.1109/ICCV51701. 2025.01502

work page doi:10.1109/iccv51701 2025

[7] [7]

In: International Conference on Learning Representations (2021),https://openreview

Dosovitskiy, A., Beyer, L., Kolesnikov, A., Weissenborn, D., Zhai, X., Unterthiner, T., Dehghani, M., Minderer, M., Heigold, G., Gelly, S., Uszkoreit, J., Houlsby, N.: An image is worth 16x16 words: Transformers for image recognition at scale. In: International Conference on Learning Representations (2021),https://openreview. net/forum?id=YicbFdNTTy

2021

[8] [8]

In: Forty- first International Conference on Machine Learning (2024),https://openreview

Esser, P., Kulal, S., Blattmann, A., Entezari, R., Müller, J., Saini, H., Levi, Y., Lorenz, D., Sauer, A., Boesel, F., Podell, D., Dockhorn, T., English, Z., Rombach, R.: Scaling rectified flow transformers for high-resolution image synthesis. In: Forty- first International Conference on Machine Learning (2024),https://openreview. net/forum?id=FPnUhsQJ5B

2024

[9] [9]

Fan, Y., Lee, K.: Optimizing ddpm sampling with shortcut fine-tuning (2024), https://arxiv.org/abs/2301.13362

arXiv 2024

[10] [10]

Fan, Y., Watkins, O., Du, Y., Liu, H., Ryu, M., Boutilier, C., Abbeel, P., Ghavamzadeh, M., Lee, K., Lee, K.: Dpok: Reinforcement learning for fine-tuning text-to-image diffusion models (2023),https://arxiv.org/abs/2305.16381

arXiv 2023

[11] [11]

Generative adversarial networks.Commun

Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., Bengio, Y.: Generative adversarial networks. Communications of the ACM63(11), 139–144 (2020),https://doi.org/10.1145/3422622

work page doi:10.1145/3422622 2020

[12] [12]

He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition (2015),https://arxiv.org/abs/1512.03385

Pith/arXiv arXiv 2015

[13] [13]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings

He, Y., Zhang, C., Chen, F., Cao, J.: Cinematte: Background matting for virtual production and beyond. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Findings. pp. 8725–8735 (June 2026), https://arxiv.org/abs/2605.18328

Pith/arXiv arXiv 2026

[14] [14]

In: Mul- 16 S

He, Y., Zhang, W., Deng, J., Cong, Y.: Prior-knowledge-free video frame inter- polation with bidirectional regularized implicit neural representations. In: Mul- 16 S. Tian et al. tiMedia Modeling. pp. 112–126. Springer Nature Switzerland, Cham (2024), https://link.springer.com/chapter/10.1007/978-3-031-53311-2_9

work page doi:10.1007/978-3-031-53311-2_9 2024

[15] [15]

Advances in Neural Information Processing Systems30(2017), https://dl.acm.org/doi/10

Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: Gans trained by a two time-scale update rule converge to a local nash equilibrium. Advances in Neural Information Processing Systems30(2017), https://dl.acm.org/doi/10. 5555/3295222.3295408

arXiv 2017

[16] [16]

Hinton, G., Vinyals, O., Dean, J.: Distilling the knowledge in a neural network (2015),https://arxiv.org/abs/1503.02531

Pith/arXiv arXiv 2015

[17] [17]

Advances in neural information processing systems33, 6840–6851 (2020),https://arxiv.org/ abs/2006.11239

Ho, J., Jain, A., Abbeel, P.: Denoising diffusion probabilistic models. Advances in neural information processing systems33, 6840–6851 (2020),https://arxiv.org/ abs/2006.11239

Pith/arXiv arXiv 2020

[18] [18]

Ho, J., Salimans, T.: Classifier-free diffusion guidance (2022),https://arxiv.org/ abs/2207.12598

Pith/arXiv arXiv 2022

[19] [20]

Advances in Neural Information Processing Systems35, 26565–26577 (2022),https://dl.acm.org/doi/10.5555/3600270.3602196

Karras, T., Aittala, M., Aila, T., Laine, S.: Elucidating the design space of diffusion- based generative models. Advances in Neural Information Processing Systems35, 26565–26577 (2022),https://dl.acm.org/doi/10.5555/3600270.3602196

work page doi:10.5555/3600270.3602196 2022

[20] [21]

In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=ymjI8feDTD

Kim, D., Lai, C.H., Liao, W.H., Murata, N., Takida, Y., Uesaka, T., He, Y., Mitsufuji, Y., Ermon, S.: Consistency trajectory models: Learning probability flow ODE trajectory of diffusion. In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=ymjI8feDTD

2024

[21] [22]

In: Thirty- seventh Conference on Neural Information Processing Systems (2023), https: //openreview.net/forum?id=G5RwHpBUv0

Kirstain, Y., Polyak, A., Singer, U., Matiana, S., Penna, J., Levy, O.: Pick-a- pic: An open dataset of user preferences for text-to-image generation. In: Thirty- seventh Conference on Neural Information Processing Systems (2023), https: //openreview.net/forum?id=G5RwHpBUv0

2023

[22] [23]

Labs, B.F.: Flux.https://github.com/black-forest-labs/flux(2024)

2024

[23] [24]

Lai, J., Yu, Z., Tian, S., Lin, Q.: Generalization ability of wide residual networks (2023),https://arxiv.org/abs/2305.18506

arXiv 2023

[24] [25]

Journal of Machine Learning Research25(82), 1–47 (2024),http://jmlr.org/papers/v25/ 23-0866.html

Li, Y., Yu, Z., Chen, G., Lin, Q.: On the eigenvalue decay rates of a class of neural-network related kernel functions defined on general domains. Journal of Machine Learning Research25(82), 1–47 (2024),http://jmlr.org/papers/v25/ 23-0866.html

2024

[25] [26]

Li, Y., Yang, C., Dong, J., Yao, Z., Xu, H., Dong, Z., Zeng, H., An, Z., Tian, Y.: Ammkd: Adaptive multimodal multi-teacher distillation for lightweight vision- language models (2025),https://arxiv.org/abs/2509.00039

arXiv 2025

[26] [27]

2025 , url =

Li, Y., Yang, C., Zeng, H., Dong, Z., An, Z., Xu, Y., Tian, Y., Wu, H.: Frequency- aligned knowledge distillation for lightweight spatiotemporal forecasting. In: Pro- ceedings of the IEEE/CVF International Conference on Computer Vision. pp. 7262–7272 (2025),https://doi.org/10.1109/ICCV51701.2025.00682

work page doi:10.1109/iccv51701.2025.00682 2025

[27] [28]

In: AAAI

Li, Z., Hu, Y., Chen, Z., Huang, Q., Qiu, G., Fu, Z., Liu, M.: Retrack: Evidence- driven dual-stream directional anchor calibration network for composed video retrieval. In: AAAI. vol. 40, pp. 23373–23381 (2026),https://dl.acm.org/doi/ abs/10.1145/3796712

work page doi:10.1145/3796712 2026

[28] [29]

Liang, G., Wang, Z., Hu, J., Zhou, H., Xue, Z., Zhang, J., Xu, D., Yu, Q.: Render- in-the-loop: Vector graphics generation via visual self-feedback (2026), https: //arxiv.org/abs/2604.20730

Pith/arXiv arXiv 2026

[29] [30]

Liang, G., Wang, Z., Wang, C., Hu, J., Zhou, H., Liu, J., Zhang, J., Xu, D., Yu, Q.: Vanim: Rendering-aware sparse state modeling for structure-preserving vector animation (2026),https://arxiv.org/abs/2605.01517 Curvature-Adaptive Consistency Flow Matching 17

Pith/arXiv arXiv 2026

[30] [31]

In: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Lin, J., Zhang, J., Jin, G., Song, W., Liu, T., Lu, G.: 3d plant root skeleton detection and extraction. In: 2025 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS). pp. 3011–3017. IEEE (2025),https://arxiv.org/abs/2508. 08094

2025

[31] [32]

Lin, S., Wang, A., Yang, X.: Sdxl-lightning: Progressive adversarial diffusion distil- lation (2024),https://arxiv.org/abs/2402.13929

Pith/arXiv arXiv 2024

[32] [33]

In: The Eleventh International Conference on Learning Representations (2023),https://openreview.net/forum?id=PqvMRDCJT9t

Lipman, Y., Chen, R.T.Q., Ben-Hamu, H., Nickel, M., Le, M.: Flow matching for generative modeling. In: The Eleventh International Conference on Learning Representations (2023),https://openreview.net/forum?id=PqvMRDCJT9t

2023

[33] [34]

Liu, X., Gong, C., Liu, Q.: Flow straight and fast: Learning to generate and transfer data with rectified flow (2022),https://arxiv.org/abs/2209.03003

Pith/arXiv arXiv 2022

[34] [35]

In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum? id=1k4yZbbDqX

Liu, X., Zhang, X., Ma, J., Peng, J., qiang liu: Instaflow: One step is enough for high-quality diffusion-based text-to-image generation. In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum? id=1k4yZbbDqX

2024

[35] [36]

In: Advances in Neural Information Processing Systems (2022),https://openreview.net/forum? id=2uAaGwlP_V

Lu, C., Zhou, Y., Bao, F., Chen, J., Li, C., Zhu, J.: DPM-solver: A fast ODE solver for diffusion probabilistic model sampling in around 10 steps. In: Advances in Neural Information Processing Systems (2022),https://openreview.net/forum? id=2uAaGwlP_V

2022

[36] [37]

2025 , url =

Lu, Y., Ren, Y., Xia, X., Lin, S., Wang, X., Xiao, X., Ma, A.J., Xie, X., Lai, J.H.: Adversarial distribution matching for diffusion distillation towards efficient image and video synthesis. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). pp. 16818–16829 (October 2025),https://doi.org/ 10.1109/ICCV51701.2025.01562

work page doi:10.1109/iccv51701.2025.01562 2025

[37] [38]

Luo, S., Tan, Y., Huang, L., Li, J., Zhao, H.: Latent consistency models: Synthesizing high-resolution images with few-step inference (2023),https://arxiv.org/abs/ 2310.04378

Pith/arXiv arXiv 2023

[38] [39]

Mai, J., Liao, B., Zhao, Z., Zeng, Y., Li, H., Civera, J., Wu, T., Zhou, Y., Liu, P.: Neural predictor-corrector: Solving homotopy problems with reinforcement learning (2026),https://arxiv.org/abs/2602.03086

arXiv 2026

[39] [40]

org/abs/2404.03673

Oertell, O., Chang, J.D., Zhang, Y., Brantley, K., Sun, W.: Rl for consistency models: Faster reward guided text-to-image generation (2024),https://arxiv. org/abs/2404.03673

arXiv 2024

[40] [41]

In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=di52zR8xgf

Podell, D., English, Z., Lacey, K., Blattmann, A., Dockhorn, T., Müller, J., Penna, J., Rombach, R.: SDXL: Improving latent diffusion models for high-resolution image synthesis. In: The Twelfth International Conference on Learning Representations (2024),https://openreview.net/forum?id=di52zR8xgf

2024

[41] [42]

In: Proceedings of the 38th International Conference on Machine Learning

Radford, A., Kim, J.W., Hallacy, C., Ramesh, A., Goh, G., Agarwal, S., Sastry, G., Askell, A., Mishkin, P., Clark, J., Krueger, G., Sutskever, I.: Learning transferable visual models from natural language supervision. In: Proceedings of the 38th International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 139, pp. 8748–8763...

2021

[42] [43]

Rao, J., Liu, X., Deng, H., Lin, Z., Yu, Z., Wei, J., Meng, X., Zhang, M.: Dy- namic sampling that adapts: Self-aware iterative data persistent optimization for mathematical reasoning (2026),https://arxiv.org/abs/2505.16176

Pith/arXiv arXiv 2026

[43] [44]

TechRxiv2025(1220) (2025),https: //www.techrxiv.org/doi/abs/10.36227/techrxiv.176620610.03288677/v1 18 S

Rao, J., Liu, X., Yan, H., Shen, J., Mo, H., Dong, Y., Yan, Z., Wang, Z., Lin, Z., Meng, X., Yu, Z., Deng, L., Wei, J., Wang, Y., Zhang, M.: A data-centric perspective on the lifecycle of large language models. TechRxiv2025(1220) (2025),https: //www.techrxiv.org/doi/abs/10.36227/techrxiv.176620610.03288677/v1 18 S. Tian et al

work page doi:10.36227/techrxiv.176620610.03288677/v1 2025

[44] [45]

In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024), https://openreview.net/forum?id=O5XbOoi0x3

Ren, Y., Xia, X., Lu, Y., Zhang, J., Wu, J., Xie, P., WANG, X., Xiao, X.: Hyper- SD: Trajectory segmented consistency model for efficient image synthesis. In: The Thirty-eighth Annual Conference on Neural Information Processing Systems (2024), https://openreview.net/forum?id=O5XbOoi0x3

2024

[45] [46]

In: International Conference on Learning Representations (2022),https://openreview

Salimans, T., Ho, J.: Progressive distillation for fast sampling of diffusion models. In: International Conference on Learning Representations (2022),https://openreview. net/forum?id=TIdIXIpzhoI

2022

[46] [47]

Sauer, A., Lorenz, D., Blattmann, A., Rombach, R.: Adversarial diffusion distillation (2023),https://arxiv.org/abs/2311.17042

arXiv 2023

[47] [48]

In: Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2022),https://openreview.net/forum?id=M3Y74vmsMcY

Schuhmann, C., Beaumont, R., Vencu, R., Gordon, C., Wightman, R., Cherti, M., Coombes, T., Katta, A., Mullis, C., Wortsman, M., et al.: LAION-5b: An open large-scale dataset for training next generation image-text models. In: Thirty-sixth Conference on Neural Information Processing Systems Datasets and Benchmarks Track (2022),https://openreview.net/forum?...

2022

[48] [49]

Shao, Z., Wang, P., Zhu, Q., Xu, R., Song, J., Bi, X., Zhang, H., Zhang, M., Li, Y.K., Wu, Y., Guo, D.: Deepseekmath: Pushing the limits of mathematical reasoning in open language models (2024),https://arxiv.org/abs/2402.03300

Pith/arXiv arXiv 2024

[49] [50]

In: An- nual Meeting of the Association for Computational Linguistics (2018), https: //aclanthology.org/P18-1238/

Sharma, P., Ding, N., Goodman, S., Soricut, R.: Conceptual captions: A cleaned, hypernymed, image alt-text dataset for automatic image captioning. In: An- nual Meeting of the Association for Computational Linguistics (2018), https: //aclanthology.org/P18-1238/

2018

[50] [51]

In: Proceedings of the 32nd International Conference on Machine Learning

Sohl-Dickstein, J., Weiss, E., Maheswaranathan, N., Ganguli, S.: Deep unsuper- vised learning using nonequilibrium thermodynamics. In: Proceedings of the 32nd International Conference on Machine Learning. Proceedings of Machine Learn- ing Research, vol. 37, pp. 2256–2265. PMLR, Lille, France (07–09 Jul 2015), https://proceedings.mlr.press/v37/sohl-dickste...

2015

[51] [52]

In: The Twelfth International Conference on Learning Representations (2024),https: //openreview.net/forum?id=WNzy9bRDvG

Song, Y., Dhariwal, P.: Improved techniques for training consistency models. In: The Twelfth International Conference on Learning Representations (2024),https: //openreview.net/forum?id=WNzy9bRDvG

2024

[52] [53]

In: Proceedings of the 40th International Conference on Machine Learning

Song, Y., Dhariwal, P., Chen, M., Sutskever, I.: Consistency models. In: Proceedings of the 40th International Conference on Machine Learning. pp. 32211–32252 (2023), https://arxiv.org/abs/2303.01469

Pith/arXiv arXiv 2023

[53] [54]

Song, Y., Sohl-Dickstein, J., Kingma, D.P., Kumar, A., Ermon, S., Poole, B.: Score- basedgenerativemodelingthroughstochasticdifferentialequations.In:International Conference on Learning Representations (2021),https://arxiv.org/abs/2011. 13456

2021

[54] [55]

Machine learning3(1), 9–44 (1988),https://doi.org/10.1023/A:1022633531479

Sutton, R.S.: Learning to predict by the methods of temporal differences. Machine learning3(1), 9–44 (1988),https://doi.org/10.1023/A:1022633531479

work page doi:10.1023/a:1022633531479 1988

[55] [56]

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A.N., Kaiser, L., Polosukhin, I.: Attention is all you need (2023),https://arxiv.org/abs/1706. 03762

2023

[56] [57]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Wallace, B., Dang, M., Rafailov, R., Zhou, L., Lou, A., Purushwalkam, S., Ermon, S., Xiong, C., Joty, S., Naik, N.: Diffusion model alignment using direct preference optimization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). pp. 8228–8238 (June 2024),https://arxiv. org/abs/2311.12908

arXiv 2024

[57] [58]

In: Proceedings of the AAAI Conference on Artificial Intelligence

Wang, C., Guo, Z., Duan, Y., Li, H., Chen, N., Tang, X., Hu, Y.: Target-driven distillation: Consistency distillation with target timestep selection and decoupled guidance. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 39, pp. 7619–7627 (2025),https://arxiv.org/abs/2409.01347 Curvature-Adaptive Consistency Flow Matching 19

arXiv 2025

[58] [59]

Advances in neural information processing systems37, 83951–84009 (2024),https://arxiv.org/abs/ 2405.18407

Wang, F.Y., Huang, Z., Bergman, A., Shen, D., Gao, P., Lingelbach, M., Sun, K., Bian, W., Song, G., Liu, Y., et al.: Phased consistency models. Advances in neural information processing systems37, 83951–84009 (2024),https://arxiv.org/abs/ 2405.18407

arXiv 2024

[59] [60]

doi: 10.1007/BF00992698

Watkins, C.J.C.H., Dayan, P.: Q-learning. Machine Learning8(3–4), 279–292 (1992), https://doi.org/10.1007/BF00992698

work page doi:10.1007/bf00992698 1992

[60] [61]

Wu, W., Li, Y., Chen, G., Wang, L., Chen, H.: Tool-augmented policy optimization: Synergizing reasoning and adaptive tool use with reinforcement learning (2026), https://arxiv.org/abs/2510.07038

arXiv 2026

[61] [62]

Wu, X., Hao, Y., Sun, K., Chen, Y., Zhu, F., Zhao, R., Li, H.: Human preference score v2: A solid benchmark for evaluating human preferences of text-to-image synthesis (2023),https://arxiv.org/abs/2306.09341

Pith/arXiv arXiv 2023

[62] [63]

In: Proceedings of the AAAI Conference on Artificial Intelligence

Xiao, C., Dou, J., Lin, Z., Ke, Z., Hou, L.: From points to coalitions: Hierarchical contrastive shapley values for prioritizing data samples. In: Proceedings of the AAAI Conference on Artificial Intelligence. vol. 40, pp. 15995–16003 (2026),https: //ojs.aaai.org/index.php/AAAI/article/view/38633

2026

[63] [64]

Xu, J., Liu, X., Wu, Y., Tong, Y., Li, Q., Ding, M., Tang, J., Dong, Y.: Imagereward: Learning and evaluating human preferences for text-to-image generation (2023), https://arxiv.org/abs/2304.05977

arXiv 2023

[64] [65]

In: Advances in Neural Information Processing Systems

Yao,J.,Li,C.,Xiao,C.:Swiftsampler:Efficientlearningofsamplerby10parameters. In: Advances in Neural Information Processing Systems. vol. 37, pp. 59030–59053. Curran Associates, Inc. (2024),https://arxiv.org/abs/2410.05578

arXiv 2024

[65] [66]

In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition

Yin, T., Gharbi, M., Zhang, R., Shechtman, E., Durand, F., Freeman, W.T., Park, T.: One-step diffusion with distribution matching distillation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. pp. 6613–6623 (2024),https://arxiv.org/abs/2311.18828

arXiv 2024

[66] [67]

In: 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Yu, B., Liu, D., Shi, H., Chang, G., Wei, J., Sun, L., Tian, S., Bu, L.: Sam- wav2lip++: Enhancing behavioral realism in synthetic agents through audio-driven speech and action refinement. In: 2024 IEEE International Conference on Systems, Man, and Cybernetics (SMC). pp. 2999–3006 (2024),https://ieeexplore.ieee. org/abstract/document/10832087

arXiv 2024

[67] [68]

Yu, Z., Chen, G., Lai, J., Li, B., Tian, S.: Branch scaling manifests as implicit ar- chitectural regularization for improving generalization in overparameterized resnets (2026),https://arxiv.org/abs/2403.04545

Pith/arXiv arXiv 2026

[68] [69]

Yu, Z., Rao, J., Chen, G., Tian, S., Li, B., Wei, J., Zhang, M., Meng, X.: Mathagent: Adversarial evolution of constraint graphs for mathematical reasoning data synthesis (2026),https://arxiv.org/abs/2604.11188

Pith/arXiv arXiv 2026

[69] [70]

In: The Thirteenth International Conference on Learning Representations (2025),https://openreview.net/forum?id=VEJzjAvaIy

Yu, Z., Tian, S., Chen, G.: Divergence of neural tangent kernel in classification problems. In: The Thirteenth International Conference on Learning Representations (2025),https://openreview.net/forum?id=VEJzjAvaIy

2025

[70] [71]

In: Proceedings of the Computer Vision and Pattern Recognition Conference

Zhang, J., Lu, G.: Vision-language embodiment for monocular depth estimation. In: Proceedings of the Computer Vision and Pattern Recognition Conference. pp. 29479–29489 (2025),https://arxiv.org/abs/2503.16535

arXiv 2025

[71] [72]

Zhao, Z., Yang, H., Liao, B., Zeng, Y., Yan, S., Gu, Y., Liu, P., Zhou, Y., Li, H., Civera, J.: Advances in global solvers for 3d vision (2026),https://arxiv.org/ abs/2602.14662

arXiv 2026

[72] [73]

Tian et al

Zheng, J., Hu, M., Fan, Z., Wang, C., Ding, C., Tao, D., Cham, T.J.: Trajectory consistency distillation: Improved latent consistency distillation by semi-linear consistency function with trajectory mapping (2024),https://arxiv.org/abs/ 2402.19159 20 S. Tian et al. Curvature-Adaptive Consistency Flow Matching: Autonomous Trajectory Optimization via Reinfo...

arXiv 2024