arXiv: 2604.13685 · v1 · submitted 2026-04-15 · 💻 cs.HC · cs.LG


EMGFlow: Robust and Efficient Surface Electromyography Synthesis via Flow Matching

Boxuan Jiang, Chenyun Dai, Can Han


Pith reviewed 2026-05-10 12:46 UTC · model grok-4.3


keywords: surface electromyography, flow matching, synthetic data generation, gesture recognition, data augmentation, generative modeling, myoelectric control


The pith

Flow matching generates synthetic sEMG signals that improve gesture recognition more reliably than GANs or diffusion models.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents EMGFlow, a conditional generative framework that applies flow matching to create realistic surface electromyography (sEMG) recordings for training deep learning models on hand gestures. It reports that continuous-time flow matching yields higher feature fidelity and a closer distributional match to real signals than GAN baselines, while delivering stronger downstream recognition accuracy under train-on-synthetic, test-on-real (TSTR) evaluation. The work also shows that targeted solver choices and time sampling improve the speed-quality balance over diffusion approaches. A reader would care because data scarcity and subject variability currently limit reliable myoelectric control systems; better synthetic data could reduce the need for extensive real recordings from many individuals.

Core claim

EMGFlow is, to the authors' knowledge, the first application of flow matching to sEMG synthesis. The model learns continuous vector fields that transport a simple noise distribution into conditional distributions of real sEMG time series. Across three public benchmark datasets, the generated signals match real data more closely in both hand-crafted features and geometric distribution measures than conventional augmentation or GAN methods, and models trained solely on the synthetic data achieve higher recognition accuracy on held-out real recordings than models trained on diffusion-generated data.

What carries the argument

Flow matching, a continuous-time generative process that trains a neural network to predict the vector field directing probability mass from noise to data along an ordinary differential equation path.
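To make the mechanism concrete, here is a minimal sketch of one common form of the flow matching regression target, assuming a straight-line path between a noise sample and a data sample. The linear path, the function names (`linear_path_sample`, `target_velocity`, `fm_loss`), the toy four-sample "window", and the placeholder model are all illustrative assumptions; this summary does not state which probability path or network EMGFlow actually uses.

```python
import random

def linear_path_sample(x0, x1, t):
    """Point at time t on the straight line from noise x0 (t=0) to data x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Velocity of the straight-line path; it does not depend on t."""
    return [b - a for a, b in zip(x0, x1)]

def fm_loss(predict_velocity, x0, x1, t, label):
    """Mean squared error between predicted and target velocity at (x_t, t)."""
    x_t = linear_path_sample(x0, x1, t)
    v_hat = predict_velocity(x_t, t, label)   # conditional on the gesture label
    v = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(v_hat, v)) / len(v)

# Toy usage: a hypothetical 4-sample window and a placeholder "network".
rng = random.Random(0)
x1 = [0.2, -0.1, 0.4, 0.0]                    # stand-in for a real sEMG window
x0 = [rng.gauss(0.0, 1.0) for _ in x1]        # Gaussian noise sample
t = rng.random()                              # training time in [0, 1)
zero_model = lambda x_t, t, label: [0.0] * len(x_t)
loss = fm_loss(zero_model, x0, x1, t, label=3)
```

In training, this loss would be averaged over many (noise, data, time) triples and minimized with respect to the network's parameters; at generation time the learned velocity field is integrated from noise toward data with an ODE solver.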

If this is right

Synthetic sEMG produced by EMGFlow can augment limited real datasets and raise recognition accuracy under the TSTR protocol.
The method supplies a more stable training alternative to GANs for biosignal generation.
Optimized numerical integration and sampling schedules reduce the compute cost of producing usable synthetic signals.
Flow matching becomes a practical option for addressing data bottlenecks in other myoelectric applications.
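The TSTR idea in the first point can be sketched in a few lines: fit a classifier on synthetic windows only, then score it on held-out real windows only. The nearest-centroid classifier, the two-dimensional toy features, and every name below are assumptions made for illustration; the paper's own classifiers and features are not described in this summary.

```python
def fit_centroids(features, labels):
    """Nearest-centroid 'classifier': mean feature vector per class."""
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the class whose centroid is closest to x (squared Euclidean)."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda y: sq_dist(centroids[y]))

def tstr_accuracy(synth_x, synth_y, real_x, real_y):
    """Train on synthetic data only, test on held-out real data only."""
    model = fit_centroids(synth_x, synth_y)
    hits = sum(predict(model, x) == y for x, y in zip(real_x, real_y))
    return hits / len(real_y)

# Toy data: hypothetical 2-D features for two gesture classes.
synth_x = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
synth_y = [0, 0, 1, 1]
real_x = [[0.1, 0.0], [1.1, 1.0], [0.8, 0.9], [0.4, 0.3]]
real_y = [0, 1, 1, 0]
acc = tstr_accuracy(synth_x, synth_y, real_x, real_y)
```

The key design point is that no real sample ever reaches `fit_centroids`, so the score reflects how much class structure the generator captured rather than how well the classifier memorizes real data.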

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same flow-matching recipe could be tested on other physiological time series such as EEG or ECG where data scarcity is also common.
On-device generation of fresh synthetic examples might become feasible if the efficiency gains scale with smaller models.
Adaptation experiments could check whether a single EMGFlow model can be fine-tuned quickly for new gesture vocabularies without collecting fresh real data.

Load-bearing premise

That results from the unified protocol on three fixed benchmark datasets will hold when the synthetic data are used with new subjects, different electrode hardware, or real-time myoelectric control loops.

What would settle it

An experiment that trains a gesture classifier on EMGFlow synthetic data from one set of subjects and measures whether its accuracy on recordings from entirely new subjects and a different sEMG acquisition system equals or exceeds the accuracy obtained from training on real data from the same new subjects.

Figures

Figures reproduced from arXiv: 2604.13685 by Boxuan Jiang, Can Han, Chenyun Dai.

**Figure 1.** Overview of the proposed EMGFlow pipeline. The framework consists of four stages: (a) sEMG data acquisition and sliding-window …

**Figure 2.** Single-column comparison of feature-based fidelity metrics across datasets. Each panel reports one metric, and the asterisk marks the …

**Figure 3.** Summary of guidance effects on DB7 using EMGHandNet. (a) Feature-based fidelity metrics. (b) TSTR utility. (c) Augmentation utility. (d) PRDC metrics. (e) Neighborhood-based realism diagnostics, with the train-test gap shown on the right axis. (f) Prototype-concentration diagnostics. Stronger guidance improves class-discriminative and local-realism metrics, but reduces coverage and downstream utility. Orig…

**Figure 4.** t-SNE visualizations of real (blue) and generated (red) samples for one subject from (a) DB4, (b) DB7, and (c) DB2, where di…

**Figure 5.** Examples of real (left) and generated (right) multi-channel sEMG windows from (a) DB7 and (b) DB4. The generated signals re…

**Figure 6.** FID trajectories of the diffusion baseline and EMGFlow during training on DB4 and DB7. EMGFlow reduces FID more rapidly in the early stage and reaches a stable low-FID regime with fewer training steps on both datasets. This trend suggests that Flow Matching is not only efficient at inference time, but also easier to optimize under the present training setup. We next compare three ODE solvers within FM itse…

**Figure 7.** Comparison of Euler, Heun, and RK4 within EMGFlow under matched numbers of function evaluations (NFE) on DB7. When the …
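For readers unfamiliar with what "matched NFE" means in Figure 7, the sketch below compares the three solvers on a toy scalar ODE under the same budget of velocity-field evaluations. Euler spends one evaluation per step, Heun two, and RK4 four, so a fixed budget buys fewer steps for the higher-order methods. The decay ODE, the budget of 12, and all names are assumptions chosen for illustration; the outcome on this toy problem says nothing about the paper's results on DB7.

```python
import math

def euler_step(f, t, x, h):
    return x + h * f(t, x)                               # 1 evaluation

def heun_step(f, t, x, h):
    k1 = f(t, x)
    k2 = f(t + h, x + h * k1)
    return x + 0.5 * h * (k1 + k2)                       # 2 evaluations

def rk4_step(f, t, x, h):
    k1 = f(t, x)
    k2 = f(t + 0.5 * h, x + 0.5 * h * k1)
    k3 = f(t + 0.5 * h, x + 0.5 * h * k2)
    k4 = f(t + h, x + h * k3)
    return x + (h / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)   # 4 evaluations

# Solver name -> (step function, evaluations per step)
SOLVERS = {"euler": (euler_step, 1), "heun": (heun_step, 2), "rk4": (rk4_step, 4)}

def integrate(name, f, x0, nfe):
    """Integrate dx/dt = f(t, x) from t=0 to t=1 using at most `nfe` evaluations."""
    step, cost = SOLVERS[name]
    n_steps = nfe // cost
    h = 1.0 / n_steps
    x = x0
    for i in range(n_steps):
        x = step(f, i * h, x, h)
    return x

# Toy "velocity field": exponential decay, with known solution exp(-1) at t=1.
decay = lambda t, x: -x
exact = math.exp(-1.0)
errors = {name: abs(integrate(name, decay, 1.0, nfe=12) - exact) for name in SOLVERS}
```

In a generative model the state `x` would be a whole multi-channel window and `f` the trained network, so each evaluation is a full forward pass; that is why NFE, rather than step count, is the natural cost axis.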

**Figure 8.** Effect of time sampling strategy on DB2. The five panels report FID, IS, CAS, augmentation accuracy, and TSTR accuracy, respectively. Compared with uniform sampling, logit-normal time sampling improves all reported metrics, with the largest gain appearing under the more stringent TSTR setting. This result suggests that emphasizing the intermediate portion of the flow trajectory provides a more useful train…
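The contrast between the two time-sampling strategies in Figure 8 can be illustrated as follows. A logit-normal sample is a Gaussian draw pushed through a sigmoid, which concentrates training times around the middle of the trajectory. The mean of 0 and standard deviation of 1 are assumed defaults for the sketch, not values taken from the paper, and the function names are hypothetical.

```python
import math
import random

def sample_t_uniform(rng):
    """Training time drawn uniformly from [0, 1)."""
    return rng.random()

def sample_t_logit_normal(rng, mean=0.0, std=1.0):
    """Training time t = sigmoid(z) with z ~ Normal(mean, std)."""
    z = rng.gauss(mean, std)
    return 1.0 / (1.0 + math.exp(-z))

def middle_fraction(samples, lo=0.25, hi=0.75):
    """Share of samples that fall in the intermediate part of the trajectory."""
    return sum(lo <= t <= hi for t in samples) / len(samples)

rng = random.Random(0)
uniform_ts = [sample_t_uniform(rng) for _ in range(20000)]
logit_ts = [sample_t_logit_normal(rng) for _ in range(20000)]

print("uniform      :", round(middle_fraction(uniform_ts), 3))
print("logit-normal :", round(middle_fraction(logit_ts), 3))
```

With these assumed parameters, roughly half of the uniform draws land in the middle band, while the logit-normal draws land there noticeably more often, which matches the caption's description of emphasizing intermediate times.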

Original abstract

Deep learning-based surface electromyography (sEMG) gesture recognition is frequently bottlenecked by data scarcity and limited subject diversity. While synthetic data generation via Generative Adversarial Networks (GANs) and diffusion models has emerged as a promising augmentation strategy, these approaches often face challenges regarding training stability or inference efficiency. To bridge this gap, we propose EMGFlow, a conditional sEMG generation framework. To the best of our knowledge, this is the first study to investigate the application of Flow Matching (FM) and continuous-time generative modeling in the sEMG domain. To validate EMGFlow across three benchmark sEMG datasets, we employ a unified evaluation protocol integrating feature-based fidelity, distributional geometry, and downstream utility. Extensive evaluations show that EMGFlow outperforms conventional augmentation and GAN baselines, and provides stronger standalone utility than the diffusion baselines considered here under the train-on-synthetic test-on-real (TSTR) protocol. Furthermore, by optimizing generation dynamics through advanced numerical solvers and targeted time sampling, EMGFlow achieves improved quality-efficiency trade-offs. Taken together, these results suggest that Flow Matching is a promising and efficient paradigm for addressing data bottlenecks in myoelectric control systems. Our code is available at: https://github.com/Open-EXG/EMGFlow.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

EMGFlow is the first flow-matching approach for sEMG synthesis and reports better efficiency plus TSTR utility than the diffusion baselines it compares against, but the abstract supplies no numbers and the generalization story stays thin.


The paper's main contribution is applying flow matching to conditional sEMG generation and claiming it beats both GANs and diffusion models on a unified protocol of feature fidelity, distributional metrics, and train-on-synthetic test-on-real downstream accuracy across three public datasets. They also highlight faster sampling via better ODE solvers and time-step choices, and they release code. That combination of a new generative technique for this signal type plus practical efficiency is the part worth noting for people who need more training data for myoelectric interfaces. The evaluation design itself is reasonable: mixing low-level signal checks with actual classifier utility under TSTR gives a clearer picture than pure generation metrics alone.

The soft spots sit where the abstract stops. No quantitative deltas, no error bars, and no architecture or training details appear, so the size of the claimed gains is impossible to judge from the summary. The stress-test concern about subject and hardware generalization also lands. sEMG changes sharply across users and sensors; if the TSTR splits stay intra-subject or within the same recording setup, the reported utility advantage may not survive real deployment where new anatomy and hardware appear. The paper would read stronger with an explicit leave-one-subject-out breakdown or at least a clear statement on how the splits were constructed.

This work is aimed at HCI and biosignal researchers who already work on gesture recognition or prosthetic control and need augmentation tools. A reader in that niche can extract the comparisons and the released code even if the headline numbers need verification. It is coherent on its own terms and shows honest engagement with the relevant baselines, so it clears the bar for serious refereeing despite the missing details.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces EMGFlow, a conditional generative model based on flow matching for synthesizing surface electromyography (sEMG) signals. It claims to be the first application of flow matching in the sEMG domain, demonstrates superior performance over GAN and diffusion baselines in feature fidelity, distributional metrics, and train-on-synthetic test-on-real (TSTR) downstream utility across three benchmark datasets, while achieving better efficiency through optimized solvers.

Significance. If the performance claims hold under rigorous cross-subject and cross-hardware evaluation, this work could provide an efficient alternative to diffusion models for data augmentation in myoelectric control, addressing data scarcity and subject variability issues. The open-sourcing of code is a positive aspect.

major comments (2)

[Evaluation Protocol] Evaluation Protocol section: The TSTR protocol is described at a high level without specifying whether splits are leave-one-subject-out, intra-subject, or include explicit hardware/sensor variations. This detail is load-bearing for the central claim of 'stronger standalone utility' and robustness, given that sEMG signals vary strongly across subjects and hardware.
[Abstract and §4] Abstract and §4 (Results): The abstract and high-level evaluation summary report outperformance without quantitative metrics, error bars, or architecture/training details in the provided overview; while tables likely exist in the full text, the lack of these in the summary presentation weakens immediate assessment of the magnitude of gains over diffusion baselines.
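The distinction the first major comment turns on can be made concrete with a small sketch of the two split styles. In a leave-one-subject-out split, the test subject never appears in training; in an intra-subject split, every subject appears on both sides. The subject IDs, the deterministic "every third window" rule, and the function names are assumptions for illustration and are not taken from the manuscript.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_idx, test_idx), one fold per subject."""
    for held_out in sorted(set(subject_ids)):
        train_idx = [i for i, s in enumerate(subject_ids) if s != held_out]
        test_idx = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train_idx, test_idx

def intra_subject_split(subject_ids, test_every=3):
    """Within each subject, send every `test_every`-th window to the test set."""
    seen, train_idx, test_idx = {}, [], []
    for i, s in enumerate(subject_ids):
        k = seen.get(s, 0)          # how many windows of this subject came before
        seen[s] = k + 1
        if k % test_every == test_every - 1:
            test_idx.append(i)
        else:
            train_idx.append(i)
    return train_idx, test_idx

# Hypothetical window-level subject labels: three subjects, three windows each.
subjects = ["s1", "s1", "s1", "s2", "s2", "s2", "s3", "s3", "s3"]
folds = list(leave_one_subject_out(subjects))
intra_train, intra_test = intra_subject_split(subjects)
```

A TSTR score computed on the first kind of split measures transfer to unseen anatomy, while the second only measures within-subject utility, which is why the report asks the authors to state which one was used.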

minor comments (2)

[Throughout] Ensure all acronyms (sEMG, FM, TSTR) are defined on first use and used consistently.
[Figures] Figure captions should explicitly describe what each panel shows (e.g., real vs. synthetic waveforms) for clarity in qualitative comparisons.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for their constructive feedback on our manuscript. We address each major comment below, providing clarifications and committing to revisions that strengthen the presentation of our evaluation protocol and results without altering the core claims.


Referee: [Evaluation Protocol] Evaluation Protocol section: The TSTR protocol is described at a high level without specifying whether splits are leave-one-subject-out, intra-subject, or include explicit hardware/sensor variations. This detail is load-bearing for the central claim of 'stronger standalone utility' and robustness, given that sEMG signals vary strongly across subjects and hardware.

Authors: We agree that explicit specification of the data splits is essential for assessing robustness and generalizability in sEMG synthesis, particularly due to inter-subject and hardware variability. The full manuscript (Section 3.3) describes a unified protocol across the three datasets, but we will revise the Evaluation Protocol section to explicitly detail: (i) leave-one-subject-out splits for cross-subject evaluation on all datasets, (ii) intra-subject random splits for within-subject utility where reported, and (iii) notes on hardware/sensor configurations (e.g., electrode placement and sampling rates) for each benchmark. This will directly support the TSTR utility claims and address the concern.

revision: yes

Referee: [Abstract and §4] Abstract and §4 (Results): The abstract and high-level evaluation summary report outperformance without quantitative metrics, error bars, or architecture/training details in the provided overview; while tables likely exist in the full text, the lack of these in the summary presentation weakens immediate assessment of the magnitude of gains over diffusion baselines.

Authors: We acknowledge that the abstract and the opening of §4 present high-level qualitative statements of outperformance. Quantitative results, including metrics with standard deviations (error bars), are reported in Tables 1–4 and Figures 2–5, with architecture and training details in §3.2 and the appendix. To improve accessibility, we will revise the abstract to include specific quantitative gains (e.g., relative improvements in FID scores and TSTR accuracy) and add a concise summary of key metrics with error bars at the start of §4. This partial update enhances the summary presentation while retaining the detailed tables.

revision: partial

Circularity Check

0 steps flagged

No circularity: empirical application of external flow matching framework


The paper presents EMGFlow as an application of the pre-existing flow matching generative modeling technique to the sEMG synthesis task. No derivation chain is claimed that reduces a result to its own inputs by construction; performance claims rest on standard empirical benchmarks (feature fidelity, distributional metrics, and TSTR utility) evaluated against external baselines on three public datasets. The unified protocol and numerical solver optimizations are implementation details, not self-referential definitions or fitted quantities renamed as predictions. Self-citations, if present, are not load-bearing for the core claims, which remain falsifiable against independent data splits and prior GAN/diffusion work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The approach rests on the standard flow matching generative framework and assumptions about sEMG signal properties being amenable to continuous normalizing flows; no new entities or ad-hoc parameters are introduced in the abstract.

axioms (1)

domain assumption: Flow matching yields stable training and efficient sampling for conditional generation of time-series signals.
Invoked when positioning FM as superior to GANs and diffusion for sEMG.


