
arxiv: 2606.10802 · v1 · pith:HTCLMUXWnew · submitted 2026-06-09 · 💻 cs.LG · cs.AI

Boosting ECG Classification Performance by Pre-training with Synthesized Data

Pith reviewed 2026-06-27 13:49 UTC · model grok-4.3

classification 💻 cs.LG cs.AI
keywords ECG classification · synthetic data · pre-training · deep neural networks · atrial fibrillation · Gaussian wave synthesis · data scarcity · medical signal processing

The pith

Pre-training on knowledge-driven synthetic ECGs improves real-data classification for three of four heart abnormalities, with gains largest when real datasets are small.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper tests whether ECG signals generated from a Gaussian wave model based on medical knowledge can pre-train neural networks before they see actual patient recordings. The goal is to ease the scarcity of labeled medical data caused by privacy rules and rare conditions. Experiments across ten architectures and four abnormality types show that this synthetic pre-training improves classification performance on real test sets for three of the four abnormalities. The largest architecture-averaged gain is 33.2 percent, for atrial flutter, and the advantage grows as the amount of real training data shrinks. A reader would care because the method offers a practical way to build better diagnostic tools without collecting huge private datasets.

Core claim

The paper introduces a knowledge-driven Gaussian-composition algorithm that builds single-lead II ECGs by summing Gaussian-shaped P, Q, R, S, and T components, and uses it to generate synthetic examples of atrial fibrillation, atrial flutter, premature ventricular complex, and Wolff-Parkinson-White syndrome. Pre-training on these examples and then fine-tuning on real recordings improves classification performance for three of the four classes relative to training from scratch on real data alone. The largest architecture-averaged gain is 33.2 percent, for atrial flutter, and relative gains are larger on smaller real-world datasets.
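The two training settings compared in the core claim ("Real" versus "Syn → Real") can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: a two-weight logistic model stands in for the paper's ten DNN architectures, `make_data` produces toy one-dimensional features rather than ECGs, and the learning rates and epoch counts are hypothetical. The sketch shows only the protocol (initialize fine-tuning from the pre-trained weights) and makes no claim about which setting scores better.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def log_loss(w, data):
    """Mean logistic loss of weights w on a list of (features, label) pairs."""
    eps = 1e-12
    total = 0.0
    for x, y in data:
        p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)))
        total -= y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
    return total / len(data)

def train(w, data, lr, epochs):
    """Full-batch gradient descent on the logistic loss, starting from weights w."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y in data:
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj / len(data)
        w = [wi - lr * gj for wi, gj in zip(w, grad)]
    return w

def make_data(n, shift, rng):
    """Toy data: a bias term plus one feature centred at +shift (label 1) or -shift (label 0)."""
    data = []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = rng.gauss(shift if y else -shift, 1.0)
        data.append(([1.0, x], y))
    return data

rng = random.Random(0)
synthetic = make_data(500, 1.5, rng)   # plentiful, idealised stand-in for synthesized ECGs
real_train = make_data(20, 1.0, rng)   # scarce stand-in for real recordings

w0 = [0.0, 0.0]
w_real = train(w0, real_train, lr=0.1, epochs=50)          # "Real": from scratch
w_pre = train(w0, synthetic, lr=0.1, epochs=200)           # pre-train on synthetic data
w_syn_real = train(w_pre, real_train, lr=0.05, epochs=50)  # "Syn -> Real": fine-tune
```

The only difference between the two settings is the starting point passed to `train`; a held-out real test set would then be scored with both `w_real` and `w_syn_real`.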

What carries the argument

The knowledge-driven Gaussian-composition synthesis algorithm that represents each heartbeat as a linear combination of five Gaussian-shaped wave components to produce realistic single-lead II ECG morphologies.
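Written out, the composition amounts to the following, where the per-component Gaussian $g_w$ and the symbols $a_w$, $\mu_w$, $\sigma_w$ follow the paper's appendix, and $x(t)$ is a name introduced here for the resulting heartbeat:

$$x(t) = \sum_{w \in \{P, Q, R, S, T\}} g_w(t), \qquad g_w(t) = a_w \exp\left(-\frac{1}{2}\left(\frac{t-\mu_w}{\sigma_w}\right)^2\right)$$

Each component contributes three parameters (signed amplitude, temporal shift, width), so one heartbeat is described by fifteen numbers under this model.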

If this is right

  • Synthetic pre-training becomes a practical route for improving ECG classifiers when real labeled examples are limited or expensive to obtain.
  • The performance lift is larger for atrial flutter than for the other three tested abnormalities.
  • The benefit scales inversely with real-data volume, so the method is most useful precisely when data scarcity is greatest.
  • Domain-knowledge synthesis can serve as a reusable pre-training resource across multiple downstream ECG tasks.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • If the Gaussian model captures the essential morphological features, the same synthesis approach could be adapted to generate pre-training data for other time-series biosignals such as EEG or arterial pressure waveforms.
  • The method implicitly assumes that the real test distribution remains stable; any systematic shift between the synthetic training distribution and future clinical recordings could reduce the observed gains.
  • Because gains are strongest on small real sets, the technique may allow smaller hospitals or research groups to reach competitive model performance without large private data collections.

Load-bearing premise

The generated synthetic ECGs must be close enough in statistical and morphological detail to real signals that pre-training on them transfers helpful features rather than harmful biases to the downstream real-data task.

What would settle it

An experiment in which the same ten architectures trained from scratch on the real datasets match or exceed the accuracy of the synthetic-pre-trained versions across all four abnormality classes and all dataset sizes would falsify the central claim.

Figures

Figures reproduced from arXiv: 2606.10802 by Jun Seita, Naoki Nonaka.

Figure 1: Examples of real-world and synthesized ECG of AF. For the synthesis of AF ECGs, we used the following rule. AF is characterized by the loss of regular atrial excitation, resulting in fine trembling of the atrial muscles. This condition manifests in the waveform as the absence of the P wave, oscillations in the baseline, and a reduction in the RR interval. Consequently, in the AF synthesis algorithm, the P …
Figure 2: Examples of real-world and synthesized ECG of AFLT. … generated. The synthesis of the QRS complex is conducted in the same manner as for a normal ECG. The resulting synthesized AFLT ECG sample is visualized in this figure.
Figure 3: Examples of real-world and synthesized ECG of PVC. For the synthesis of PVC ECGs, we used the following rule. PVC occurs when an ectopic excitation stimulates the heart ahead of the normal contraction. This condition is characterized by the absence of a preceding P wave and an increased width of the QRS complex compared to normal. To reflect these characteristics, the synthesis algorithm generates PVC heart…
Figure 4: Examples of real-world and synthesized ECG of WPW.
Figure 5: Average improvement of the “Syn → Real” approach over the models trained with the “Real” setting. For all four abnormal ECG classification tasks, the improvement rate increases as the amount of real-world data decreases. We did not plot a point if the relative improvement was negative (AF with all positive samples).
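The relative improvement plotted in Figure 5 can be sketched as follows. The exact formula is not given in the text available here, so this assumes the usual percent change of the "Syn → Real" score over the "Real" baseline, averaged over architectures. The function names and the toy scores are hypothetical and are not results from the paper.

```python
from statistics import mean

def relative_improvement(real_score, syn_real_score):
    """Percent change of the 'Syn -> Real' score relative to the 'Real' baseline."""
    if real_score <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (syn_real_score - real_score) / real_score

def architecture_averaged_gain(scores):
    """scores maps architecture name -> (real_score, syn_real_score); returns the mean percent gain."""
    return mean(relative_improvement(r, s) for r, s in scores.values())

# Made-up scores for illustration only.
toy_scores = {"arch_a": (0.50, 0.60), "arch_b": (0.40, 0.50), "arch_c": (0.80, 0.80)}
gain = architecture_averaged_gain(toy_scores)  # close to 15 for these toy numbers
```

Averaging per-architecture percentages is one choice among several. Computing the percent change of the architecture-averaged scores would generally give a different number, so the definition matters when comparing against the reported 33.2%.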
Original abstract

Deep Neural Networks (DNNs) typically require extensive datasets for effective training. In the medical domain, acquiring large-scale data is often challenging due to privacy concerns and the rarity of certain diseases. To address this data scarcity, we investigate the efficacy of training DNN models using synthetic data, generated based on domain-specific medical knowledge. Specifically, we develop a knowledge-driven Gaussian-composition synthesis algorithm for single-lead II ECGs, in which each heartbeat is represented by Gaussian-shaped P, Q, R, S, and T wave components. Using this simulator, we generate synthetic data for four abnormal electrocardiogram (ECG) classes: atrial fibrillation (AF), atrial flutter (AFLT), premature ventricular complex (PVC), and Wolff-Parkinson-White Syndrome (WPW). We evaluate the utility of this synthetic data by conducting abnormal ECG classification using ten different DNN architectures. Our results demonstrate that synthetic-to-real training improves classification performance for three of the four target abnormalities, with the largest architecture-averaged gain of $33.2\%$ observed for AFLT. Further analysis reveals that the performance enhancement from synthetic data is more pronounced with smaller real-world datasets. These findings suggest that domain-knowledge-based synthetic ECGs can serve as a useful pre-training resource, particularly in scenarios where real-world data are limited or difficult to obtain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces a knowledge-driven Gaussian-composition simulator to generate synthetic single-lead ECGs for four abnormality classes (AF, AFLT, PVC, WPW) and evaluates pre-training DNNs on this data before fine-tuning on real recordings. Across ten architectures the approach yields accuracy gains for three of the four classes, with the largest architecture-averaged improvement of 33.2% for AFLT; the benefit is reported to be larger when the real training set is small.

Significance. If the central claim holds after the requested controls, the work would supply a practical, domain-knowledge-based route to data augmentation for privacy-constrained or rare-disease ECG tasks. The multi-architecture evaluation and the explicit dataset-size interaction are concrete strengths that would be of interest to the medical-ML community.

major comments (2)
  1. [Abstract / Results] The claim that performance gains are attributable to morphological transfer from the Gaussian-synthesized signals (rather than simple sample-size increase or regularization) is load-bearing, yet no quantitative distributional comparison (Kolmogorov-Smirnov or Wasserstein distances on PR interval, QRS width, RR variability, or spectral features) between the synthetic corpus and the real training folds is provided.
  2. [Methods / Experimental setup] The paper does not report the synthetic-to-real mixing ratios, the precise fine-tuning protocol, exclusion criteria for the real data, or any statistical significance testing (p-values, confidence intervals, or multiple-comparison correction) for the reported percentage gains.
minor comments (1)
  1. [Methods] The ten DNN architectures should be listed with their exact layer counts or references so that the experiments are fully reproducible.
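The distributional comparison requested in the first major comment could look roughly like this. It is a standard-library sketch of the two-sample Kolmogorov-Smirnov statistic and the one-dimensional Wasserstein-1 distance, applied to a single scalar feature. The QRS-width values are invented placeholders, and extracting such features from real or synthetic ECGs is outside the sketch. In practice `scipy.stats.ks_2samp` and `scipy.stats.wasserstein_distance` cover the same ground and also return a p-value for the KS test.

```python
from bisect import bisect_right

def _ecdf(sorted_xs, x):
    """Empirical CDF of a sorted sample, evaluated at x."""
    return bisect_right(sorted_xs, x) / len(sorted_xs)

def ks_statistic(a, b):
    """Two-sample KS statistic: the largest gap between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(abs(_ecdf(a, x) - _ecdf(b, x)) for x in a + b)

def wasserstein_1d(a, b):
    """1-D Wasserstein-1 distance: area between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    grid = sorted(a + b)
    return sum(abs(_ecdf(a, lo) - _ecdf(b, lo)) * (hi - lo)
               for lo, hi in zip(grid, grid[1:]))

# Placeholder QRS widths in milliseconds (not measured data).
real_qrs = [92.0, 98.0, 101.0, 105.0, 110.0, 118.0]
synthetic_qrs = [95.0, 96.0, 100.0, 100.0, 104.0, 106.0]

ks = ks_statistic(real_qrs, synthetic_qrs)
w1 = wasserstein_1d(real_qrs, synthetic_qrs)
```

A small KS statistic and a Wasserstein distance that is small relative to the feature's spread would indicate that the synthetic feature distribution is close to the real one. The same two functions can be run per feature (PR interval, RR variability, and so on) and per fold.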

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. The two major comments highlight important gaps in evidence and reporting that we will address through additional analyses and clarifications in the revised manuscript.

  1. Referee: [Abstract / Results] The claim that performance gains are attributable to morphological transfer from the Gaussian-synthesized signals (rather than simple sample-size increase or regularization) is load-bearing, yet no quantitative distributional comparison (Kolmogorov-Smirnov or Wasserstein distances on PR interval, QRS width, RR variability, or spectral features) between the synthetic corpus and the real training folds is provided.

    Authors: We agree that a quantitative distributional comparison would provide stronger support for attributing gains to morphological transfer. In the revision we will compute and report Kolmogorov-Smirnov tests together with Wasserstein distances on PR interval, QRS width, RR variability, and spectral features between the synthetic corpus and each real training fold. These results will be added to the Results section and discussed in relation to the observed performance improvements.
    Revision: yes

  2. Referee: [Methods / Experimental setup] The paper does not report the synthetic-to-real mixing ratios, the precise fine-tuning protocol, exclusion criteria for the real data, or any statistical significance testing (p-values, confidence intervals, or multiple-comparison correction) for the reported percentage gains.

    Authors: We acknowledge that these experimental details are currently missing. The revised Methods section will explicitly state the synthetic-to-real mixing ratios employed, provide a complete description of the fine-tuning protocol (including optimizer, learning-rate schedule, epochs, and early-stopping criteria), list the exclusion criteria applied to the real recordings, and include statistical significance testing with p-values, 95% confidence intervals, and multiple-comparison correction for all reported accuracy gains.
    Revision: yes
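The statistical reporting promised in the second response could take many forms, and the rebuttal does not say which tests will be used. As one possible sketch under that caveat, the block below computes a percentile bootstrap confidence interval for the mean paired gain and applies the Holm step-down correction to a set of p-values. The paired differences and p-values are placeholders, not figures from the paper.

```python
import random

def bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of paired differences."""
    rng = random.Random(seed)
    n = len(diffs)
    means = sorted(sum(rng.choices(diffs, k=n)) / n for _ in range(n_boot))
    lo = means[int((alpha / 2) * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The k-th smallest p-value is scaled by (m - k + 1); adjusted values stay monotone.
        running_max = max(running_max, min(1.0, (m - rank) * p_values[i]))
        adjusted[i] = running_max
    return adjusted

# Placeholder per-architecture gains (Syn -> Real minus Real) for one abnormality class.
paired_gains = [0.02, 0.05, 0.01, 0.04, 0.03]
ci_low, ci_high = bootstrap_ci(paired_gains)

# Placeholder raw p-values, one per abnormality class.
adjusted = holm_adjust([0.01, 0.04, 0.03, 0.005])
```

With four abnormality classes tested side by side, an unadjusted 0.05 threshold would overstate significance, which is why the adjusted values are the ones to compare against the threshold.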

Circularity Check

0 steps flagged

No circularity; purely empirical evaluation with external benchmarks


The manuscript presents an empirical study: a Gaussian-composition simulator generates synthetic ECGs for four abnormality classes; ten DNN architectures are pre-trained on synthetic data then fine-tuned on real ECG datasets; classification accuracy is measured on held-out real test folds. No equations, derivations, fitted parameters, or uniqueness theorems appear. All performance claims (e.g., 33.2% AFLT gain) are direct measurements against external real-world data partitions. No self-citation chain or self-definitional reduction exists. This is the normal, non-circular outcome for a methods-and-results paper.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The approach depends on a domain assumption that ECG morphology can be captured by summed Gaussians and on the transferability of features learned from such synthetics; no new physical entities are postulated and free parameters appear limited to knowledge-derived wave timings and amplitudes.

free parameters (1)
  • Gaussian wave parameters (amplitude, location, width for P/Q/R/S/T)
    Chosen according to domain medical knowledge to generate realistic shapes; exact values or fitting procedure not detailed in abstract.
axioms (1)
  • Domain assumption: Single-lead II ECG heartbeats can be faithfully represented as linear combinations of five Gaussian functions corresponding to P, Q, R, S, and T waves
    This is the explicit basis of the knowledge-driven synthesis algorithm.

pith-pipeline@v0.9.1-grok · 5760 in / 1486 out tokens · 32135 ms · 2026-06-27T13:49:19.615832+00:00 · methodology



Appendix A: Details of ECG synthesis

  1. Initialize an empty ECG signal.
  2. Sample the initial P, Q, R, S, and T wave parameters from Gaussian distributions using the base values and inter-sample standard deviations in Table A.1.
  3. Generate one heartbeat by summing the Gaussian-shaped P, Q, R, S, and T wave components.
  4. Append the generated heartbeat to the ECG signal.
  5. Perturb the waveform parameters using the beat-level standard deviations in Table A.1.
  6. Repeat Steps 3–5 until the signal exceeds the target length.
  7. Trim the signal to 5,000 time steps.
  8. Add white noise and sinusoidal baseline fluctuation.
  9. Standardize the sample by subtracting its mean and dividing by its standard deviation.

Table caption (truncated in the source): … “Base” column denotes the mean value of each parameter. The “Inter-sample SD …

For a waveform component $w$, the Gaussian peak is computed as

$$g_w(t) = a_w \exp\left(-\frac{1}{2}\left(\frac{t-\mu_w}{\sigma_w}\right)^2\right),$$

where $a_w$ is the signed amplitude, $\mu_w$ is the temporal shift, and $\sigma_w$ is the width of the component. By summing the P, Q, R, S, and T components, we obtain one synthetic heartbeat....
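The synthesis steps listed above can be sketched in plain Python as follows. This is a rough illustration, not the authors' implementation. The `BASE` parameter values, the relative standard deviations, the 500 Hz sampling rate, the fixed 0.8 s beat length, and the noise and baseline amplitudes are all hypothetical stand-ins, because the contents of Table A.1 are not available here. Re-drawing beat-level parameters around the sample-level values (rather than letting them drift cumulatively) is also an assumption.

```python
import math
import random

# Hypothetical (amplitude, centre in s, width in s) per wave; NOT the paper's Table A.1 values.
BASE = {
    "P": (0.15, 0.20, 0.025),
    "Q": (-0.10, 0.35, 0.010),
    "R": (1.00, 0.38, 0.012),
    "S": (-0.20, 0.41, 0.010),
    "T": (0.30, 0.60, 0.040),
}

def gaussian_wave(t, a, mu, sigma):
    """g_w(t) = a_w * exp(-0.5 * ((t - mu_w) / sigma_w) ** 2)."""
    return a * math.exp(-0.5 * ((t - mu) / sigma) ** 2)

def jitter(params, rel_sd, rng):
    """Redraw every parameter from a Gaussian centred on its current value."""
    return {w: tuple(rng.gauss(v, abs(v) * rel_sd) for v in p) for w, p in params.items()}

def synth_beat(params, fs, beat_sec):
    """One heartbeat: the sum of the five Gaussian components on a time grid."""
    n = int(fs * beat_sec)
    return [sum(gaussian_wave(i / fs, *p) for p in params.values()) for i in range(n)]

def synth_ecg(n_steps=5000, fs=500, beat_sec=0.8, seed=0):
    rng = random.Random(seed)
    signal = []                                    # step 1: empty signal
    sample_params = jitter(BASE, 0.05, rng)        # step 2: inter-sample variation
    beat_params = sample_params
    while len(signal) <= n_steps:                  # step 6: until the target length is exceeded
        signal.extend(synth_beat(beat_params, fs, beat_sec))  # steps 3 and 4
        beat_params = jitter(sample_params, 0.01, rng)        # step 5: beat-level variation
    signal = signal[:n_steps]                      # step 7: trim to the target length
    # step 8: white noise plus a slow sinusoidal baseline (0.3 Hz is an arbitrary choice)
    signal = [x + rng.gauss(0.0, 0.01) + 0.05 * math.sin(2 * math.pi * 0.3 * i / fs)
              for i, x in enumerate(signal)]
    # step 9: standardize to zero mean and unit standard deviation
    mean = sum(signal) / n_steps
    sd = math.sqrt(sum((x - mean) ** 2 for x in signal) / n_steps)
    return [(x - mean) / sd for x in signal]

ecg = synth_ecg()
```

Class-specific rules, such as removing the P wave for AF, would be applied by editing the parameter dictionary before each beat is generated. They are left out here because the figure captions that describe them are truncated.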