pith. machine review for the scientific record.

arxiv: 2604.10439 · v2 · submitted 2026-04-12 · 💻 cs.CV

Recognition: unknown

Removing Motion Artifact in MRI by Using a Perceptual Loss Driven Deep Learning Framework

Boyang Pan, Chengwei Chen, Chenwei Shao, Danqun Zheng, Langdi Zhong, Nan-Jie Gong, Shuai Li, Xuezhou Li, Yun Bian, Ziheng Guo, Ziqin Yu

Authors on Pith: no claims yet

Pith reviewed 2026-05-10 15:46 UTC · model grok-4.3

classification 💻 cs.CV
keywords MRI motion artifact · deep learning · perceptual loss · image correction · medical imaging · U-Net · artifact removal · structural preservation

The pith

PERCEPT-Net uses a dedicated motion perceptual loss to remove MRI artifacts while preserving anatomical structures on clinical data.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper develops a deep learning framework called PERCEPT-Net to correct motion artifacts that blur MRI scans when patients move during imaging. Existing methods often fail on real clinical scans because they cannot reliably separate artifacts from actual body structures. PERCEPT-Net adds three parts to a residual U-Net: a multi-scale recovery module for global and fine details, dual attention to focus on relevant features, and a Motion Perceptual Loss trained on paired volumes that combine real and simulated motion. This loss teaches the network generalized artifact patterns so it can suppress them without distorting true anatomy. Tests on prospective clinical data show gains over prior methods in image quality metrics and in radiologist ratings of diagnostic usefulness.

Core claim

PERCEPT-Net is built on a residual U-Net backbone and adds a multi-scale recovery module, dual attention mechanisms, and a Motion Perceptual Loss. The loss is trained on a hybrid dataset of real and simulated paired MRI volumes to learn generalized representations of motion artifacts. On a prospective clinical test set the network suppresses artifacts while keeping anatomical fidelity, outperforming state-of-the-art methods with higher SSIM and PSNR values and with radiologists reporting greater diagnostic confidence in the corrected volumes. Ablation experiments identify the Motion Perceptual Loss as the main source of these gains.

What carries the argument

The Motion Perceptual Loss, an artifact-aware perceptual supervision term that learns generalized representations of MRI motion artifacts to suppress them while maintaining anatomical fidelity.
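The idea behind a perceptual loss can be shown in miniature. The sketch below, in plain NumPy, stands in a pair of fixed edge filters for the paper's learned, artifact-aware feature extractor (which the abstract does not specify); the structure is the point, not the filters: compare feature maps of prediction and target rather than raw pixels, so the penalty tracks structural discrepancy instead of per-pixel intensity.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Naive 2-D valid convolution (no padding), for illustration only."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# Stand-in "feature extractor": two fixed Sobel edge filters. In PERCEPT-Net
# the features come from a learned artifact-aware network; these are purely
# illustrative placeholders.
FILTERS = [
    np.array([[1, 0, -1], [2, 0, -2], [1, 0, -1]], dtype=float),  # vertical edges
    np.array([[1, 2, 1], [0, 0, 0], [-1, -2, -1]], dtype=float),  # horizontal edges
]

def perceptual_loss(pred, target):
    """Mean squared distance between feature maps, summed over filters."""
    return sum(np.mean((conv2d_valid(pred, f) - conv2d_valid(target, f)) ** 2)
               for f in FILTERS)
```

Because the stand-in filters are zero-sum, this toy loss is invariant to a global intensity offset, which already hints at why feature-space supervision behaves differently from pixel-wise MSE.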

If this is right

  • The corrected volumes exhibit higher structural consistency and tissue contrast as shown by improved SSIM and PSNR scores.
  • Radiologists assign higher diagnostic confidence scores to the motion-corrected images compared with uncorrected or prior-method outputs.
  • Ablation studies attribute most of the performance gain to the Motion Perceptual Loss rather than the other network components.
  • Hybrid training on real and simulated pairs enables better generalization to new clinical scans than training on simulated data alone.
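The two image-quality metrics these bullets lean on are standard and easy to state. Below is a minimal NumPy sketch: PSNR defined from MSE, and a simplified single-window SSIM (the usual metric averages SSIM over local windows; this global form is for illustration only, not the paper's evaluation code).

```python
import numpy as np

def psnr(x, y, data_range=1.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    mse = np.mean((x - y) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(data_range ** 2 / mse)

def ssim_global(x, y, data_range=1.0):
    """Single-window SSIM over the whole image. The full metric averages
    local windows; this global form keeps the example short."""
    c1, c2 = (0.01 * data_range) ** 2, (0.03 * data_range) ** 2
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return (((2 * mx * my + c1) * (2 * cov + c2))
            / ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2)))
```

Both metrics compare a corrected volume against a motion-free reference, which is why the paper's paired real-and-simulated training data matters: without a reference, neither number is defined.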

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

  • If the perceptual loss captures motion patterns reliably, the same supervision idea could be adapted to correct other MRI artifacts such as those from field inhomogeneity.
  • Wider adoption might allow shorter scan times in pediatric or restless patients by tolerating some motion without repeat acquisitions.
  • The framework's emphasis on perceptual rather than pixel-wise losses points to possible use in other medical image restoration tasks where structure preservation matters.
  • Testing across multiple scanner vendors and field strengths would reveal whether the learned artifact representations remain effective outside the original training distribution.

Load-bearing premise

The Motion Perceptual Loss trained on a hybrid real-and-simulated dataset learns representations that generalize to unseen clinical MRI volumes without distorting true anatomical structures.

What would settle it

Run the trained model on a large prospective clinical dataset where experts have annotated both artifact regions and the underlying true anatomy, then check whether the corrected outputs match the true anatomy more closely than the original scans without introducing new distortions or removals.

read the original abstract

Purpose: Deep learning-based MRI artifact correction methods often demonstrate poor generalization to clinical data. This limitation largely stems from the inability of deep learning models in reliably distinguishing motion artifacts from true anatomical structures, due to insufficient awareness of artifact characteristics. To address this challenge, we proposed PERCEPT-Net, a deep learning framework that enhances structure preserving and suppresses artifact through dedicated perceptual supervision.

Method: PERCEPT-Net is built on a residual U-Net backbone and incorporates three auxiliary components. The first multi-scale recovery module is designed to preserve both global anatomical context and fine structural details, while the second dual attention mechanisms further improve performance by prioritizing clinically relevant features. At the core of the framework is the third Motion Perceptual Loss (MPL), an artifact-aware perceptual supervision strategy that learns generalized representations of MRI motion artifacts, enabling the model to effectively suppress them while maintaining anatomical fidelity. The model is trained on a hybrid dataset comprising both real and simulated paired volumes, and its performance is validated on a prospective test set using a combination of quantitative metrics and qualitative assessments by experienced radiologists.

Result: PERCEPT-Net outperformed state-of-the-art methods on clinical data. Ablation studies identified the Motion Perceptual Loss as the primary contributor to this performance, yielding significant improvements in structural consistency and tissue contrast, as reflected by higher SSIM and PSNR values. These findings were further corroborated by radiologist evaluations, which demonstrated significantly higher diagnostic confidence in the corrected volumes.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

3 major / 2 minor

Summary. The manuscript proposes PERCEPT-Net, a residual U-Net based deep learning model for MRI motion artifact removal. The framework includes a multi-scale recovery module, dual attention mechanisms, and a Motion Perceptual Loss (MPL) designed to learn artifact representations from a hybrid real and simulated training dataset. Performance is evaluated on a prospective clinical test set using quantitative metrics (SSIM, PSNR) and qualitative radiologist assessments, with claims of outperformance over state-of-the-art methods and ablation studies highlighting the MPL as the key component for improved structural consistency and tissue contrast.

Significance. If the central claims hold with proper substantiation, this would be a useful contribution to MRI artifact correction by showing that a tailored perceptual loss can improve generalization from hybrid training data to real prospective clinical volumes. The hybrid training approach, prospective validation, and radiologist involvement are positive elements that address common weaknesses in DL-based medical image restoration. The ablation isolating MPL as the dominant factor is a strength if the supporting numbers and controls are provided.

major comments (3)
  1. [Method] Method section: The hybrid training dataset is described only qualitatively as 'comprising both real and simulated paired volumes.' No counts of real versus simulated volumes, motion simulation parameters (e.g., rotation/translation ranges or artifact severity distribution), or confirmation of no patient overlap with the prospective test set are given. These omissions directly affect the load-bearing claim that the MPL learns generalized artifact representations transferable to unseen clinical data without anatomical distortion.
  2. [Results] Results section: The abstract and results state that PERCEPT-Net achieves higher SSIM and PSNR values than SOTA methods and that ablation studies identify MPL as the primary contributor, yet no numerical values, baseline comparisons, standard deviations, error bars, or statistical significance tests are reported. Without these, the quantitative support for outperformance and the ablation conclusion cannot be evaluated.
  3. [Method] Method (Motion Perceptual Loss subsection): Details on the perceptual feature extractor (architecture, pre-training dataset, or artifact-specific fine-tuning) and any explicit controls for structure preservation (e.g., quantitative metrics on motion-free regions or against motion-free ground truth) are absent. This information is required to assess whether the reported ablation gains reflect genuine artifact suppression or potential overfitting to simulated patterns.

minor comments (2)
  1. [Abstract] Abstract: The statement that radiologist evaluations showed 'significantly higher diagnostic confidence' lacks any description of the evaluation protocol, number of readers, scoring scale, or statistical test used.
  2. [Figures/Tables] Figure captions and tables: Ensure all quantitative comparisons include error bars or confidence intervals and that ablation tables explicitly list the metric values for each configuration rather than only qualitative descriptions.
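Major comment 1 asks for the motion simulation parameters. For context, one common way to simulate rigid translational motion (one of several plausible models; the paper's actual simulator is unspecified) is to apply random phase ramps to a subset of phase-encode lines in k-space, via the Fourier shift theorem. All parameter names and ranges below are illustrative, not the paper's.

```python
import numpy as np

def simulate_translation_artifact(img, max_shift_px=3.0, corrupt_frac=0.3, seed=0):
    """Corrupt a fraction of phase-encode lines with random translational
    phase ramps, mimicking inter-shot patient motion. Illustrative only."""
    rng = np.random.default_rng(seed)
    h, w = img.shape
    k = np.fft.fftshift(np.fft.fft2(img))
    ky = np.fft.fftshift(np.fft.fftfreq(h))  # cycles/pixel along phase-encode axis
    lines = rng.choice(h, size=int(corrupt_frac * h), replace=False)
    for row in lines:
        dy = rng.uniform(-max_shift_px, max_shift_px)
        # Shift theorem: a translation by dy multiplies k-space by a phase ramp.
        k[row, :] *= np.exp(-2j * np.pi * ky[row] * dy)
    return np.abs(np.fft.ifft2(np.fft.ifftshift(k)))
```

Reporting exactly these knobs, the shift range, the fraction of corrupted lines, and the motion model (translation only vs. rotation), is what the referee's first major comment requests.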

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for the constructive and detailed comments, which help improve the clarity and rigor of our manuscript on PERCEPT-Net. We address each major comment below and will revise the manuscript accordingly to provide the requested details and quantitative support.

read point-by-point responses
  1. Referee: [Method] Method section: The hybrid training dataset is described only qualitatively as 'comprising both real and simulated paired volumes.' No counts of real versus simulated volumes, motion simulation parameters (e.g., rotation/translation ranges or artifact severity distribution), or confirmation of no patient overlap with the prospective test set are given. These omissions directly affect the load-bearing claim that the MPL learns generalized artifact representations transferable to unseen clinical data without anatomical distortion.

    Authors: We agree these specifics are necessary for reproducibility and to substantiate generalization. In the revised manuscript, we will report the exact counts of real paired volumes (acquired from clinical scans) versus simulated volumes, detail the motion simulation parameters including rotation and translation ranges along with artifact severity distribution, and confirm no patient overlap between the hybrid training set and the prospective test set. These additions will directly support the claim regarding the MPL's transferable artifact representations. revision: yes

  2. Referee: [Results] Results section: The abstract and results state that PERCEPT-Net achieves higher SSIM and PSNR values than SOTA methods and that ablation studies identify MPL as the primary contributor, yet no numerical values, baseline comparisons, standard deviations, error bars, or statistical significance tests are reported. Without these, the quantitative support for outperformance and the ablation conclusion cannot be evaluated.

    Authors: We acknowledge the absence of specific numbers limits evaluation. The revised results section and abstract will include tables reporting mean SSIM and PSNR values for PERCEPT-Net versus all SOTA baselines, with standard deviations, error bars, and statistical significance (e.g., p-values from paired t-tests or Wilcoxon tests). Ablation results will similarly present full quantitative metrics to demonstrate the MPL's primary contribution. revision: yes

  3. Referee: [Method] Method (Motion Perceptual Loss subsection): Details on the perceptual feature extractor (architecture, pre-training dataset, or artifact-specific fine-tuning) and any explicit controls for structure preservation (e.g., quantitative metrics on motion-free regions or against motion-free ground truth) are absent. This information is required to assess whether the reported ablation gains reflect genuine artifact suppression or potential overfitting to simulated patterns.

    Authors: We will expand the Motion Perceptual Loss subsection with the requested details: the feature extractor architecture (a pre-trained VGG network on ImageNet, specifying layers), pre-training dataset, and any MRI-specific fine-tuning. We will also add explicit controls, including quantitative SSIM/PSNR metrics on motion-free regions compared to ground truth, to confirm structure preservation and address potential overfitting concerns. revision: yes
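On response 2, the promised paired tests can also be run without distributional assumptions. A sign-flip permutation test on per-case metric differences is sketched below; `scipy.stats.wilcoxon` would be the conventional choice, and everything here is illustrative rather than the authors' protocol.

```python
import numpy as np

def paired_permutation_test(scores_a, scores_b, n_perm=5000, seed=0):
    """Two-sided paired permutation test: randomly flip the sign of each
    per-case difference and compare the permuted means to the observed one."""
    rng = np.random.default_rng(seed)
    d = np.asarray(scores_a, dtype=float) - np.asarray(scores_b, dtype=float)
    observed = abs(d.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_perm, d.size))
    perm_means = np.abs((signs * d).mean(axis=1))
    # p-value: fraction of sign-flipped means at least as extreme as observed
    return float((perm_means >= observed).mean())
```

Applied to per-patient SSIM (or radiologist scores) for PERCEPT-Net versus each baseline, this yields exactly the kind of significance statement the referee's second major comment asks for.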

Circularity Check

0 steps flagged

No circularity in empirical DL framework for MRI motion correction

full rationale

The paper describes an empirical deep learning method (residual U-Net with multi-scale recovery, dual attention, and Motion Perceptual Loss) trained on a hybrid real+simulated dataset and evaluated on a prospective clinical test set. Claims of outperformance and ablation results identifying MPL as the main contributor are standard supervised learning outcomes with external validation metrics (SSIM, PSNR, radiologist scores). No derivation chain, first-principles result, fitted parameter renamed as prediction, or self-citation load-bearing step is present; the architecture and loss are defined independently of the reported performance numbers.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is an applied empirical deep-learning study; no mathematical axioms, free parameters beyond standard network weights, or invented physical entities are introduced or required by the central claim.

pith-pipeline@v0.9.0 · 5592 in / 1185 out tokens · 37390 ms · 2026-05-10T15:46:41.473419+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

41 extracted references · 14 canonical work pages · 2 internal anchors

  1. [1]

    Introduction Magnetic resonance imaging (MRI) is an indispensable diagnostic modality in neuroradiology. Owing to its exceptional soft-tissue contrast and high spatial resolution, MRI remains the gold standard for evaluating a wide range of neurological disorders, including brain tumors, Alzheimer’s disease, and cerebrovascular pathologies [1–3]. However,...

  2. [2]

    1.5T and 3T MRI scanners with standardized T1-weighted and T2-weighted protocols were used

    Method 2.1 Data Acquisition and Preparation 2.1.1 Multi-Center Data Collection A prospective cohort of 664 patients was recruited from three medical centers: Changhai Hospital (Shanghai, China), 411 Hospital (Shanghai, China), and Putian Hospital (Fujian, China). 1.5T and 3T MRI scanners with standardized T1-weighted and T2-weighted protocols were used. T...

  3. [3]

    Results 3.1 Ablation study Three complementary ablation studies were conducted to systematically validate the efficacy of the proposed framework: first, to assess the impact of training data composition by isolating simulated versus real clinical data sources, thereby elucidating the value of authentic artifact representations; second, to verify the speci...

  4. [4]

artifact-tissue confusion

Discussion The generalization gap between synthetic training environments and clinical reality remains a primary hurdle for deep learning-based MRI restoration. This gap often manifests as "artifact-tissue confusion", where models fail to distinguish non-linear motion patterns from authentic anatomical structures. While general-purpose image restoration ...

  5. [5]

    Conclusion In this study, we proposed the PERCEPT-Net, a deep learning framework for MRI motion artifact removal. By introducing the Motion Perceptual Loss (MPL), the model learns generalizable representations of real motion artifacts, enabling accurate artifact suppression while preserving anatomical integrity. Validated on multi-center clinical data, PE...

  6. [6]

    Abdalrahman Ahmed Yassen Mahmoud, Marwan Khaled Ibrahim Ahmed, Teba Haitham Jameel Mohammed, Athraa Mahmoud Mohamed Hani, & Halah madhor Mahmoud. (2024). DEVELOPMENT OF MAGNETIC RESONANCE IMAGING (MRI) TECHNIQUES FOR STUDYING NEUROLOGICAL CHANGES ASSOCIATED WITH BRAIN DISEASES. European Journal of Medical Genetics and Clinical Biology, 1(10), 29 – 39. htt...

  7. [7]

    Zhang, J., Yu, L., Yu, M., Yu, D., Chen, Y., & Zhang, J. (2024). Engineering nanoprobes for magnetic resonance imaging of brain diseases. Chemical Engineering Journal, 481, 148472. https://doi.org/10.1016/j.cej.2023.148472

  8. [8]

    Taha, D. M., Abdulqader, A. T., Al-Khawaja, A. M. H., & Mousa, H. A. (2024). Review article about Magnetic Resonance Imaging (MRI). European Journal of Theoretical and Applied Sciences, 2(5), 530–535. https://doi.org/10.59324/ejtas.2024.2(5).51

  9. [9]

    Deep-learning methods for parallel magnetic resonance imaging reconstruction: A review of the current state-of-the-art

    Knoll, F., et al. (2020). "Deep-learning methods for parallel magnetic resonance imaging reconstruction: A review of the current state-of-the-art." IEEE Signal Processing Magazine

  10. [10]

    Gupta, S., & Vig, R. (2019). Detection and Correction of Head Motion and Physiological Artifacts in BOLD fMRI: A Study. 2019 9th International Conference on Cloud Computing, Data Science & Engineering (Confluence), 526 – 531. https://doi.org/10.1109/confluence.2019.8776963

  11. [11]

    Friston, K. J., Williams, S., Howard, R., Frackowiak, R. S., & Turner, R. (1996). Movement‐ related effects in fMRI time‐ series. Magnetic resonance in medicine, 35(3), 346- 355

  12. [12]

    Pardoe, H. R., Hiess, R. K., & Kuzniecky, R. (2016). Motion and morphometry in clinical and nonclinical populations. Neuroimage, 135, 177-185

  13. [13]

    Havsteen, I., Ohlhues, A., Madsen, K. H., Nybing, J. D., Christensen, H., & Christensen, A. (2017). Are movement artifacts in magnetic resonance imaging a real problem?— a narrative review. Frontiers in neurology, 8, 232

  14. [14]

    Wu, K., Xia, Y., Ravikumar, N., & Frangi, A. F. (2024). Compressed sensing using a deep adaptive perceptual generative adversarial network for MRI reconstruction from undersampled K-space data. Biomedical Signal Processing and Control, 96, 106560. https://doi.org/10.1016/j.bspc.2024.106560

  15. [15]

    Motion artifacts in MRI: A complex problem with many partial solutions

    Zaitsev, M., et al. "Motion artifacts in MRI: A complex problem with many partial solutions." Journal of Magnetic Resonance Imaging (2015)

  16. [16]

    Motion correction in MRI of the brain

    Godenschweger, F., et al. "Motion correction in MRI of the brain." Physics in Medicine & Biology (2016)

  17. [17]

    Oh, G., Jung, S., Lee, J. E., & Ye, J. C. (2023). Annealed score-based diffusion model for mr motion artifact reduction. IEEE Transactions on Computational Imaging, 10, 43-53

  18. [18]

    Safari, M., Yang, X., Fatemi, A., & Archambault, L. (2024). MRI motion artifact reduction using a conditional diffusion probabilistic model (MAR ‐ CDPM). Medical physics, 51(4), 2598-2610

  19. [19]

    Liu, J., Kocak, M., Supanich, M., & Deng, J. (2020). Motion artifacts reduction in brain MRI by means of a deep residual network with densely connected multi-resolution blocks (DRN- DCMB). Magnetic resonance imaging, 71, 69-79

  20. [20]

    Kang, S. H., & Lee, Y. (2024). Motion artifact reduction using U-net model with three- dimensional simulation-based datasets for brain magnetic resonance volumes. Bioengineering, 11(3), 227

  21. [21]

    Lyu, Q., Shan, H., Xie, Y., Kwan, A. C., Otaki, Y., Kuronuma, K., ... & Wang, G. (2021). Cine cardiac MRI motion artifact reduction using a recurrent neural network. IEEE transactions on medical imaging, 40(8), 2170-2181

  22. [22]

    Usui, K., Muro, I., Shibukawa, S., Goto, M., Ogawa, K., Sakano, Y., ... & Daida, H. (2023). Evaluation of motion artefact reduction depending on the artefacts’ directions in head MRI using conditional generative adversarial networks. Scientific Reports, 13(1), 8526

  23. [23]

    Wu, Y., Liu, J., White, G. M., & Deng, J. (2023). Image-based motion artifact reduction on liver dynamic contrast enhanced MRI. Physica Medica, 105, 102509

  24. [24]

    Jiang, W., Liu, Z., Lee, K. H., Chen, S., Ng, Y. L., Dou, Q., ... & Kwok, K. W. (2019). Respiratory motion correction in abdominal MRI using a densely connected U-Net with GAN- guided training. arXiv preprint arXiv:1906.09745

  25. [25]

    Cui, L., Song, Y., Wang, Y., Wang, R., Wu, D., Xie, H., ... & Yang, G. (2023). Motion artifact reduction for magnetic resonance imaging with deep learning and k-space analysis. PloS one, 18(1), e0278668

  26. [26]

    Oh, G., Lee, J. E., & Ye, J. C. (2021). Unpaired MR motion artifact deep learning using outlier-rejecting bootstrap aggregation. IEEE Transactions on Medical Imaging, 40(11), 3125- 3139

  27. [27]

    Ding, P. L. K., Li, Z., Zhou, Y., & Li, B. (2020). Deep Residual Dense U-Net for Resolution Enhancement in Accelerated MRI Acquisition. arXiv preprint arXiv:2001.04488

  28. [28]

    MRI Motion Correction Through Disentangled CycleGAN Based on Multi-Mask K-Space Subsampling[J]

    Chen G, et al. MRI Motion Correction Through Disentangled CycleGAN Based on Multi-Mask K-Space Subsampling[J]. IEEE Transactions on Medical Imaging, 2024

  29. [29]

    Yue, Z., Wang, J., & Loy, C. C. (2023). ResShift: Efficient diffusion model for image super-resolution by residual shifting. arXiv preprint arXiv:2307.12348v3 [cs.CV]. https://github.com/zsyOAOA/ResShift

  30. [30]

    Diffusion Posterior Sampling for General Noisy Inverse Problems

    Chung H, Kim J, Mccann MT, et al. Diffusion posterior sampling for general noisy inverse problems. arXiv preprint arXiv:2209.14687. 2022

  31. [31]

    Motion artifact reduction for magnetic resonance imaging with deep learning and k-space analysis

    Cui L, Song Y, Wang Y, Wang R, Wu D, Xie H, et al. Motion artifact reduction for magnetic resonance imaging with deep learning and k-space analysis. PLoS ONE. 2023 Jan 5;18(1):e0278668. doi:10.1371/journal.pone.0278668

  32. [32]

    Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network,

    C. Ledig et al., "Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 4681-4690

  33. [33]

    Perceptual losses for real-time style transfer and super-resolution

    Johnson, J., et al. "Perceptual losses for real-time style transfer and super-resolution." ECCV (2016)

  34. [34]

    Confidence Calibration and Predictive Uncertainty Estimation for Deep Medical Image Segmentation

    Mehrtash, A., et al. "Confidence Calibration and Predictive Uncertainty Estimation for Deep Medical Image Segmentation." IEEE Transactions on Medical Imaging, 2019

  35. [35]

    DIMA: DIffusing Motion Artifacts for Unsupervised Correction in Brain MRI Images,

    P. Angella, L. Balbi, F. Ferrando, P. Traverso, R. Varriale, and V. P. Pastore, "DIMA: DIffusing Motion Artifacts for Unsupervised Correction in Brain MRI Images," IEEE Access, vol. 13, pp. 1-12, Nov. 2025, doi: 10.1109/ACCESS.2025.3634749

  36. [36]

    Jiang, L., Dai, B., Wu, W., & Loy, C. C. (2021). Focal Frequency Loss for Image Reconstruction and Synthesis. arXiv:2012.12821v3 [cs.CV]. https://doi.org/10.48550/arXiv.2012.12821

  37. [37]

    Generative Adversarial Networks

    Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., & Bengio, Y. (2014). Generative Adversarial Networks. arXiv:1406.2661v1 [stat.ML]. https://doi.org/10.48550/arXiv.1406.2661

  38. [38]

    Image-to-image translation with conditional adversarial networks,

    P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros, "Image-to-image translation with conditional adversarial networks," in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017, pp. 1125–1134

  39. [39]

    (LPIPS): Zhang, R., Isola, P., Efros, A. A., Shechtman, E., & Wang, O. (2018). The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

  40. [40]

    (FID): Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., & Hochreiter, S. (2017). GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. In Advances in Neural Information Processing Systems (NeurIPS)

  41. [41]

    Hayes, T. R., & Henderson, J. M. (2021). Deep saliency models learn low-, mid-, and high-level features to predict scene attention. Scientific Reports, 11(1), Article 18434. https://doi.org/10.1038/s41598-021-97879-z