Multi-Contrast MRI Motion Correction via Parameter-Informed Disentanglement and Adaptive Experts
Pith reviewed 2026-06-28 20:32 UTC · model grok-4.3
The pith
A unified framework corrects motion artifacts across MRI contrasts by disentangling acquisition parameters from anatomy.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that parameter-informed contrast disentanglement combined with severity-aware adaptive correction produces a single model that removes motion artifacts more effectively than prior methods. On IXI and HCP benchmarks the framework raises PSNR by 0.75 dB and SSIM by up to 0.0279, with the largest improvements at high artifact levels. It further shows zero-shot generalization to real clinical data acquired with scanning parameters absent from training, where previous approaches either leave artifacts or add new distortions.
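To put the 0.75 dB figure in context: PSNR is a logarithmic function of mean squared error, so a fixed dB gain corresponds to a fixed multiplicative reduction in MSE. The sketch below uses the standard definition PSNR = 10·log10(MAX² / MSE). The helper names and the toy pixel values are illustrative assumptions and are not taken from the paper.

```python
import math

def mse(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def psnr(reference, estimate, max_value=1.0):
    """Standard PSNR in dB; max_value=1.0 assumes intensities normalized to [0, 1]."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)

def mse_ratio_for_gain(delta_db):
    """Factor by which MSE shrinks when PSNR rises by delta_db."""
    return 10.0 ** (-delta_db / 10.0)

# Toy values for illustration only (not data from the paper).
reference = [0.0, 0.5, 1.0, 0.5]
estimate = [0.1, 0.5, 0.9, 0.5]
print(round(psnr(reference, estimate), 2))
print(round(mse_ratio_for_gain(0.75), 3))
```

Under this definition, a 0.75 dB gain corresponds to roughly a 16% reduction in mean squared error, independent of the starting PSNR.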
What carries the argument
ScanCLIP embeddings derived from acquisition parameters, which isolate contrast-free anatomical features for downstream severity estimation and expert routing.
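The material summarized here does not specify how the contrast embedding is removed from the image features. One simple way to picture "contrast-free" features, offered purely as an assumption, is to subtract from each feature vector its component along a contrast direction. The function name `remove_contrast_direction` and the toy vectors are hypothetical and do not describe the paper's actual mechanism.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def remove_contrast_direction(feature, contrast_embedding):
    """Subtract the component of `feature` that lies along `contrast_embedding`.

    Hypothetical stand-in for disentanglement: the result is orthogonal
    to the contrast direction, so it carries no signal along it.
    """
    norm_sq = dot(contrast_embedding, contrast_embedding)
    if norm_sq == 0:
        return list(feature)
    scale = dot(feature, contrast_embedding) / norm_sq
    return [f - scale * c for f, c in zip(feature, contrast_embedding)]

# Toy vectors for illustration only.
feature = [2.0, 1.0, 0.5]
contrast = [1.0, 0.0, 0.0]
anatomy = remove_contrast_direction(feature, contrast)
print(anatomy)  # [0.0, 1.0, 0.5]
```

A linear projection like this removes only information that is linearly aligned with the embedding. Whether the real model removes contrast information more completely is exactly what the load-bearing premise leaves open.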
If this is right
- The method achieves higher PSNR and SSIM than state-of-the-art approaches on IXI and HCP benchmarks.
- Performance gains increase with higher motion artifact severities.
- The same model generalizes in zero-shot fashion to real clinical scans with unseen acquisition parameters.
- Existing methods either leave residual artifacts or introduce new distortions on those unseen-parameter scans.
Where Pith is reading between the lines
- Hospitals could maintain fewer specialized models if one network handles multiple MRI contrasts and protocols.
- The parameter-embedding approach might apply to other medical imaging tasks where acquisition settings vary widely.
- Pretraining on large paired text-image datasets could supply useful priors for other medical image restoration problems.
- Integration with scanner software could allow real-time severity estimation to guide acquisition adjustments.
Load-bearing premise
ScanCLIP embeddings derived from acquisition parameters successfully produce contrast-free anatomical features that a vision transformer and mixture-of-experts can use for accurate severity estimation and targeted correction without introducing new distortions.
What would settle it
Direct visual and quantitative comparison of output images on real clinical MRI volumes acquired with scanning parameters never seen during training, checking whether artifacts are removed without added distortions.
Original abstract
Motion artifacts in magnetic resonance imaging (MRI) degrade diagnostic reliability. Existing deep learning methods are typically contrast-specific and fail to generalize across diverse modalities and artifact severities. We propose a unified framework combining parameter-informed contrast disentanglement with severity-aware adaptive correction. ScanCLIP, pretrained on over 30,000 MRI text-image pairs, derives contrast embeddings from acquisition parameters to disentangle contrast style from anatomical content, yielding contrast-free features. A Vision Transformer then estimates motion severity and routes features through a Mixture-of-Experts network, enabling targeted artifact correction. A dual-pathway decoder reconstructs both the clean image and residual artifact map, enforcing image-space consistency. On IXI and HCP benchmarks, our method improves PSNR by 0.75 dB and SSIM by up to 0.0279 over state-of-the-art approaches, with larger gains at higher artifact severities. It further demonstrates robust zero-shot generalization on real-world clinical data acquired with unseen scanning parameters, where existing methods either fail to remove artifacts or introduce additional distortions.
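The abstract describes severity-aware routing only at a high level. The sketch below shows one common way such routing can work: a scalar severity estimate is turned into soft gate weights, and the expert outputs are blended by those weights. The Gaussian-style gate, the names `gate_weights` and `expert_centers`, the temperature, and the toy experts are all assumptions for illustration, not the paper's design.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gate_weights(severity, expert_centers, temperature=0.1):
    """Soft gate: experts whose center is near the estimated severity get more weight."""
    logits = [-((severity - c) ** 2) / temperature for c in expert_centers]
    return softmax(logits)

def route(features, severity, experts, expert_centers):
    """Blend the expert outputs using the severity-dependent gate weights."""
    weights = gate_weights(severity, expert_centers)
    outputs = [expert(features) for expert in experts]
    return [
        sum(w * out[i] for w, out in zip(weights, outputs))
        for i in range(len(features))
    ]

# Toy experts: each scales the features by a different amount (hypothetical).
experts = [
    lambda x: [0.9 * v for v in x],  # mild-artifact expert
    lambda x: [0.5 * v for v in x],  # moderate-artifact expert
    lambda x: [0.1 * v for v in x],  # severe-artifact expert
]
expert_centers = [0.0, 0.5, 1.0]

weights = gate_weights(0.9, expert_centers)
corrected = route([1.0, 2.0], 0.9, experts, expert_centers)
```

With a soft gate, a small error in the severity estimate shifts the blend gradually rather than switching experts abruptly, which is one way to limit error propagation from the estimator.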
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a unified framework for multi-contrast MRI motion correction. It employs ScanCLIP, pretrained on over 30,000 MRI text-image pairs, to derive contrast embeddings from acquisition parameters, disentangling contrast style from anatomical content. A Vision Transformer estimates motion severity and routes features through a Mixture-of-Experts network for targeted correction. A dual-pathway decoder reconstructs the clean image and residual artifact map. Evaluations on the IXI and HCP benchmarks show a PSNR improvement of 0.75 dB and SSIM gains of up to 0.0279 over state-of-the-art (SOTA) methods, with larger gains at higher artifact severities, and zero-shot generalization to real clinical data with unseen parameters.
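The dual-pathway decoder is said to enforce image-space consistency between its two outputs. A minimal sketch of what such a constraint could look like, assuming an additive model in which the corrupted input equals the clean image plus the artifact map, follows. The additive form, the L1 penalty, and the name `consistency_loss` are assumptions; the source does not state the actual loss.

```python
def consistency_loss(corrupted, clean_pred, artifact_pred):
    """Mean absolute gap between the input and the sum of the two decoder outputs.

    Assumes an additive artifact model: corrupted ~= clean + artifact.
    """
    if not (len(corrupted) == len(clean_pred) == len(artifact_pred)):
        raise ValueError("all inputs must have the same length")
    gaps = [abs(c - (x + a)) for c, x, a in zip(corrupted, clean_pred, artifact_pred)]
    return sum(gaps) / len(gaps)

# Toy flattened "images" for illustration only.
corrupted = [0.5, 0.75, 0.25, 1.0]
clean_pred = [0.5, 0.5, 0.25, 0.75]
artifact_pred = [0.0, 0.25, 0.0, 0.25]

print(consistency_loss(corrupted, clean_pred, artifact_pred))         # 0.0
print(consistency_loss(corrupted, clean_pred, [0.0, 0.0, 0.0, 0.0]))  # 0.125
```

A penalty of this kind discourages the clean pathway from inventing structure that the artifact pathway cannot account for, which bears on the referee's concern about newly introduced distortions.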
Significance. If the empirical results hold under scrutiny, this work could be significant for the field of MRI image reconstruction. It addresses the challenge of contrast-specific methods by using parameter-informed disentanglement and adaptive experts, potentially enabling robust correction across diverse modalities and artifact levels. The zero-shot generalization claim on clinical data is particularly noteworthy if supported by rigorous validation. The integration of pretrained models like ScanCLIP with MoE routing represents a promising direction for handling variability in MRI acquisitions.
Major comments (2)
- [Methods] The description of how ScanCLIP embeddings are used to produce contrast-free anatomical features for the Vision Transformer and Mixture-of-Experts requires more detail on the disentanglement process, loss terms, and any regularization to ensure no new distortions are introduced. This is central to the zero-shot generalization claim.
- [Results] The reported improvements (PSNR +0.75 dB, SSIM +0.0279) should be accompanied by statistical significance tests, standard deviations across runs, and ablation studies isolating the contribution of the parameter-informed disentanglement versus the adaptive experts component.
Minor comments (2)
- Ensure that all acronyms (e.g., PSNR, SSIM, MoE, ViT) are defined at first use in the main text.
- [Abstract] The abstract mentions 'over 30,000 MRI text-image pairs' for pretraining; provide the exact source dataset or reference in the methods section.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our manuscript. We address each major comment below and will revise the manuscript to incorporate additional details and analyses as outlined.
- Referee: [Methods] The description of how ScanCLIP embeddings are used to produce contrast-free anatomical features for the Vision Transformer and Mixture-of-Experts requires more detail on the disentanglement process, loss terms, and any regularization to ensure no new distortions are introduced. This is central to the zero-shot generalization claim.
  Authors: We appreciate the referee's point that additional clarity on the disentanglement mechanism would strengthen the paper. We will expand Section 3 to provide explicit details on how ScanCLIP embeddings are processed to yield contrast-free features (including the subtraction operation and feature routing), the full set of loss terms (contrast alignment, reconstruction, and consistency losses), and any regularization applied to preserve anatomical content. These additions will directly support the zero-shot generalization discussion.
  Revision: yes
- Referee: [Results] The reported improvements (PSNR +0.75 dB, SSIM +0.0279) should be accompanied by statistical significance tests, standard deviations across runs, and ablation studies isolating the contribution of the parameter-informed disentanglement versus the adaptive experts component.
  Authors: We agree that statistical tests, variability measures, and targeted ablations would improve the rigor of the empirical claims. We will add paired statistical significance tests with p-values, report standard deviations from multiple independent runs, and include a new ablation study (with corresponding table) that isolates the parameter-informed disentanglement module from the severity-aware Mixture-of-Experts routing.
  Revision: yes
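The paired significance test requested above can be sketched as an exact sign-flip permutation test on per-subject PSNR differences. This is one reasonable choice among several (a paired t-test or Wilcoxon signed-rank test would also fit); the function name and the toy numbers are hypothetical and are not results from the paper.

```python
import itertools
import statistics

def paired_permutation_pvalue(baseline, proposed):
    """Exact two-sided sign-flip permutation test on paired differences.

    Enumerates all 2**n sign assignments, so it is only practical for
    small n (roughly n <= 20).
    """
    if len(baseline) != len(proposed):
        raise ValueError("inputs must be paired")
    diffs = [p - b for p, b in zip(proposed, baseline)]
    n = len(diffs)
    observed = abs(sum(diffs) / n)
    extreme = 0
    total = 0
    for signs in itertools.product((1, -1), repeat=n):
        flipped = abs(sum(s * d for s, d in zip(signs, diffs)) / n)
        if flipped >= observed - 1e-12:  # tolerance for floating-point ties
            extreme += 1
        total += 1
    return extreme / total

# Made-up per-subject PSNR values (dB), for illustration only.
baseline = [30.1, 29.4, 31.0, 28.7, 30.5, 29.9]
proposed = [30.9, 30.0, 31.6, 29.8, 31.1, 30.8]

diffs = [p - b for p, b in zip(proposed, baseline)]
p_value = paired_permutation_pvalue(baseline, proposed)
print(f"mean gain = {statistics.fmean(diffs):.2f} dB, "
      f"std = {statistics.stdev(diffs):.2f} dB, p = {p_value:.4f}")
```

Because every toy difference is positive, only the all-positive and all-negative sign assignments are as extreme as the observed mean, giving p = 2/64. With six pairs this is the smallest p-value the test can produce, which illustrates why the number of evaluation subjects matters as much as the size of the gain.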
Circularity Check
No significant circularity detected
The abstract and available description present an empirical method relying on an externally pretrained ScanCLIP model (on >30k pairs) plus standard ViT and MoE components, evaluated on public IXI/HCP benchmarks with reported PSNR/SSIM gains. No equations, self-citations, fitted parameters renamed as predictions, or uniqueness theorems are quoted that reduce any claimed result to its own inputs by construction. The central claims rest on observable performance deltas rather than definitional equivalence, making the derivation self-contained.
Axiom & Free-Parameter Ledger
Axioms (2)
- Domain assumption: ScanCLIP embeddings derived solely from acquisition parameters can separate contrast style from anatomical content to produce usable contrast-free features.
- Domain assumption: Motion severity can be estimated accurately enough from image features to select the appropriate expert network without error propagation.