arxiv: 2604.19512 · v1 · submitted 2026-04-21 · 📡 eess.IV


Defining Robust Ultrasound Quality Metrics via an Ultrasound Foundation Model

Ziyang Huang, Bingyan Li, Chen Ma, Tianyi Liu, Yihui Zhai, Hong Xu, Yi Guo, Zeju Li, Yuanyuan Wang


Pith reviewed 2026-05-10 01:07 UTC · model grok-4.3

classification 📡 eess.IV

keywords ultrasound quality metrics · foundation model · perceptual distance · no-reference quality · image reconstruction · diagnostic utility · segmentation correlation


The pith

A TinyUSFM foundation model supplies ultrasound quality metrics that align with clinical task performance and expert preference where PSNR and VGG-LPIPS do not.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper seeks to replace inadequate general-purpose image metrics with ones tailored to ultrasound physics and diagnostic needs. It builds two new scores from a compact ultrasound foundation model: a full-reference perceptual distance that tracks semantic damage in tasks such as segmentation, and a no-reference score that flags localized artifacts without a clean reference image. These scores maintain consistent scales across organs and scanners while correlating with PSNR when references exist. The approach matters because current metrics often reward images that look good numerically but perform poorly in real diagnosis. If the metrics hold, reconstruction and enhancement pipelines can be optimized directly for clinical utility rather than proxy fidelity.

Core claim

Using TinyUSFM, a compact foundation model trained on ultrasound data, the work derives TinyUSFM-uLPIPS from multi-layer token relations and TinyUSFM-NRQ from clean-manifold modeling with worst-region aggregation. The resulting metrics are reported to have four advantages: closer calibration to semantic task damage such as Dice-score drops, stable scoring across anatomical sites and domain shifts, consistency with PSNR in the absence of ground truth, and improved prediction of expert preference, from 47.2% to 72.8% accuracy. On that basis the paper proposes a modality-aligned standard that links algorithmic output to diagnostic value.

What carries the argument

TinyUSFM, a compact ultrasound foundation model whose learned feature space supplies distances for the full-reference metric TinyUSFM-uLPIPS and manifold deviations for the no-reference metric TinyUSFM-NRQ.
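
To make the idea of a distance built from "multi-layer token relations" concrete, here is a minimal sketch. It is not the paper's formulation, which the abstract does not spell out. It assumes that each layer yields a list of token vectors, that the "relation" is the pairwise cosine-similarity matrix between tokens, and that the distance is the mean absolute difference between the reference and test relation matrices, averaged over layers. The function names `cosine_relation` and `relation_distance` and the toy features are hypothetical.

```python
import math

def cosine_relation(tokens):
    """Pairwise cosine-similarity matrix between the token vectors of one layer."""
    # Guard against zero-length vectors by falling back to a norm of 1.0.
    norms = [math.sqrt(sum(v * v for v in t)) or 1.0 for t in tokens]
    return [
        [sum(a * b for a, b in zip(ti, tj)) / (ni * nj)
         for tj, nj in zip(tokens, norms)]
        for ti, ni in zip(tokens, norms)
    ]

def relation_distance(ref_layers, test_layers):
    """Mean absolute difference of token-relation matrices, averaged over layers.

    Assumption: this aggregation is illustrative only, not the published metric.
    """
    per_layer = []
    for ref_tokens, test_tokens in zip(ref_layers, test_layers):
        r_ref = cosine_relation(ref_tokens)
        r_test = cosine_relation(test_tokens)
        n = len(r_ref)
        diff = sum(abs(r_ref[i][j] - r_test[i][j])
                   for i in range(n) for j in range(n))
        per_layer.append(diff / (n * n))
    return sum(per_layer) / len(per_layer)

# Toy features: two layers, two tokens each, two dimensions per token.
ref = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [1.0, -1.0]]]
# A degraded image whose tokens have collapsed onto each other in both layers.
collapsed = [[[1.0, 0.0], [1.0, 0.0]], [[1.0, 1.0], [1.0, 1.0]]]

same = relation_distance(ref, ref)
dist = relation_distance(ref, collapsed)
print(same, dist)
```

Comparing relations between tokens rather than raw activations makes the score sensitive to lost structure (two regions that used to differ now look alike) and insensitive to uniform rescaling of the features, which is one plausible reason such a design could track boundary loss better than pixel error.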

If this is right

Image reconstructions can be optimized in a closed loop using the new metrics to maximize downstream task performance rather than pixel similarity.
No-reference quality scoring becomes feasible while remaining consistent with traditional fidelity measures such as PSNR.
Quality rankings stay comparable and stable when the same metric is applied to images from different anatomical sites or acquisition devices.
Automated selection or enhancement of ultrasound images can achieve higher agreement with sonographer judgment.
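
The cross-organ stability claim is summarized in the paper's figures with Kendall's W, a coefficient of concordance. A minimal sketch of that statistic follows, assuming a table with one row per organ and one column per degradation type, and no tied scores within a row. The table values are made up for illustration.

```python
def rank(values):
    """Ranks 1..n for a list of scores, assuming no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for r, i in enumerate(order, start=1):
        ranks[i] = r
    return ranks

def kendalls_w(score_table):
    """Kendall's W for m rows (organs) ranking n columns (degradations)."""
    m = len(score_table)
    n = len(score_table[0])
    rank_rows = [rank(row) for row in score_table]
    rank_sums = [sum(r[j] for r in rank_rows) for j in range(n)]
    mean_sum = m * (n + 1) / 2
    s = sum((rs - mean_sum) ** 2 for rs in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical metric scores: three organs, four degradation types.
consistent = [
    [0.10, 0.20, 0.35, 0.60],
    [0.05, 0.15, 0.40, 0.70],
    [0.12, 0.22, 0.30, 0.55],
]
# Two organs that rank three degradations in exactly opposite order.
opposed = [[0.1, 0.2, 0.3], [0.3, 0.2, 0.1]]

w_consistent = kendalls_w(consistent)
w_opposed = kendalls_w(opposed)
print(w_consistent, w_opposed)
```

W equals 1 when every organ orders the degradations identically and falls toward 0 as the orderings disagree, so it measures ranking agreement only. Comparable score scales across organs need a separate check, such as the interquartile-range dispersion the paper reports.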

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same foundation-model approach could be repeated for other modalities once comparable small foundation models exist, allowing cross-modality quality standards.
Deploying TinyUSFM-NRQ on real-time scanners could provide immediate feedback on image adequacy before diagnosis.
Further validation on larger multi-center datasets would test whether the observed gains in expert-preference prediction generalize beyond the reported experiments.

Load-bearing premise

The TinyUSFM model has learned representations whose distances and deviations in feature space correspond to clinically relevant quality differences in ultrasound images across organs and scanners.

What would settle it

A dataset of ultrasound reconstructions with varied organs, scanners, and expert ratings where TinyUSFM-uLPIPS fails to correlate more strongly with Dice-score changes than VGG-LPIPS, or where TinyUSFM-NRQ rankings diverge from both PSNR and sonographer preference.
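
The test described above reduces to comparing rank correlations between each metric and the Dice-score drop. A minimal sketch using Kendall's τ (the tau-a variant, which ignores tie corrections) is shown below. The numbers are invented placeholders, not results from the paper, and `metric_a` and `metric_b` are hypothetical stand-ins for two competing metrics.

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    n = len(x)
    return (concordant - discordant) / (n * (n - 1) / 2)

# Placeholder data: Dice drop for four degraded versions of an image set.
dice_drop = [0.02, 0.10, 0.25, 0.40]
metric_a = [0.1, 0.3, 0.5, 0.9]   # orders the degradations like the Dice drop
metric_b = [0.3, 0.1, 0.9, 0.5]   # swaps two pairs

tau_a = kendall_tau(metric_a, dice_drop)
tau_b = kendall_tau(metric_b, dice_drop)
print(tau_a, tau_b)
```

A metric that is better calibrated to task damage should show the higher τ consistently across organs, scanners, and PSNR levels, not just on a single dataset.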

Figures

Figures reproduced from arXiv: 2604.19512 by Bingyan Li, Chen Ma, Hong Xu, Tianyi Liu, Yi Guo, Yihui Zhai, Yuanyuan Wang, Zeju Li, Ziyang Huang.

**Figure 1.** Overview of the TinyUSFM-based ultrasound quality evaluation framework, including metric motivation (left), proposed full-reference (FR) and no-reference (NR) metrics (middle), and target properties with shared-feature downstream utility (right).

**Figure 2.** Task-anchor calibration of FR metrics under PSNR-aligned evaluation. Left: representative DDTI rank–rank scatter at PSNR = 20 for VGG-LPIPS and TinyUSFM-uLPIPS. Right: Kendall’s τ summarized across segmentation anchors and PSNR targets; colored dots indicate individual anchor–PSNR settings.

**Figure 3.** Cross-organ robustness and NRQ baseline comparison under PSNR-aligned evaluation. (a) Cross-organ ranking consistency for FR and NRQ metrics (Kendall’s W, PSNR 20–25 dB). (b) Cross-organ scale dispersion for FR metrics (IQR across organs) on representative degradations (aggregated over PSNR 20–25 dB). (c) NRQ baseline comparison: within-organ correlation (Spearman ρ) and clinician 2AFC agreement.

**Figure 4.** SR comparison under three training objectives. (a–d) Representative breast ultrasound example (ground truth; $L_1$; $L_1 + L_{\mathrm{VGG}}$; $L_1 + L_{\mathrm{TinyUSFM}}$). (e) Quantitative comparison using TinyUSFM-uLPIPS and TinyUSFM-NRQ. In a blinded clinician pairwise evaluation, expert sonographers compared reconstructions from the same case and selected the one more suitable for clinical diagnosis.

Abstract

Clinicians lack a principled framework to quantify diagnostic utility in ultrasound reconstructions. Existing standards like PSNR and VGG-LPIPS are inadequate, failing to account for modality-specific physics or the structural nuances of acoustic imaging. We close this gap with a TinyUSFM-based evaluation framework featuring two distinct metrics: TinyUSFM-uLPIPS, a full-reference perceptual distance based on multi-layer token relations, and TinyUSFM-NRQ, a deployable no-reference quality score utilizing clean-manifold modeling and worst-region aggregation to detect localized harmful artifacts. We demonstrate that the presented metrics have four unique advantages: 1) Task-linked quality, where TinyUSFM-uLPIPS achieves superior calibration with semantic task damage, accurately reflecting Dice-score drops in segmentation where VGG-based metrics fail; 2) Cross-organ comparability, maintaining stable scoring scales and consistent severity rankings across diverse anatomical sites and domain-shifted data; 3) PSNR-consistent sensitivity, with TinyUSFM-NRQ providing a reliable quality score without ground-truth images that remains consistent with traditional fidelity benchmarks (i.e., PSNR); and 4) Clinical utility, improving the prediction of expert preference from 47.2% to 72.8% accuracy and producing super-resolution reconstructions preferred by sonographers. By integrating these advantages into a unified assessment and optimization loop, this work establishes a modality-aligned standard that finally bridges the gap between algorithmic performance and diagnostic utility. Code: https://github.com/sextant-fable/US-Metrics
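
The abstract names two ingredients of the no-reference score, clean-manifold modeling and worst-region aggregation, without defining them. The sketch below shows one way those pieces could fit together, under loud assumptions: the "clean manifold" is approximated by a per-dimension mean and variance of patch features from clean images, each patch is scored by its root-mean-square z-score against those statistics, and the image score is the mean of the worst fraction of patches. Here higher means worse, and the paper's sign convention may differ. All names and the toy features are hypothetical.

```python
import math

def fit_clean_stats(clean_features):
    """Per-dimension mean and variance of patch features from clean images."""
    n = len(clean_features)
    dims = len(clean_features[0])
    means = [sum(f[d] for f in clean_features) / n for d in range(dims)]
    variances = [
        max(sum((f[d] - means[d]) ** 2 for f in clean_features) / n, 1e-6)
        for d in range(dims)
    ]
    return means, variances

def patch_deviation(feature, means, variances):
    """Root-mean-square z-score of one patch feature against the clean statistics."""
    z2 = [(x - m) ** 2 / v for x, m, v in zip(feature, means, variances)]
    return math.sqrt(sum(z2) / len(z2))

def worst_region_score(patch_features, means, variances, worst_fraction=0.1):
    """Mean deviation over the worst `worst_fraction` of patches (at least one)."""
    deviations = sorted(
        (patch_deviation(f, means, variances) for f in patch_features),
        reverse=True,
    )
    k = max(1, math.ceil(worst_fraction * len(deviations)))
    return sum(deviations[:k]) / k

# Toy clean features (2-D) and a test image with one badly corrupted patch.
clean = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]]
means, variances = fit_clean_stats(clean)
image = [[0.5, 0.5]] * 9 + [[3.5, 0.5]]

score = worst_region_score(image, means, variances)
mean_score = sum(patch_deviation(f, means, variances) for f in image) / len(image)
print(score, mean_score)
```

In this toy case a plain average dilutes the single corrupted patch by a factor of ten, while the worst-region score reports it at full strength. That matches the stated motivation of detecting localized harmful artifacts.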

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper introduces two ultrasound-specific quality metrics from a TinyUSFM foundation model with reported gains in task calibration and expert preference, but the abstract leaves the supporting evidence thin.


The main thing to know is that this paper builds two ultrasound quality metrics on a foundation model and reports they do better than standard ones at matching task performance and expert views. The new parts are the designs for TinyUSFM-uLPIPS using multi-layer token relations and TinyUSFM-NRQ with clean-manifold modeling plus worst-region aggregation. They back this with claims of superior calibration to Dice scores in segmentation, stable cross-organ results, PSNR consistency for the no-reference version, and lifting expert preference prediction to 72.8% from 47.2%.

It does well by focusing on a practical issue in medical imaging. Ultrasound needs metrics that respect its specific characteristics like speckle and attenuation, and the authors try to deliver that with concrete comparisons to VGG-LPIPS and PSNR.

The soft spots come from the presentation. The abstract skips equations, training details, dataset info, and statistical tests, so the numbers are hard to trust without the full text. The key assumption that the model's features reflect acoustic physics and clinical quality, not just image stats, is not checked against domain shifts or held-out data. The circularity concern is real here because the model is ultrasound-specific, and without explicit verification the gains could stem from training data rather than modality alignment.

This is for people doing ultrasound reconstruction or super-resolution work. Readers who want modality-specific perceptual metrics will get something out of the ideas and the reported numbers.

It deserves a serious referee. The topic is relevant and the ideas are testable. I would recommend sending it for peer review, but with requests for the missing methodological details and generalization experiments.

Referee Report

3 major / 1 minor

Summary. The paper claims to define robust ultrasound quality metrics using an Ultrasound Foundation Model (TinyUSFM). It introduces TinyUSFM-uLPIPS as a full-reference perceptual distance metric based on multi-layer token relations and TinyUSFM-NRQ as a no-reference quality score using clean-manifold modeling and worst-region aggregation. The metrics are said to offer task-linked quality by better correlating with Dice score drops in segmentation tasks, cross-organ comparability with stable scoring across anatomical sites, PSNR-consistent sensitivity for no-reference use, and clinical utility by improving expert preference prediction accuracy to 72.8%. The work positions these as a modality-aligned standard bridging algorithmic performance and diagnostic utility, with code available on GitHub.

Significance. Should the central assumption hold—that the TinyUSFM feature space faithfully encodes ultrasound-specific physics and clinical quality nuances—the proposed metrics could provide a valuable, unified framework for assessing and optimizing ultrasound reconstructions and super-resolution methods. This would address limitations of generic metrics like PSNR and VGG-LPIPS. The emphasis on clinical validation through expert preferences is a strength, as is the open-source release. However, the overall significance is currently limited by insufficient evidence in the abstract for the claims, and the potential for the metrics to reflect model training artifacts rather than true modality alignment.

major comments (3)

[Abstract] The claim that TinyUSFM-uLPIPS 'achieves superior calibration with semantic task damage, accurately reflecting Dice-score drops in segmentation where VGG-based metrics fail' is central to the task-linked quality advantage but is presented without supporting equations, experimental details, or quantitative results (e.g., correlation coefficients); this undermines the ability to evaluate if the metric is load-bearing for the paper's conclusions.
[Abstract] All four advantages presuppose that multi-layer token relations and clean-manifold deviations in TinyUSFM encode acoustic imaging phenomena (speckle, attenuation, reverberation) and clinical quality rather than generic image statistics; no tests for this (e.g., against physics-based degradations or unseen scanners) are described, posing a correctness risk to the cross-organ and modality-alignment claims.
[Abstract] The clinical utility is quantified as improving expert preference prediction from 47.2% to 72.8% accuracy, but without details on the evaluation protocol, sample size, or statistical significance, it is difficult to assess the robustness of this result which is key to the paper's closing claim.
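
The third comment turns on sample size. As a rough illustration of why it matters, the sketch below computes an exact one-sided binomial p-value for a two-alternative preference accuracy against the 50% chance level. The trial count of 250 is a hypothetical placeholder, because the abstract reports no sample size, and the test assumes independent trials.

```python
from math import comb

def binomial_p_value(successes, trials, chance=0.5):
    """One-sided exact p-value: P(X >= successes) for X ~ Binomial(trials, chance)."""
    return sum(
        comb(trials, k) * chance ** k * (1 - chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Hypothetical: 250 paired comparisons, of which 72.8% (182) are predicted correctly.
trials = 250                      # placeholder, not from the paper
successes = round(0.728 * trials)
p_value = binomial_p_value(successes, trials)
print(successes, p_value)
```

The same accuracy from a few dozen trials would give a much weaker result, which is why the protocol and trial count are needed to assess the 72.8% figure.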

minor comments (1)

[Abstract] Consider expanding the acronym TinyUSFM upon first use for clarity.

Simulated Author's Rebuttal

3 responses · 0 unresolved

We thank the referee for their constructive comments, which correctly note that the abstract would be strengthened by including more quantitative details and experimental context for the central claims. We will revise the abstract accordingly to address these points. Below we respond point-by-point to the major comments.


Referee: [Abstract] The claim that TinyUSFM-uLPIPS 'achieves superior calibration with semantic task damage, accurately reflecting Dice-score drops in segmentation where VGG-based metrics fail' is central to the task-linked quality advantage but is presented without supporting equations, experimental details, or quantitative results (e.g., correlation coefficients); this undermines the ability to evaluate if the metric is load-bearing for the paper's conclusions.

Authors: The full manuscript provides the experimental protocol, segmentation task details, and quantitative correlation results (including direct comparisons to VGG-LPIPS) in the results section. We agree the abstract would benefit from including the key correlation coefficients to make this claim more self-contained. We will revise the abstract to incorporate these quantitative results and a brief reference to the segmentation evaluation. revision: yes

Referee: [Abstract] All four advantages presuppose that multi-layer token relations and clean-manifold deviations in TinyUSFM encode acoustic imaging phenomena (speckle, attenuation, reverberation) and clinical quality rather than generic image statistics; no tests for this (e.g., against physics-based degradations or unseen scanners) are described, posing a correctness risk to the cross-organ and modality-alignment claims.

Authors: The cross-organ comparability and domain-shift robustness (including unseen scanners) are validated through experiments on multiple anatomical sites and domain-shifted ultrasound datasets, as reported in Sections 4.2 and 4.3. We will revise the abstract to briefly note the use of diverse, domain-shifted data supporting these claims. revision: yes

Referee: [Abstract] The clinical utility is quantified as improving expert preference prediction from 47.2% to 72.8% accuracy, but without details on the evaluation protocol, sample size, or statistical significance, it is difficult to assess the robustness of this result which is key to the paper's closing claim.

Authors: The expert preference study protocol, sample size, and statistical analysis are detailed in Section 5. We will revise the abstract to include the sample size and note the statistical significance of the accuracy improvement to 72.8%. revision: yes

Circularity Check

0 steps flagged

No significant circularity; metrics empirically validated against external clinical and task benchmarks


The paper defines TinyUSFM-uLPIPS and TinyUSFM-NRQ using features from the TinyUSFM foundation model, then demonstrates four advantages through direct comparisons to independent external measures: calibration against Dice-score drops in segmentation, stable rankings across organs and domain shifts, consistency with PSNR, and improved expert preference prediction (47.2% to 72.8%). These validations rely on task performance, fidelity benchmarks, and human judgments that are not derived from the model's internal distances or manifold by construction. No load-bearing self-citations, self-definitional reductions, or fitted inputs renamed as predictions appear in the derivation chain; the central claims rest on observable correlations rather than tautological equivalence to the model inputs.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review; the framework rests on an ultrasound foundation model whose training objective, data distribution, and architectural assumptions are not stated. No explicit free parameters or invented physical entities are named, but the model itself functions as a learned representation whose validity is taken as given.

