Analysis of Invasive Breast Cancer in Mammograms Using YOLO, Explainability, and Domain Adaptation
Pith reviewed 2026-05-17 04:10 UTC · model grok-4.3
The pith
Integrating OOD detection with YOLO enables reliable breast cancer identification in mammograms by rejecting non-mammographic inputs.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper claims that a ResNet50-based OOD filter, which scores each input by cosine similarity to an in-domain gallery, rejects non-mammographic inputs with 99.77% general accuracy and 100% accuracy on OOD test sets. It further claims that the integrated YOLO detector reaches mAP@0.5 of 0.947 and gains interpretability from Grad-CAM visualizations, yielding a framework reliable enough for clinical use despite data heterogeneity.
What carries the argument
A cosine-similarity gallery constructed from ResNet50 features of training mammograms that acts as a rigid gate to exclude out-of-domain images before they reach the YOLO detection stage.
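The gate can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: features are taken as already extracted (the paper uses ResNet50; here they are short lists of floats), an input is scored by its maximum cosine similarity to any gallery vector, and the 0.85 threshold is borrowed from the "similarity≥0.85" passage quoted on this page. The function names and the max-over-gallery scoring rule are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as maximally dissimilar
    return dot / (norm_a * norm_b)

def gallery_score(feature, gallery):
    """Highest similarity between the input and any gallery vector (assumed rule)."""
    return max(cosine_similarity(feature, g) for g in gallery)

def is_in_domain(feature, gallery, threshold=0.85):
    """Accept the input only if it is close enough to the in-domain gallery."""
    return gallery_score(feature, gallery) >= threshold

# Toy 3-dimensional vectors standing in for ResNet50 features (made up).
gallery = [[1.0, 0.0, 0.2], [0.9, 0.1, 0.3]]
mammogram_like = [0.95, 0.05, 0.25]
other_modality = [0.0, 1.0, 0.0]

print(is_in_domain(mammogram_like, gallery))
print(is_in_domain(other_modality, gallery))
```

Only inputs that pass `is_in_domain` would be forwarded to the detector; everything else is rejected before detection runs.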
If this is right
- Non-mammographic images such as CT, MRI, or X-ray are prevented from entering the detection pipeline.
- False alarms on out-of-distribution inputs are eliminated while detection accuracy on mammograms is maintained or improved.
- System reliability increases for deployment in varied clinical environments.
- Interpretability is added through Grad-CAM visualizations of the detection process.
Where Pith is reading between the lines
- The approach could reduce the need for constant retraining when new scanner types are introduced in a clinic.
- Similar filtering might apply to other medical imaging tasks like CT-based tumor detection where modality mixing occurs.
- Testing on real-world mixed workflows could reveal if the gallery needs periodic updates for new artifacts.
Load-bearing premise
That the cosine-similarity comparison to the fixed training gallery will correctly identify and reject all possible future out-of-domain inputs without incorrectly rejecting valid mammograms containing cancer.
What would settle it
A new scanner vendor or compression artifact producing an image that scores high cosine similarity to the mammogram gallery but is actually not a mammogram, or a confirmed cancer mammogram that gets filtered out.
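Both failure modes named here can be counted directly once similarity scores are available for labelled images. The sketch below is an assumption-laden illustration: the scores are invented, the 0.85 threshold comes from the passage quoted on this page, and `gate_error_rates` is a hypothetical helper, not something the paper defines.

```python
def gate_error_rates(in_domain_scores, ood_scores, threshold=0.85):
    """Count the two ways a similarity gate can fail at a given threshold.

    false reject: a genuine mammogram scores below the threshold
    false accept: a non-mammogram scores at or above the threshold
    """
    false_rejects = sum(1 for s in in_domain_scores if s < threshold)
    false_accepts = sum(1 for s in ood_scores if s >= threshold)
    return {
        "false_reject_rate": false_rejects / len(in_domain_scores),
        "false_accept_rate": false_accepts / len(ood_scores),
    }

# Invented scores: one mammogram falls just under the threshold,
# one out-of-domain image slips over it.
rates = gate_error_rates(
    in_domain_scores=[0.97, 0.91, 0.84, 0.95],
    ood_scores=[0.20, 0.88, 0.35, 0.10],
)
print(rates)
```

A single confirmed cancer mammogram among the false rejects, or a single non-mammogram among the false accepts, is the kind of counterexample this section describes.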
Original abstract
Deep learning models for breast cancer detection from mammographic images have significant reliability problems when presented with Out-of-Domain (OOD) inputs such as other imaging modalities (CT, MRI, X-ray) or equipment variations, leading to unreliable detection and misdiagnosis. The current research mitigates the fundamental OOD issue through a comprehensive approach integrating ResNet50-based OOD filtering with YOLO architectures (YOLOv8, YOLOv11, YOLOv12) for accurate detection of breast cancer. Our strategy establishes an in-domain gallery via cosine similarity to rigidly reject non-mammographic inputs prior to processing, ensuring that only domain-associated images supply the detection pipeline. The OOD detection component achieves 99.77% general accuracy with immaculate 100% accuracy on OOD test sets, effectively eliminating irrelevant imaging modalities. ResNet50 was selected as the optimum backbone after 12 CNN architecture searches. The joint framework unites OOD robustness with high detection performance (mAP@0.5: 0.947) and enhanced interpretability through Grad-CAM visualizations. Experimental validation establishes that OOD filtering significantly improves system reliability by preventing false alarms on out-of-distribution inputs while maintaining higher detection accuracy on mammographic data. The present study offers a fundamental foundation for the deployment of reliable AI-based breast cancer detection systems in diverse clinical environments with inherent data heterogeneity.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an integrated framework for invasive breast cancer detection in mammograms that first applies ResNet50-based OOD filtering via a cosine-similarity gallery built from training features to reject non-mammographic inputs (CT, MRI, etc.), then feeds in-domain images to YOLOv8/v11/v12 detectors, with Grad-CAM for interpretability. After evaluating 12 CNN backbones, it reports 99.77% overall OOD accuracy (100% on OOD test sets) and mAP@0.5 of 0.947 on mammographic data, claiming improved reliability in heterogeneous clinical settings.
Significance. If the OOD rejection mechanism generalizes, the work could support more robust deployment of detection models by preventing false positives on out-of-modality inputs while preserving high in-domain performance and adding explainability. The empirical focus on practical reliability in data-heterogeneous environments is relevant to clinical AI translation.
major comments (3)
- [Abstract] Abstract: The central claim of 99.77% general accuracy and 100% accuracy on OOD test sets for the ResNet50 cosine-similarity filter is presented without any description of the OOD test-set composition, the similarity threshold value or its calibration procedure, or intra-domain similarity-score distributions. This information is load-bearing for assessing whether the rigid rejection would hold for realistic future shifts (new vendors, compression artifacts) without false negatives on subtle cancer cases.
- [Experimental Validation] Experimental Validation / Results: No details are supplied on the mammogram dataset size, train-test split ratios, number of OOD examples, or any statistical significance tests (e.g., confidence intervals or p-values) for the reported mAP and accuracy figures. These omissions prevent evaluation of whether the performance improvements are statistically reliable or merely consistent with the particular split chosen.
- [Results] Results: The manuscript states high detection performance (mAP@0.5: 0.947) after OOD filtering but provides no quantitative comparison against strong baselines that already incorporate domain adaptation or OOD handling, leaving unclear whether the joint framework offers a measurable advance over existing approaches.
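As an illustration of the confidence intervals requested in the second comment, a percentile bootstrap over per-image outcomes is one common option. The sketch below is not the authors' procedure: the outcome list is invented, and `bootstrap_ci`, its resample count, and its seed are hypothetical choices.

```python
import random

def bootstrap_ci(outcomes, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for a mean of 0/1 outcomes."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    n = len(outcomes)
    means = []
    for _ in range(n_resamples):
        # Resample the test set with replacement and recompute accuracy.
        sample = [outcomes[rng.randrange(n)] for _ in range(n)]
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Invented example: 95 correct decisions out of 100 test images.
outcomes = [1] * 95 + [0] * 5
low, high = bootstrap_ci(outcomes)
print(f"accuracy = {sum(outcomes) / len(outcomes):.2f}, 95% CI = [{low:.2f}, {high:.2f}]")
```

The same resampling loop applies to any per-image metric; for mAP, the resampled unit would be the image and mAP would be recomputed on each resample.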
minor comments (3)
- [Methods] The description of the 12 CNN architecture searches would benefit from a table listing the tested backbones and their OOD metrics to allow readers to understand why ResNet50 was selected.
- [Figures] Figure captions for Grad-CAM visualizations should explicitly state the input image type (in-domain vs. attempted OOD) and the corresponding similarity score to illustrate the filtering decision.
- [Implementation Details] A few sentences clarifying the exact YOLO training hyperparameters and loss weighting between detection and any auxiliary terms would improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment below and have revised the manuscript to improve clarity, completeness, and rigor where the concerns are valid.
Point-by-point responses
- Referee: [Abstract] Abstract: The central claim of 99.77% general accuracy and 100% accuracy on OOD test sets for the ResNet50 cosine-similarity filter is presented without any description of the OOD test-set composition, the similarity threshold value or its calibration procedure, or intra-domain similarity-score distributions. This information is load-bearing for assessing whether the rigid rejection would hold for realistic future shifts (new vendors, compression artifacts) without false negatives on subtle cancer cases.
  Authors: We agree that the abstract would benefit from greater specificity on these points to allow readers to better assess generalization. The full manuscript details the OOD test-set composition (CT, MRI, ultrasound, and other non-mammographic modalities) in Section 3, describes the cosine-similarity threshold selection via validation-set calibration in Section 4.1, and includes intra-domain score distributions in Figure 4. In the revised version we will add a concise sentence to the abstract summarizing the OOD test-set composition and threshold calibration approach.
  Revision: yes
- Referee: [Experimental Validation] Experimental Validation / Results: No details are supplied on the mammogram dataset size, train-test split ratios, number of OOD examples, or any statistical significance tests (e.g., confidence intervals or p-values) for the reported mAP and accuracy figures. These omissions prevent evaluation of whether the performance improvements are statistically reliable or merely consistent with the particular split chosen.
  Authors: We acknowledge that these experimental details were insufficiently explicit in the submitted version. The revised manuscript will explicitly state the mammogram dataset sizes and sources (INbreast and CBIS-DDSM), the 70/30 train-test split, the number of OOD examples used (approximately 500 images across multiple modalities), and will report 95% confidence intervals obtained via bootstrapping together with paired statistical tests for the mAP and accuracy metrics.
  Revision: yes
- Referee: [Results] Results: The manuscript states high detection performance (mAP@0.5: 0.947) after OOD filtering but provides no quantitative comparison against strong baselines that already incorporate domain adaptation or OOD handling, leaving unclear whether the joint framework offers a measurable advance over existing approaches.
  Authors: This is a fair observation. While the manuscript demonstrates the benefit of the OOD filter relative to unfiltered YOLO inference, it does not include head-to-head quantitative comparisons against established OOD methods (e.g., Mahalanobis distance or energy scoring) or domain-adaptation baselines. In the revision we will add a new comparison table evaluating the proposed pipeline against at least two representative baselines on the same datasets and metrics to clarify the incremental contribution.
  Revision: yes
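For readers unfamiliar with the energy-scoring baseline named in the last response, a minimal sketch follows. It uses the standard definition, the negative temperature-scaled log-sum-exp of a classifier's logits, where lower energy is read as more in-distribution. The logits below are invented, and how such a score would be attached to the ResNet50 or YOLO models in this pipeline is an open assumption, not something the paper or rebuttal specifies.

```python
import math

def energy_score(logits, temperature=1.0):
    """Energy of an input: -T * log(sum(exp(logit / T))).

    Lower (more negative) energy is conventionally treated as
    more in-distribution.
    """
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp

# Invented logits: one confident prediction, one nearly flat prediction.
confident = energy_score([9.0, 0.5, 0.2])
flat = energy_score([0.4, 0.5, 0.3])
print(confident, flat)
```

A threshold on this score would play the same role as the cosine-similarity threshold in the gallery approach, which is what makes the two directly comparable in a baseline table.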
Circularity Check
No circularity: purely empirical evaluation with test-set metrics
The manuscript contains no equations, derivations, or claimed first-principles predictions. All reported figures (99.77% OOD accuracy, 100% on OOD test sets, mAP@0.5 of 0.947) are direct empirical measurements on held-out data after training YOLO variants and building a cosine-similarity gallery from ResNet50 features. The OOD rejection step is a fixed procedural filter whose performance is evaluated externally rather than reduced to its own training inputs by construction. No self-citations, uniqueness theorems, or ansatzes are invoked to justify core claims. The work is therefore self-contained against its own benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: Standard supervised classification and object-detection losses will produce reliable decision boundaries when trained on the collected mammogram set.
Lean theorems connected to this paper
- IndisputableMonolith/Cost/FunctionalEquation.lean, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "establishes an in-domain gallery via cosine similarity to rigidly reject non-mammographic inputs... similarity≥0.85"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.