arxiv: 2604.22139 · v1 · submitted 2026-04-24 · 💻 cs.CV · cs.LG


Anatomy-Aware Unsupervised Detection and Localization of Retinal Abnormalities in Optical Coherence Tomography

Tania Haghighi, Sina Gholami, Hamed Tabkhi, Minhaj Nur Alam


Pith reviewed 2026-05-08 12:54 UTC · model grok-4.3


keywords: unsupervised anomaly detection, optical coherence tomography, retinal abnormalities, anomaly localization, discrete latent model, medical image analysis, domain generalization


The pith

An unsupervised anomaly detection method for OCT retinal images learns normative anatomy from healthy scans alone to identify pathologies through reconstruction errors.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The authors develop an unsupervised framework for detecting and localizing retinal abnormalities in Optical Coherence Tomography (OCT) images. It trains a discrete latent model on normal B-scans to learn the distribution of healthy retinal anatomy. Retinal layer-aware supervision and structured triplet learning are incorporated to better distinguish healthy from abnormal representations. Anomalies are then detected and segmented based on reconstruction discrepancies at both image and pixel levels. The method outperforms several baseline models on the Kermany dataset with an AUROC of 0.799, shows strong generalization on the Srinivasan dataset with AUROC 0.884, and achieves competitive results on the RETOUCH benchmark.

Core claim

The central discovery is that a discrete latent model trained exclusively on normal OCT B-scans, augmented with anatomy-aware supervision, can capture the normative distribution of healthy retinal structures, allowing reliable detection and localization of abnormalities solely through discrepancies in image reconstruction.
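As a rough illustration of scoring by reconstruction discrepancy, the sketch below computes a pixel-level anomaly map and an image-level score from an input and its reconstruction. The absolute-difference residual, the top-fraction pooling in `image_score`, and the fixed threshold are assumptions made for this example; the summary above does not state which discrepancy measure or pooling rule the paper uses.

```python
import numpy as np

def anomaly_map(image, reconstruction):
    """Pixel-level anomaly map: absolute reconstruction residual (assumed measure)."""
    image = np.asarray(image, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    if image.shape != reconstruction.shape:
        raise ValueError("image and reconstruction must have the same shape")
    return np.abs(image - reconstruction)

def image_score(residual, top_fraction=0.01):
    """Image-level score: mean of the largest residuals (hypothetical pooling rule)."""
    flat = np.sort(residual.ravel())
    k = max(1, int(round(top_fraction * flat.size)))
    return float(flat[-k:].mean())

# Toy example: a flat "healthy" reconstruction versus an input with a bright blob.
recon = np.zeros((8, 8))
scan = recon.copy()
scan[2:4, 2:4] = 1.0            # simulated lesion the model fails to reproduce
residual = anomaly_map(scan, recon)
mask = residual > 0.5           # hypothetical fixed threshold for localization
score = image_score(residual, top_fraction=0.05)
```

In this toy case the mask covers exactly the four perturbed pixels, and the image-level score is driven by the worst-reconstructed region rather than diluted by the healthy background.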

What carries the argument

The discrete latent model enhanced by retinal layer-aware supervision and structured triplet learning, which separates healthy and pathological representations in the latent space.
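The triplet term can be sketched with the standard margin-based formulation, which penalizes an anchor embedding for being closer to a negative than to a positive. This is the generic textbook loss, not necessarily the paper's exact variant; which embeddings serve as anchor, positive, and negative (for example, healthy scans versus perturbed samples) and the margin value are assumptions here.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Mean margin-based triplet loss over a batch of embeddings of shape (N, D)."""
    anchor, positive, negative = (
        np.asarray(a, dtype=np.float64) for a in (anchor, positive, negative)
    )
    d_pos = np.linalg.norm(anchor - positive, axis=-1)  # anchor-to-positive distance
    d_neg = np.linalg.norm(anchor - negative, axis=-1)  # anchor-to-negative distance
    return float(np.mean(np.maximum(0.0, d_pos - d_neg + margin)))

anchor = np.array([[0.0, 0.0]])
positive = np.array([[0.0, 1.0]])       # e.g. another healthy embedding (assumed role)
far_negative = np.array([[3.0, 4.0]])   # e.g. a perturbed sample, well separated
near_negative = np.array([[0.0, 1.0]])  # a negative no farther away than the positive

loss_separated = triplet_loss(anchor, positive, far_negative)   # 0.0
loss_collapsed = triplet_loss(anchor, positive, near_negative)  # 1.0
```

The loss vanishes once the negative is farther from the anchor than the positive by at least the margin, which is the sense in which the term pushes healthy and perturbed representations apart.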

Load-bearing premise

The model trained only on normal B-scans fully represents the variation in healthy retinal anatomy across different devices, populations, and imaging conditions, so that any reconstruction failure points to true pathology rather than unseen healthy variation or artifacts.

What would settle it

A test set of OCT scans from healthy eyes acquired on a different scanner or from a demographic group absent from the training data, checking if the false positive rate remains low.
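One way such a check could be run is sketched below: fix a decision threshold from anomaly scores on the healthy training distribution, then measure how often healthy scans from a new scanner exceed it. The 95th-percentile threshold and the synthetic scores are illustrative assumptions, not values from the paper.

```python
import numpy as np

def threshold_from_healthy(train_scores, quantile=0.95):
    """Decision threshold taken as a quantile of healthy training scores (assumed rule)."""
    return float(np.quantile(np.asarray(train_scores, dtype=np.float64), quantile))

def false_positive_rate(healthy_scores, threshold):
    """Fraction of healthy scans whose anomaly score exceeds the threshold."""
    healthy_scores = np.asarray(healthy_scores, dtype=np.float64)
    return float(np.mean(healthy_scores > threshold))

# Synthetic anomaly scores; in practice these come from the trained model.
train_healthy = np.linspace(0.0, 1.0, 101)
same_device_healthy = np.array([0.10, 0.20, 0.30, 0.40])
new_device_healthy = np.array([0.50, 0.90, 0.97, 0.99])

thr = threshold_from_healthy(train_healthy, quantile=0.95)
fpr_same = false_positive_rate(same_device_healthy, thr)
fpr_new = false_positive_rate(new_device_healthy, thr)
```

A large gap between `fpr_same` and `fpr_new` would suggest that reconstruction error is responding to device or population shift rather than to pathology.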

Figures

Figures reproduced from arXiv: 2604.22139 by Hamed Tabkhi, Minhaj Nur Alam, Sina Gholami, Tania Haghighi.

**Figure 1.** Overview of the proposed method. (a) Retina layer extraction and generation of perturbed samples for triplet learning. (b)

**Figure 2.** Qualitative comparison of anomaly segmentation results

**Figure 3.** Left: Image-level anomaly scores sorted by datapoint

Original abstract

Reliable automated analysis of Optical Coherence Tomography (OCT) imaging is crucial for diagnosing retinal disorders but faces a critical barrier: the need for expensive, labor-intensive expert annotations. Supervised deep learning models struggle to generalize across diverse pathologies, imaging devices, and patient populations due to their restricted vocabulary of annotated abnormalities. We propose an unsupervised anomaly detection framework that learns the normative distribution of healthy retinal anatomy without lesion annotations, directly addressing annotation efficiency challenges in clinical deployment. Our approach leverages a discrete latent model trained on normal B-scans to capture OCT-specific structural patterns. To enhance clinical robustness, we incorporate retinal layer-aware supervision and structured triplet learning to separate healthy from pathological representations, improving model reliability across varied imaging conditions. During inference, anomalies are detected and localized via reconstruction discrepancies, enabling both image and pixel-level identification without requiring disease-specific labels. On the Kermany dataset (AUROC: 0.799), our method substantially outperforms VAE, VQVAE, VQGAN, and f-AnoGAN baselines. Critically, cross-dataset evaluation on Srinivasan achieves AUROC 0.884 with superior generalization, demonstrating robust domain adaptation. On the external RETOUCH benchmark, unsupervised anomaly segmentation achieves competitive Dice (0.200) and mIoU (0.117) scores, validating reproducibility across institutions.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note (private letter to a colleague)

The paper adds retinal layer supervision and triplet loss to a discrete latent model for unsupervised OCT anomaly detection, but the abstract gives no architecture details or checks on healthy variation coverage.


The core idea here is training a discrete latent model only on normal OCT B-scans, then flagging anomalies through reconstruction error while using layer-aware supervision and triplet terms to sharpen the separation. This produces AUROC 0.799 on Kermany, 0.884 on Srinivasan cross-dataset, and modest Dice 0.200 on RETOUCH segmentation. Those numbers beat the listed baselines, and the unsupervised framing directly targets the annotation shortage in retinal screening.

Referee Report

2 major / 0 minor

Summary. The paper claims to introduce an unsupervised anomaly detection and localization framework for retinal abnormalities in OCT B-scans. It uses a discrete latent model trained exclusively on normal data, enhanced with retinal layer-aware supervision and structured triplet learning. Anomalies are detected via reconstruction discrepancies at image and pixel levels. Reported results include an AUROC of 0.799 on the Kermany dataset, outperforming several baselines (VAE, VQVAE, VQGAN, f-AnoGAN); a cross-dataset AUROC of 0.884 on the Srinivasan dataset; and competitive unsupervised segmentation scores (Dice 0.200, mIoU 0.117) on the RETOUCH benchmark.
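For readers less familiar with the segmentation metrics quoted here, the following sketch computes Dice and IoU for a single pair of binary masks. How the paper aggregates these into its reported Dice and mIoU (per image, per class, or per volume) is not stated in this summary, so the function below covers only the per-mask definitions; returning 1.0 for two empty masks is a convention chosen for this example.

```python
import numpy as np

def dice_and_iou(pred, target):
    """Dice coefficient and intersection-over-union for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    total = pred.sum() + target.sum()
    dice = 2.0 * inter / total if total > 0 else 1.0  # both masks empty: perfect match
    iou = inter / union if union > 0 else 1.0
    return float(dice), float(iou)

pred = np.array([[1, 1], [0, 0]])
target = np.array([[1, 0], [1, 0]])
dice, iou = dice_and_iou(pred, target)  # one overlapping pixel out of three in the union
```

For a single mask pair the two metrics are tied by Dice = 2·IoU / (1 + IoU); averaged values such as those reported for RETOUCH need not satisfy this identity exactly.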

Significance. If the central results hold, this work has potential significance in advancing annotation-efficient, generalizable methods for medical image analysis, particularly for OCT where expert annotations are costly. The emphasis on cross-dataset evaluation and anatomy-aware components addresses important practical challenges in clinical deployment. However, the significance is limited by the lack of verification for the key assumption regarding coverage of healthy variations.

major comments (2)

Abstract: The abstract reports AUROC and segmentation metrics but supplies no architecture details, training hyperparameters, statistical tests, or ablation studies, preventing verification that reported gains are robust or free of post-hoc tuning.
Method: The central claim relies on the assumption that the discrete latent model trained on normal B-scans captures the full range of healthy anatomical variation; however, no quantitative validation such as coverage analysis or error histograms on held-out healthy data from different devices is described, undermining the interpretation of the reported cross-dataset generalization (AUROC 0.884 on Srinivasan).

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and detailed review. We address each major comment point by point below and will incorporate revisions to strengthen the manuscript.


Referee: Abstract: The abstract reports AUROC and segmentation metrics but supplies no architecture details, training hyperparameters, statistical tests, or ablation studies, preventing verification that reported gains are robust or free of post-hoc tuning.

Authors: We agree that the abstract is concise and omits these specifics, which are instead detailed in the main text. Section 3 describes the discrete latent model architecture with layer-aware supervision and triplet learning; Section 4.1 provides training hyperparameters; and Section 5.3 along with the supplementary material present ablation studies and statistical tests. To address the concern, we will revise the abstract to briefly reference these core elements and direct readers to the relevant sections for verification of robustness. revision: yes
Referee: Method: The central claim relies on the assumption that the discrete latent model trained on normal B-scans captures the full range of healthy anatomical variation; however, no quantitative validation such as coverage analysis or error histograms on held-out healthy data from different devices is described, undermining the interpretation of the reported cross-dataset generalization (AUROC 0.884 on Srinivasan).

Authors: We acknowledge this point as valid. While the cross-dataset results on Srinivasan support generalization, the manuscript does not include explicit quantitative validation such as coverage analysis or error histograms on held-out healthy data from different devices. In the revised version, we will add these analyses by computing and presenting reconstruction error distributions and coverage metrics on additional held-out normal samples from both datasets to better substantiate the assumption and the cross-dataset findings. revision: yes

Circularity Check

0 steps flagged

No circularity: standard unsupervised anomaly detection with empirical evaluation


The paper trains a discrete latent model on normal B-scans only, then detects anomalies via reconstruction error plus auxiliary layer-aware and triplet terms. Reported AUROCs (0.799 on Kermany, 0.884 on Srinivasan) and RETOUCH segmentation scores are obtained by applying the trained model to labeled test sets containing pathologies; these metrics are not equivalent to the training inputs by construction, nor do they arise from fitting a parameter and relabeling it as a prediction. No self-citation load-bearing steps, uniqueness theorems imported from prior author work, ansatzes smuggled via citation, or renaming of known results appear in the derivation. The cross-dataset generalization claim is an empirical observation rather than a tautological output, leaving the central claims self-contained against external benchmarks.
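To make concrete why an AUROC computed this way depends on held-out labels rather than on anything fitted during training, here is a small sketch of the rank-based definition: the probability that a randomly chosen pathological scan receives a higher anomaly score than a randomly chosen healthy one, with ties counted as one half. The scores and labels are made-up illustrative values.

```python
import numpy as np

def auroc(scores, labels):
    """AUROC via pairwise comparison of positive (abnormal) and negative (healthy) scores."""
    scores = np.asarray(scores, dtype=np.float64)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative sample")
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

# Illustrative anomaly scores with labels 1 = pathological, 0 = healthy.
example_scores = [0.10, 0.40, 0.35, 0.80]
example_labels = [0, 0, 1, 1]
value = auroc(example_scores, example_labels)  # 3 of 4 pairs correctly ordered
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch; a sorted-rank implementation would be preferable on full test sets.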

Axiom & Free-Parameter Ledger

2 free parameters · 1 axiom · 0 invented entities

The central claim rests on the domain assumption that healthy retinal anatomy forms a learnable, compact distribution whose deviations reliably mark pathology, plus standard deep-learning assumptions about optimization and representation learning.

free parameters (2)

discrete codebook size and latent dimensionality: hyperparameters of the discrete latent model chosen to fit normal OCT patterns.
triplet loss margin and layer-supervision weights: parameters controlling separation of healthy versus pathological representations.
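To show where codebook size and latent dimensionality enter, the sketch below performs a nearest-neighbor codebook lookup of the kind used in vector-quantized models. That the paper's discrete latent model quantizes this way is an assumption; the summary only says the model is discrete and lists VQVAE among the baselines. The codebook values, `K = 3` entries, and `D = 2` dimensions are arbitrary.

```python
import numpy as np

def quantize(latents, codebook):
    """Map each latent vector (N, D) to its nearest codebook entry (K, D)."""
    latents = np.asarray(latents, dtype=np.float64)
    codebook = np.asarray(codebook, dtype=np.float64)
    # Squared Euclidean distance between every latent and every code: shape (N, K).
    dists = ((latents[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    indices = dists.argmin(axis=1)
    return indices, codebook[indices]

codebook = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]])    # K = 3 codes, D = 2
latents = np.array([[0.1, -0.1], [0.9, 1.2], [3.0, 3.5]])   # N = 3 encoder outputs
indices, quantized = quantize(latents, codebook)
```

A larger codebook can represent more healthy variation but may also reconstruct abnormal structure more faithfully, which is why these hyperparameters count as free parameters in the ledger.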

axioms (1)

Domain assumption: healthy retinal B-scans form a compact distribution in the learned latent space that is separable from pathological ones via reconstruction error. Invoked to justify anomaly detection without lesion labels.



Reconstruction Metric Analysis for Anomaly Segmentation

This appendix provides a detailed analysis of the impact of different reconstruction discrepancy measures on anomaly localization performance. While the main paper reports segmentation results using our weighted reconstruction formulation, here we compare several commonly used pixel-wise error met...