Representation learning from OCT images
Pith reviewed 2026-05-08 18:27 UTC · model grok-4.3
The pith
Representation learning for retinal OCT images advances from supervised CNNs and transformers through self-supervised and generative methods to foundation models and vision-language systems.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The survey reviews representation learning for retinal OCT images by covering supervised CNN-based and transformer architectures, self-supervised and semi-supervised methods, generative approaches, 3D volumetric modeling, multimodal representation learning, and large-scale pretrained foundation models. For each paradigm it examines core methodological contributions, identifies persistent limitations, and traces connections between successive approaches. It supplies a structured list of public OCT datasets, discusses evaluation protocol issues, presents a unified mathematical formulation that places every paradigm inside the same problem setup, and outlines pressing open directions, including volumetric foundation-model pretraining, uncertainty-aware representation learning, federated and privacy-preserving training, fairness and bias mitigation, and concept-based interpretability.
What carries the argument
The principled taxonomy of learning paradigms together with the unified mathematical formulation that situates supervised, self-supervised, generative, multimodal, and foundation-model methods inside one common framework for OCT image representation.
If this is right
- The taxonomy reveals a clear progression from label-heavy supervised methods toward approaches that need far less manual annotation.
- The unified formulation lets researchers compare different paradigms directly on the same mathematical footing.
- Listing public datasets and evaluation considerations supports more standardized benchmarking across studies.
- The highlighted open directions, such as volumetric foundation-model pretraining and fairness mitigation, follow directly from the limitations identified in earlier paradigms.
Where Pith is reading between the lines
- The same taxonomy could be applied to representation learning in other high-volume medical scans such as ultrasound or MRI to spot reusable techniques.
- Vision-language models listed in the open directions may eventually supply natural-language explanations that help clinicians trust the outputs.
- Federated and privacy-preserving training highlighted in the survey could become necessary for real-world deployment where patient data cannot leave local hospitals.
- An implicit next step is to test whether models pretrained on the public OCT datasets listed in the survey generalize to new scanner vendors without additional labels.
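The cross-vendor generalization test mentioned in the last point can be sketched as a leave-one-vendor-out split. This is a minimal illustration, not a protocol taken from the survey: the `(scan_id, vendor)` record layout and the vendor names are hypothetical.

```python
from collections import defaultdict


def leave_one_vendor_out(records):
    """Yield (held_out_vendor, train_ids, test_ids) once per scanner vendor.

    `records` is a list of (scan_id, vendor) pairs. Both fields are
    hypothetical placeholders for whatever metadata a dataset provides.
    """
    by_vendor = defaultdict(list)
    for scan_id, vendor in records:
        by_vendor[vendor].append(scan_id)

    for held_out in sorted(by_vendor):
        # Every scan from the held-out vendor goes to the test set.
        test_ids = list(by_vendor[held_out])
        # All remaining vendors form the training (or pretraining) pool.
        train_ids = [
            scan_id
            for vendor in sorted(by_vendor)
            if vendor != held_out
            for scan_id in by_vendor[vendor]
        ]
        yield held_out, train_ids, test_ids


# Made-up records for illustration only.
records = [
    ("s1", "VendorA"),
    ("s2", "VendorA"),
    ("s3", "VendorB"),
    ("s4", "VendorC"),
]
splits = list(leave_one_vendor_out(records))
for held_out, train_ids, test_ids in splits:
    print(held_out, train_ids, test_ids)
```

Holding out an entire vendor, rather than a random subset of scans, keeps device-specific image statistics out of the training pool, which is the condition the generalization question is about.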
Load-bearing premise
The selected papers and the chosen taxonomy of paradigms give an accurate and reasonably complete picture of the field without major omissions or systematic bias in coverage.
What would settle it
A major OCT representation learning paper or method published before the survey cutoff that fits none of the described paradigms and is absent from the review would show the taxonomy and coverage are incomplete.
Original abstract
Optical Coherence Tomography (OCT) has become one of the most widely used imaging modalities in ophthalmology. It provides high-resolution, non-invasive visualization of retinal microarchitecture. The automated analysis of OCT images through representation learning has emerged as a central research frontier. This has mainly been driven by the clinical need to process large acquisition volumes. The objective is to reduce the reliance on expert annotation and improve diagnostic consistency across devices and populations. This survey provides a comprehensive and structured review of representation learning methods for retinal OCT image analysis. It covers the period from early deep learning approaches to the most recent developments in foundation models and vision-language systems. We organize the literature along a principled taxonomy of learning paradigms, encompassing supervised learning with CNN-based and transformer-based architectures, self-supervised and semi-supervised methods, generative approaches, as well as 3D volumetric modeling, multimodal representation learning, and large-scale pretrained foundation models. For each paradigm, we analyze the core methodological contributions, identify persistent limitations, and trace the connections between successive approaches. We further provide a structured overview of publicly available OCT datasets, discuss evaluation protocol considerations, and present a unified problem formulation that situates each learning paradigm within a common mathematical framework. Building on this analysis, we identify and discuss the most pressing open research directions emerging in the literature. This includes volumetric foundation model pretraining, uncertainty-aware representation learning, federated and privacy-preserving training, fairness and bias mitigation, concept-based interpretability,...
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript is a survey reviewing representation learning methods for retinal OCT image analysis in ophthalmology. It organizes prior work according to a taxonomy of paradigms (supervised CNN/transformer architectures, self-supervised/semi-supervised learning, generative models, 3D volumetric modeling, multimodal learning, and large-scale foundation models), provides a unified mathematical formulation, surveys public datasets and evaluation protocols, and outlines open challenges such as volumetric pretraining, uncertainty modeling, federated learning, fairness, and interpretability.
Significance. If the coverage proves complete and the taxonomy reproducible, the survey would offer a timely consolidation of a fast-moving area at the intersection of computer vision and ophthalmic imaging, particularly by tracing the shift toward foundation and vision-language models. The unified formulation and dataset overview add practical value for new researchers. However, the absence of a documented literature selection process substantially weakens its authority as a field-wide reference.
major comments (1)
- [Abstract and Introduction] The central claim that the work delivers a 'comprehensive and structured review' organized along a 'principled taxonomy' is not supported by any description of the literature search methodology (databases, keywords, date ranges, inclusion/exclusion criteria, or screening process). Without this, it is impossible to verify that the selected papers form an exhaustive or unbiased map of the field rather than reflecting author curation or recency bias, directly undermining the survey's core contribution.
minor comments (1)
- [Abstract] The abstract sentence on open directions is truncated after 'concept-based interpretability,...'; this should be completed for clarity.
Simulated Author's Rebuttal
We thank the referee for the constructive feedback on our survey of representation learning methods for retinal OCT image analysis. We address the major comment regarding the literature search methodology below.
Point-by-point responses
- Referee: [Abstract and Introduction] The central claim that the work delivers a 'comprehensive and structured review' organized along a 'principled taxonomy' is not supported by any description of the literature search methodology (databases, keywords, date ranges, inclusion/exclusion criteria, or screening process). Without this, it is impossible to verify that the selected papers form an exhaustive or unbiased map of the field rather than reflecting author curation or recency bias, directly undermining the survey's core contribution.
Authors: We agree that explicitly documenting the literature selection process would strengthen the transparency and authority of the survey. Although the taxonomy is organized by learning paradigms (supervised, self-supervised, generative, multimodal, and foundation models) rather than an exhaustive enumeration of every paper, the selection was informed by a broad review of influential works tracing the field's evolution. In the revised manuscript, we will add a dedicated subsection in the Introduction (or a new 'Survey Methodology' section) that details the process:
- databases searched: PubMed, IEEE Xplore, arXiv, Google Scholar;
- primary keywords and combinations, e.g., ('OCT' OR 'optical coherence tomography') AND ('representation learning' OR 'self-supervised' OR 'foundation model' OR 'vision-language');
- date range: 2012–2024, to cover the deep learning era onward;
- inclusion criteria: peer-reviewed works focused on representation learning for retinal OCT, with emphasis on methodological novelty;
- screening: initial search followed by title/abstract review and full-text assessment for relevance to the taxonomy categories.
This addition will clarify scope and potential biases while preserving the paper's structure and contributions.
Revision: yes
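The search-and-screen procedure described in the rebuttal can be sketched in a few lines. This is only an illustration of the logic, under stated assumptions: the record fields (`title`, `abstract`, `year`) and the example records are hypothetical, the keyword check is a crude substring match, and real databases each have their own query syntax.

```python
MODALITY_TERMS = ["OCT", "optical coherence tomography"]
METHOD_TERMS = [
    "representation learning",
    "self-supervised",
    "foundation model",
    "vision-language",
]
YEAR_RANGE = (2012, 2024)


def build_query(modality_terms, method_terms):
    """OR the terms inside each group, then AND the two groups together."""
    def group(terms):
        return "(" + " OR ".join(f"'{t}'" for t in terms) + ")"
    return group(modality_terms) + " AND " + group(method_terms)


def screen(records, year_range, modality_terms, method_terms):
    """Keep records in the date range that mention both term groups.

    Substring matching is deliberately naive (for example, 'oct' also
    matches 'october'); a manual title/abstract review would follow.
    """
    first_year, last_year = year_range
    kept = []
    for rec in records:
        text = (rec["title"] + " " + rec["abstract"]).lower()
        in_range = first_year <= rec["year"] <= last_year
        has_modality = any(t.lower() in text for t in modality_terms)
        has_method = any(t.lower() in text for t in method_terms)
        if in_range and has_modality and has_method:
            kept.append(rec)
    return kept


# Hypothetical records, invented for this sketch.
records = [
    {"title": "Self-supervised pretraining for retinal OCT", "abstract": "", "year": 2021},
    {"title": "Speckle in optical coherence tomography", "abstract": "", "year": 1999},
    {"title": "A foundation model for chest X-ray", "abstract": "", "year": 2023},
]
query = build_query(MODALITY_TERMS, METHOD_TERMS)
kept = screen(records, YEAR_RANGE, MODALITY_TERMS, METHOD_TERMS)
print(query)
print([rec["title"] for rec in kept])
```

Grouping the modality terms in their own parentheses matters: without them, many search engines would bind AND more tightly than OR and return every result that merely mentions 'OCT'.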
Circularity Check
No circularity: survey paper with no derivations or predictions
This is a literature review that organizes prior work under a taxonomy and presents a unified problem formulation as an organizational device. No original derivations, equations, fitted parameters, or predictions are claimed. The patterns for circularity (self-definitional claims, fitted inputs renamed as predictions, load-bearing self-citations, uniqueness theorems, ansatz smuggling, or renaming known results) do not apply because there is no derivation chain to inspect. Literature selection and taxonomy choices may reflect author judgment, but this is not circularity under the specified criteria.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- Cost.FunctionalEquation (J uniqueness); Foundation.LogicAsFunctionalEquation washburn_uniqueness_aczel
  Tag: unclear. No relation; the paper uses generic ML losses, not the RS J-cost.
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: min_{θ,ϕ} E_{(x,y)~D} [ L( g_ϕ(f_θ(x)), y ) ] ... ELBO ... GAN minimax ... diffusion denoising loss ‖ε − ε_θ(x_t,t)‖²
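Two of the losses quoted in the passage above can be made concrete with a small numerical sketch: the supervised objective L(g_ϕ(f_θ(x)), y) and the diffusion denoising loss ‖ε − ε_θ(x_t,t)‖². The layer shapes, the single-layer encoder, and the random data are assumptions made for illustration; they do not come from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)


def f_theta(x, theta):
    """Encoder f_theta: one linear layer with ReLU (stand-in for a CNN or transformer)."""
    return np.maximum(x @ theta, 0.0)


def g_phi(z, phi):
    """Task head g_phi: linear map from the representation to class logits."""
    return z @ phi


def cross_entropy(logits, y):
    """Batch mean of the cross-entropy loss L(logits, y)."""
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(y)), y].mean()


def denoising_loss(eps, eps_pred):
    """Batch mean of the squared error between true and predicted noise."""
    return np.mean(np.sum((eps - eps_pred) ** 2, axis=1))


# Toy batch: 8 flattened inputs of size 16, 3 classes (all values made up).
x = rng.normal(size=(8, 16))
y = rng.integers(0, 3, size=8)
theta = rng.normal(size=(16, 4)) * 0.1
phi = rng.normal(size=(4, 3)) * 0.1

supervised = cross_entropy(g_phi(f_theta(x, theta), phi), y)

eps = rng.normal(size=(8, 16))
perfect = denoising_loss(eps, eps)  # a perfect noise predictor has zero loss
print(float(supervised), float(perfect))
```

In this reading, the paradigms differ in which loss is applied and in where the target comes from (a label y, or noise ε added to the input), while the encoder f_θ plays the same role in each.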
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Reference graph
Works this paper leans on
-
[1]
Optical coherence tomography.science, 254(5035):1178–1181, 1991
David Huang, Eric A Swanson, Charles P Lin, Joel S Schuman, William G Stinson, Warren Chang, Michael R Hee, Thomas Flotte, Kenton Gregory, Carmen A Puliafito, et al. Optical coherence tomography.science, 254(5035):1178–1181, 1991
1991
-
[2]
CRC Press, 2024
Joel S Schuman, James G Fujimoto, Jay Duker, and Hiroshi Ishikawa.Optical coherence tomography of ocular diseases. CRC Press, 2024
2024
-
[3]
Optical coherence tomography of age-related macular degeneration and choroidal neovascularization.Ophthalmology, 103(8):1260–1270, 1996
Michael R Hee, Caroline R Baumal, Carmen A Puliafito, Jay S Duker, Elias Reichel, Jason R Wilkins, Jeffery G Coker, Joel S Schuman, Eric A Swanson, and James G Fujimoto. Optical coherence tomography of age-related macular degeneration and choroidal neovascularization.Ophthalmology, 103(8):1260–1270, 1996
1996
-
[4]
Comparison of the clinical diagnosis of diabetic macular edema with diagnosis by optical coherence tomography.Ophthalmology, 111(4):712–715, 2004
David J Browning, Michael D McOwen, Robert M Bowen Jr, and Tisha L O’Marah. Comparison of the clinical diagnosis of diabetic macular edema with diagnosis by optical coherence tomography.Ophthalmology, 111(4):712–715, 2004
2004
-
[5]
Optical coherence tomography to detect and manage retinal disease and glaucoma.American journal of ophthalmology, 137(1):156–169, 2004
Glenn J Jaffe and Joseph Caprioli. Optical coherence tomography to detect and manage retinal disease and glaucoma.American journal of ophthalmology, 137(1):156–169, 2004
2004
-
[6]
Optical coherence tomography findings after an intravitreal injection of bevacizumab (avastin®) for macular edema from central retinal vein occlusion, 2005
Philip J Rosenfeld, Anne E Fung, and Carmen A Puliafito. Optical coherence tomography findings after an intravitreal injection of bevacizumab (avastin®) for macular edema from central retinal vein occlusion, 2005
2005
-
[7]
Accuracy of spectral-domain oct of the macula for detection of complete posterior vitreous detachment.Ophthalmology Retina, 4(2):148–153, 2020
Eileen S Hwang, Jessica A Kraker, Kim J Griffin, J Sebag, David V Weinberg, and Judy E Kim. Accuracy of spectral-domain oct of the macula for detection of complete posterior vitreous detachment.Ophthalmology Retina, 4(2):148–153, 2020
2020
-
[8]
Anterior segment optical coherence tomography with angiography for the cornea and ocular surface.Journal of Clinical Medicine, 15(6):2402, 2026
Qiu Ying Wong, Sim Ralene, and Marcus Ang. Anterior segment optical coherence tomography with angiography for the cornea and ocular surface.Journal of Clinical Medicine, 15(6):2402, 2026
2026
-
[9]
Speckle in optical coherence tomography.Journal of biomedical optics, 4(1):95–105, 1999
Joseph M Schmitt, SH Xiang, and Kin Man Yung. Speckle in optical coherence tomography.Journal of biomedical optics, 4(1):95–105, 1999
1999
-
[10]
Efficient reduction of speckle noise in optical coherence tomography.Optics express, 20(2):1337– 1359, 2012
Maciej Szkulmowski, Iwona Gorczynska, Daniel Szlag, Marcin Sylwestrzak, Andrzej Kowalczyk, and Maciej Wojtkowski. Efficient reduction of speckle noise in optical coherence tomography.Optics express, 20(2):1337– 1359, 2012
2012
-
[11]
Agreement between spectral-domain and time-domain oct for measuring rnfl thickness.British Journal of Ophthalmology, 93(6):775–781, 2009
Gianmarco Vizzeri, Robert N Weinreb, Alberto O Gonzalez-Garcia, Christopher Bowd, Felipe A Medeiros, Pamela A Sample, and Linda M Zangwill. Agreement between spectral-domain and time-domain oct for measuring rnfl thickness.British Journal of Ophthalmology, 93(6):775–781, 2009
2009
-
[12]
Clinical factors associated with long-term oct variability in glaucoma.American journal of ophthalmology, 255:98–106, 2023
Jo-Hsuan Wu, Sasan Moghimi, Evan Walker, Takashi Nishida, Jeffrey M Liebmann, Massimo Fazio, Christo- pher A Girkin, Linda M Zangwill, and Robert N Weinreb. Clinical factors associated with long-term oct variability in glaucoma.American journal of ophthalmology, 255:98–106, 2023
2023
-
[13]
Performance evaluation of retinal oct fluid segmentation, detection, and generalization over variations of data sources.IEEE Access, 12:31719–31735, 2024
Nchongmaje Ndipenoch, Alina Miron, and Yongmin Li. Performance evaluation of retinal oct fluid segmentation, detection, and generalization over variations of data sources.IEEE Access, 12:31719–31735, 2024
2024
-
[14]
Assessment of artifacts and reproducibility across spectral-and time-domain optical coherence tomography devices.Ophthalmology, 116(10):1960–1970, 2009
Joseph Ho, Alan C Sull, Laurel N Vuong, Yueli Chen, Jonathan Liu, James G Fujimoto, Joel S Schuman, and Jay S Duker. Assessment of artifacts and reproducibility across spectral-and time-domain optical coherence tomography devices.Ophthalmology, 116(10):1960–1970, 2009
1960
-
[15]
Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 2002
Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition.Proceedings of the IEEE, 86(11):2278–2324, 2002
2002
-
[16]
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, et al. An image is worth 16x16 words: Transformers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020
work page Pith review arXiv 2010
-
[17]
Transforming auto-encoders
Geoffrey E Hinton, Alex Krizhevsky, and Sida D Wang. Transforming auto-encoders. InInternational conference on artificial neural networks, pages 44–51. Springer, 2011
2011
-
[18]
Generative adversarial networks.Communications of the ACM, 63(11):139–144, 2020
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial networks.Communications of the ACM, 63(11):139–144, 2020. 20
2020
-
[19]
Self-supervised learning: Generative or contrastive.IEEE transactions on knowledge and data engineering, 35(1):857–876, 2021
Xiao Liu, Fanjin Zhang, Zhenyu Hou, Li Mian, Zhaoyu Wang, Jing Zhang, and Jie Tang. Self-supervised learning: Generative or contrastive.IEEE transactions on knowledge and data engineering, 35(1):857–876, 2021
2021
-
[20]
Kermany, Michael Goldbaum, Wenjia Cai, Carolina C.S
Daniel S. Kermany, Michael Goldbaum, Wenjia Cai, Carolina C.S. Valentim, Huiying Liang, Sally L. Baxter, Alex McKeown, Ge Yang, Xiaokang Wu, Fangbing Yan, et al. Identifying medical diagnoses and treatable diseases by image-based deep learning.Cell, 172(5):1122–1131, 2018
2018
-
[21]
Statistical model for oct image denoising.Biomedical optics express, 8(9):3903–3917, 2017
Muxingzi Li, Ramzi Idoughi, Biswarup Choudhury, and Wolfgang Heidrich. Statistical model for oct image denoising.Biomedical optics express, 8(9):3903–3917, 2017
2017
-
[22]
A review of state-of-the-art speckle reduction techniques for optical coherence tomography fingertip scans
Luke Nicholas Darlow, Sharat Saurabh Akhoury, and James Connan. A review of state-of-the-art speckle reduction techniques for optical coherence tomography fingertip scans. InSeventh International Conference on Machine Vision (ICMV 2014), volume 9445, pages 418–426. SPIE, 2015
2014
-
[23]
Retinal imaging and image analysis.IEEE reviews in biomedical engineering, 3:169–208, 2010
Michael D Abràmoff, Mona K Garvin, and Milan Sonka. Retinal imaging and image analysis.IEEE reviews in biomedical engineering, 3:169–208, 2010
2010
-
[24]
State-of-the-art in retinal optical coherence tomography image analysis.Quantitative imaging in medicine and surgery, 5(4):603, 2015
Ahmadreza Baghaie, Zeyun Yu, and Roshan M D’Souza. State-of-the-art in retinal optical coherence tomography image analysis.Quantitative imaging in medicine and surgery, 5(4):603, 2015
2015
-
[25]
Machine learning techniques for diabetic macular edema (dme) classification on sd-oct images.Biomedical engineering online, 16(1):68, 2017
Khaled Alsaih, Guillaume Lemaitre, Mojdeh Rastgoo, Joan Massich, Désiré Sidibé, and Fabrice Meriaudeau. Machine learning techniques for diabetic macular edema (dme) classification on sd-oct images.Biomedical engineering online, 16(1):68, 2017
2017
-
[26]
Octdl: Optical coherence tomography dataset for image-based deep learning methods.Scientific data, 11(1):365, 2024
Mikhail Kulyabin, Aleksei Zhdanov, Anastasia Nikiforova, Andrey Stepichev, Anna Kuznetsova, Mikhail Ronkin, Vasilii Borisov, Alexander Bogachev, Sergey Korotkich, Paul A Constable, et al. Octdl: Optical coherence tomography dataset for image-based deep learning methods.Scientific data, 11(1):365, 2024
2024
-
[27]
Classifica- tion of retinal oct images using deep learning
Malliga Subramanian, Kogilavani Shanmugavadivel, Obuli Sai Naren, K Premkumar, and K Rankish. Classifica- tion of retinal oct images using deep learning. In2022 international conference on computer communication and informatics (ICCCI), pages 1–7. IEEE, 2022
2022
-
[28]
Chiu, Michael J
Stephanie J. Chiu, Michael J. Allingham, Priyatham S. Mettu, Scott W. Cousins, Joseph A. Izatt, and Sina Farsiu. Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema. Biomedical Optics Express, 6(4):1172–1194, 2015
2015
-
[29]
Waldstein, José Ignacio Orlando, Matteo Baroni, Ciarán N
Hrvoje Bogunovi´c, Freya Maintau-Sanchez, Sebastian M. Waldstein, José Ignacio Orlando, Matteo Baroni, Ciarán N. Bhreartaigh, et al. RETOUCH: The retinal OCT fluid detection and segmentation benchmark and challenge.IEEE Transactions on Medical Imaging, 38(8):1858–1874, 2019
2019
-
[30]
Annotated retinal optical coherence tomography images (AROI) database for joint retinal layer and fluid segmentation.Automatika, 62(3):375–385, 2021
Matej Melinšˇcak, Marin Radmilovi´c, Zoran Vatavuk, and Sven Lonˇcari´c. Annotated retinal optical coherence tomography images (AROI) database for joint retinal layer and fluid segmentation.Automatika, 62(3):375–385, 2021
2021
-
[31]
Oimhs: An optical coherence tomography image dataset based on macular hole manual segmentation.Scientific Data, 10(1):769, 2023
Xin Ye, Shucheng He, Xiaxing Zhong, Jiafeng Yu, Shangchao Yang, Yingjiao Shen, Yiqi Chen, Yaqi Wang, Xingru Huang, and Lijun Shen. Oimhs: An optical coherence tomography image dataset based on macular hole manual segmentation.Scientific Data, 10(1):769, 2023
2023
-
[32]
OCT5k: A dataset of multi-disease and multi-graded annotations for retinal layers.Scientific Data, 12(1):267, 2025
Murat Arikan, James Willoughby, Serhat Ongun, et al. OCT5k: A dataset of multi-disease and multi-graded annotations for retinal layers.Scientific Data, 12(1):267, 2025
2025
-
[33]
Gamma challenge: glaucoma grading from multi-modality images.Medical Image Analysis, 90:102938, 2023
Junde Wu, Huihui Fang, Fei Li, Huazhu Fu, Fengbin Lin, Jiongcheng Li, Yue Huang, Qinji Yu, Sifan Song, Xinxing Xu, et al. Gamma challenge: glaucoma grading from multi-modality images.Medical Image Analysis, 90:102938, 2023
2023
-
[34]
MultiEYE: Dataset and benchmark for OCT-enhanced retinal disease recognition from fundus images.IEEE Transactions on Medical Imaging, 44(4):1711–1722, 2025
Lehan Wang, Chongchong Qi, Chubin Ou, Lin An, Mei Jin, Xiangbin Kong, and Xiaomeng Li. MultiEYE: Dataset and benchmark for OCT-enhanced retinal disease recognition from fundus images.IEEE Transactions on Medical Imaging, 44(4):1711–1722, 2025
2025
-
[35]
Octa-500: a retinal dataset for optical coherence tomography angiography study
Mingchao Li, Kun Huang, Qiuzhuo Xu, Jiadong Yang, Yuhan Zhang, Zexuan Ji, Keren Xie, Songtao Yuan, Qinghuai Liu, and Qiang Chen. Octa-500: a retinal dataset for optical coherence tomography angiography study. Medical image analysis, 93:103092, 2024
2024
-
[36]
Srinivasan, Leo A
Pratul P. Srinivasan, Leo A. Kim, Priyatham S. Mettu, Scott W. Cousins, Grant M. Comer, Joseph A. Izatt, and Sina Farsiu. Fully automated detection of diabetic macular edema and dry age-related macular degeneration from optical coherence tomography images.Biomedical Optics Express, 5(10):3568–3577, 2014
2014
-
[37]
OCTID: Optical coherence tomography image database.Computers & Electrical Engineering, 81:106532, 2020
Peyman Gholami, Priyanka Roy, Mohana Kuppuswamy Parthasarathy, and Vasudevan Lakshminarayanan. OCTID: Optical coherence tomography image database.Computers & Electrical Engineering, 81:106532, 2020. 21
2020
-
[38]
Jedynak, Sharon D
Yufan He, Aaron Carass, Yihao Liu, Bruno M. Jedynak, Sharon D. Solomon, Shiv Saidha, Peter A. Calabresi, and Jerry L. Prince. Retinal layer parcellation of optical coherence tomography images: Data resource for multiple sclerosis and healthy controls.Data in Brief, 22:601–604, 2019
2019
-
[39]
Fully-automated segmentation of fluid regions in exudative age-related macular degeneration subjects: Kernel graph cut in neutrosophic domain.PloS one, 12(10):e0186949, 2017
Abdolreza Rashno, Behzad Nazari, Dara D Koozekanani, Paul M Drayna, Saeed Sadri, Hossein Rabbani, and Keshab K Parhi. Fully-automated segmentation of fluid regions in exudative age-related macular degeneration subjects: Kernel graph cut in neutrosophic domain.PloS one, 12(10):e0186949, 2017
2017
-
[40]
A composite retinal fundus and oct dataset to grade macular and glaucomatous disorders
Taimur Hassan, Hina Raja, Bilal Hassan, Muhammad Usman Akram, Jorge Dias, and Naoufel Werghi. A composite retinal fundus and oct dataset to grade macular and glaucomatous disorders. In2022 2nd International Conference on Digital Futures and Transformative Technologies (ICoDT2), pages 1–6. IEEE, 2022
2022
-
[41]
Learning two-stream cnn for multi-modal age-related macular degeneration categorization.IEEE Journal of Biomedical and Health Informatics, 26(8):4111–4122, 2022
Weisen Wang, Xirong Li, Zhiyan Xu, Weihong Yu, Jianchun Zhao, Dayong Ding, and Youxin Chen. Learning two-stream cnn for multi-modal age-related macular degeneration categorization.IEEE Journal of Biomedical and Health Informatics, 26(8):4111–4122, 2022
2022
-
[42]
A retinal oct- angiography and cardiovascular status (rasta) dataset of swept-source microvascular imaging for cardiovascular risk assessment.Data, 8(10):147, 2023
Clément Germanèse, Fabrice Meriaudeau, Pétra Eid, Ramin Tadayoni, Dominique Ginhac, Atif Anwer, Steinberg Laure-Anne, Charles Guenancia, Catherine Creuzot-Garcher, Pierre-Henry Gabrielle, et al. A retinal oct- angiography and cardiovascular status (rasta) dataset of swept-source microvascular imaging for cardiovascular risk assessment.Data, 8(10):147, 2023
2023
-
[43]
Rose: a retinal oct-angiography vessel segmentation dataset and new model.IEEE transactions on medical imaging, 40(3):928–939, 2020
Yuhui Ma, Huaying Hao, Jianyang Xie, Huazhu Fu, Jiong Zhang, Jianlong Yang, Zhen Wang, Jiang Liu, Yalin Zheng, and Yitian Zhao. Rose: a retinal oct-angiography vessel segmentation dataset and new model.IEEE transactions on medical imaging, 40(3):928–939, 2020
2020
-
[44]
Syn-oct: A synthetic dataset of ocular optical coherence tomography images from healthy and glaucoma eyes.Scientific Data, 2026
Damon Wong, Ashish Jith Sreejith Kumar, Rachel S Chong, Monisha E Nongpiur, Rahat Husain, Tina Wong, Shamira Perera, Tin Aung, Bingyao Tan, Ching-Yu Cheng, et al. Syn-oct: A synthetic dataset of ocular optical coherence tomography images from healthy and glaucoma eyes.Scientific Data, 2026
2026
-
[45]
Deep residual learning for image recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Deep residual learning for image recognition. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 770–778, 2016
2016
-
[46]
Densely connected convolutional networks
Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger. Densely connected convolutional networks. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 4700–4708, 2017
2017
-
[47]
U-net: Convolutional networks for biomedical image segmentation
Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-net: Convolutional networks for biomedical image segmentation. InInternational Conference on Medical image computing and computer-assisted intervention, pages 234–241. Springer, 2015
2015
-
[48]
A simple framework for contrastive learning of visual representations
Ting Chen, Simon Kornblith, Mohammad Norouzi, and Geoffrey Hinton. A simple framework for contrastive learning of visual representations. InInternational conference on machine learning, pages 1597–1607. PmLR, 2020
2020
-
[49]
Momentum contrast for unsupervised visual representation learning
Kaiming He, Haoqi Fan, Yuxin Wu, Saining Xie, and Ross Girshick. Momentum contrast for unsupervised visual representation learning. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 9729–9738, 2020
2020
-
[50]
Bootstrap your own latent-a new approach to self-supervised learning.Advances in neural information processing systems, 33:21271–21284, 2020
Jean-Bastien Grill, Florian Strub, Florent Altché, Corentin Tallec, Pierre Richemond, Elena Buchatskaya, Carl Doersch, Bernardo Avila Pires, Zhaohan Guo, Mohammad Gheshlaghi Azar, et al. Bootstrap your own latent-a new approach to self-supervised learning.Advances in neural information processing systems, 33:21271–21284, 2020
2020
-
[51]
Octnet: A lightweight cnn for retinal disease classification from optical coherence tomography images.Computer methods and programs in biomedicine, 200:105877, 2021
AP Sunija, Saikat Kar, S Gayathri, Varun P Gopi, and Ponnusamy Palanisamy. Octnet: A lightweight cnn for retinal disease classification from optical coherence tomography images.Computer methods and programs in biomedicine, 200:105877, 2021
2021
-
[52]
Ali Mohammad Alqudah. Aoct-net: a convolutional network automated classification of multiclass retinal diseases using spectral-domain optical coherence tomography images.Medical & biological engineering & computing, 58(1):41–53, 2020
2020
-
[53]
Classification of sd-oct images using a deep learning approach
Muhammad Awais, Henning Müller, Tong B Tang, and Fabrice Meriaudeau. Classification of sd-oct images using a deep learning approach. In2017 IEEE International Conference on Signal and Image Processing Applications (ICSIPA), pages 489–492. IEEE, 2017
2017
-
[54]
Prediction of postoperative visual acuity in rhegmatogenous retinal detachment using oct images.IEEE Access, 11:135435–135448, 2023
Sinda Hosni, Hajer Khachnaoui, Hsouna Mehdi Zgolli, Sonya Mabrouk, Désiré Sidibé, Hedi Tabia, and Nawres Khlifa. Prediction of postoperative visual acuity in rhegmatogenous retinal detachment using oct images.IEEE Access, 11:135435–135448, 2023. 22
2023
-
[55]
Surrogate-assisted retinal oct image classification based on convolutional neural networks.IEEE journal of biomedical and health informatics, 23(1):253–263, 2018
Yibiao Rong, Dehui Xiang, Weifang Zhu, Kai Yu, Fei Shi, Zhun Fan, and Xinjian Chen. Surrogate-assisted retinal oct image classification based on convolutional neural networks.IEEE journal of biomedical and health informatics, 23(1):253–263, 2018
2018
-
[56]
Deep learning-based classification of eye diseases using convolutional neural network for oct images.Frontiers in Computer Science, 5:1252295, 2024
Mohamed Elkholy and Marwa A Marzouk. Deep learning-based classification of eye diseases using convolutional neural network for oct images.Frontiers in Computer Science, 5:1252295, 2024
2024
-
[57]
Segmentation- enhanced deep learning for amd detection from oct images
Zainab Haddad, Sirine Elhoula, Désiré Sidibé, Hedi Tabia, Imen Zeghal, and Nawres Khlifa. Segmentation- enhanced deep learning for amd detection from oct images. In2025 IEEE International Conference on Advances in Data-Driven Analytics And Intelligent Systems (ADACIS), pages 1–6. IEEE, 2025
2025
-
[58]
Advanced deep learning techniques for evaluating oct image quality and detecting retinal pathologies
Arij Mlaouhi, Zainab Haddad, Hsouna Zgolli, Hedi Tabia, Désiré Sidibé, and Nawres Khlifa. Advanced deep learning techniques for evaluating oct image quality and detecting retinal pathologies. In2025 IEEE/ACS 22nd International Conference on Computer Systems and Applications (AICCSA), pages 1–2. IEEE, 2025
2025
-
[59]
Extraction of retinal layers through convolution neural network (cnn) in an oct image for glaucoma diagnosis.Journal of Digital Imaging, 33(6):1428–1442, 2020
Hina Raja, M Usman Akram, Arslan Shaukat, Shoab Ahmed Khan, Norah Alghamdi, Sajid Gul Khawaja, and Noman Nazir. Extraction of retinal layers through convolution neural network (cnn) in an oct image for glaucoma diagnosis.Journal of Digital Imaging, 33(6):1428–1442, 2020
2020
-
[60]
Ramprasaath R Selvaraju, Michael Cogswell, Abhishek Das, Ramakrishna Vedantam, Devi Parikh, and Dhruv Batra. Grad-cam: Visual explanations from deep networks via gradient-based localization. In Proceedings of the IEEE international conference on computer vision, pages 618–626, 2017
-
[61]
Christian Etmann, Sebastian Lunz, Peter Maass, and Carola-Bibiane Schönlieb. On the connection between adversarial robustness and saliency map interpretability. arXiv preprint arXiv:1905.04172, 2019
-
[62]
Zainab Haddad, Hsouna Zgolli, Désiré Sidibé, Hedi Tabia, and Nawres Khlifa. Explainable ai for retinal pathology detection in oct images. In 2024 10th International Conference on Control, Decision and Information Technologies (CoDIT), pages 2401–2406. IEEE, 2024
-
[63]
Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017
-
[64]
Bo Zhao, Boya Wu, Muyang He, and Tiejun Huang. Svit: Scaling up visual instruction tuning. arXiv preprint arXiv:2307.04087, 2023
-
[65]
GR Hemalakshmi, M Murugappan, Mohamed Yacin Sikkandar, S Sabarunisha Begum, and NB Prakash. Automated retinal disease classification using hybrid transformer model (svit) using optical coherence tomography images. Neural Computing and Applications, 36(16):9171–9188, 2024
-
[66]
Haoran Wang, Xinyu Guo, Kaiwen Song, Mingyang Sun, Yanbin Shao, Songfeng Xue, Hongwei Zhang, and Tianyu Zhang. Octformer: an efficient hierarchical transformer network specialized for retinal optical coherence tomography image recognition. IEEE Transactions on Instrumentation and Measurement, 72:1–17, 2023
-
[67]
Mingming Yang, Junhui Du, and Ruichan Lv. Crat: advanced transformer-based deep learning algorithms in oct image classification. Biomedical Signal Processing and Control, 104:107544, 2025
-
[68]
Mohamed Elsharkawy, Ibrahim Abdelhalim, Mohammed Ghazal, Ali Mahmoud, Harpal S Sandhu, Aristomenis Thanos, and Ayman El-Baz. Oct-trans: A novel transformer backbone with multimodal feature extraction in oct-based retinal disease classification. In 2025 IEEE 22nd International Symposium on Biomedical Imaging (ISBI), pages 1–4. IEEE, 2025
-
[69]
Liwei Cai, Chi Wen, Jingwen Jiang, Congbi Liang, Hongmei Zheng, Yu Su, and Changzheng Chen. Classification of diabetic maculopathy based on optical coherence tomography images using a vision transformer model. BMJ Open Ophthalmology, 8(1), 2023
-
[70]
Yuka Kihara, Mengxi Shen, Yingying Shi, Xiaoshuang Jiang, Liang Wang, Rita Laiginhas, Cancan Lyu, Jin Yang, Jeremy Liu, Rosalyn Morin, et al. Detection of nonexudative macular neovascularization on structural oct images using vision transformers. Ophthalmology Science, 2(4):100197, 2022
-
[71]
Daniel Philippi, Kai Rothaus, and Mauro Castelli. A vision transformer architecture for the automated segmentation of retinal lesions in spectral domain optical coherence tomography images. Scientific Reports, 13(1):517, 2023
-
[72]
Jingzhen He, Junxia Wang, Zeyu Han, Jun Ma, Chongjing Wang, and Meng Qi. An interpretable transformer network for the retinal disease classification using optical coherence tomography. Scientific Reports, 13(1):3637, 2023
-
[73]
Badr Ait Hammou, Fares Antaki, Marie-Carole Boucher, and Renaud Duval. Mbt: Model-based transformer for retinal optical coherence tomography image and video multi-classification. International journal of medical informatics, 178:105178, 2023
-
[74]
Yiheng Zhang, Zhongliang Li, Nan Nan, and Xiangzhao Wang. Transegnet: hybrid cnn-vision transformers encoder for retina segmentation of optical coherence tomography. Life, 13(4):976, 2023
-
[75]
Qingxin Jiang, Ying Fan, Menghan Li, Sheng Fang, Weifang Zhu, Dehui Xiang, Tao Peng, Xinjian Chen, Xun Xu, and Fei Shi. Hyformer: a hybrid transformer-cnn architecture for retinal oct image segmentation. Biomedical Optics Express, 15(11):6156–6170, 2024
-
[76]
Zongqing Ma, Qiaoxue Xie, Pinxue Xie, Fan Fan, Xinxiao Gao, and Jiang Zhu. Hctnet: a hybrid convnet-transformer network for retinal optical coherence tomography image classification. Biosensors, 12(7):542, 2022
-
[77]
Hai Yang, Li Chen, Junyang Cao, and Juan Wang. Hrs-net: A hybrid multi-scale network model based on convolution and transformers for multi-class retinal disease classification. IEEE Access, 12:144219–144229, 2024
-
[78]
Ayoub Laouarem, Chafia Kara-Mohamed, El-Bay Bourennane, and Aboubekeur Hamdi-Cherif. Htc-retina: a hybrid retinal diseases classification model using transformer-convolutional neural network from optical coherence tomography images. Computers in biology and medicine, 178:108726, 2024
-
[79]
DV Ashoka et al. Effivit: Hybrid cnn-transformer for retinal imaging. Computers in Biology and Medicine, 191:110164, 2025
-
[80]
Prannay Khosla, Piotr Teterwak, Chen Wang, Aaron Sarna, Yonglong Tian, Phillip Isola, Aaron Maschinot, Ce Liu, and Dilip Krishnan. Supervised contrastive learning. Advances in neural information processing systems, 33:18661–18673, 2020