arxiv: 2604.20918 · v1 · submitted 2026-04-22 · 📡 eess.IV

Recognition: unknown

EDU-Net: Retinal Pathological Fluid Segmentation in OCT Images with Multiscale Feature Fusion and Boundary Optimization

Zijun Lei , Zikang Xu , Liang Zhang , Ge Song , Hanyu Guo , Dan Cao , Yujia Zhou , Qianjin Feng

Authors on Pith no claims yet

Pith reviewed 2026-05-09 23:25 UTC · model grok-4.3

classification 📡 eess.IV

keywords retinal fluid segmentationOCT imagingdiabetic macular edemadeep learning segmentationboundary optimizationmultiscale featuresIRF segmentationEDU-Net

0 comments

The pith

EDU-Net segments retinal fluid in OCT images more accurately by fusing local EfficientNet features with global context and edge-guided boundary optimization.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

This paper presents EDU-Net as a new network for automatic segmentation of intraretinal and subretinal fluid in OCT images used to manage diabetic macular edema. It employs a dual-branch structure where one branch uses EfficientNet to capture fine local details of small lesions and the other uses large-kernel convolutions for global scene understanding, then applies edge attention to sharpen boundaries affected by noise. The design targets the challenges of variable fluid shapes and blurred edges that reduce accuracy in existing automated tools. Readers might care because improved segmentation accuracy could enable better quantification of fluid volumes, supporting more informed clinical decisions to prevent vision loss in diabetic patients.

Core claim

The paper claims that EDU-Net, through its integration of local feature extraction via EfficientNet, global feature enhancement with LKEC modules, and multi-category edge-guided attention for boundary fusion, delivers state-of-the-art Dice similarity coefficient performance in segmenting retinal pathological fluids, with particular strength in IRF lesions across in-house and public datasets.

What carries the argument

The EDU-Net architecture consisting of a local EfficientNet-based branch for high-resolution tiny lesion capture, a global branch with large-kernel efficient convolution for long-range dependencies, and a multi-category edge-guided attention module to incorporate boundary details into multi-resolution features.

If this is right

EDU-Net achieves state-of-the-art DSC performance on in-house and public OCT datasets for fluid segmentation.
It shows particular robustness and efficiency in segmenting IRF lesions.
The multiscale fusion and boundary optimization handle variable morphology and noise interference effectively.
Local-global integration leads to improved accuracy in automatic retinal fluid quantification.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

If the results hold, this could reduce reliance on manual segmentation in clinical workflows for DME monitoring.
The boundary optimization technique may extend to other OCT-based tasks or noisy medical imaging modalities.
Further tests on varied scanner types would help determine if retraining is often needed for new settings.

Load-bearing premise

The gains in segmentation performance will generalize to new patient groups, scanner models, and acquisition settings without needing retraining or adjustments.

What would settle it

A test showing EDU-Net underperforming compared to prior methods on OCT data from a new scanner or unseen patient demographics would indicate the robustness claims do not hold broadly.

read the original abstract

Objective: Diabetic macular edema (DME) is the leading cause of severe visual impairment in patients with diabetes. Quantification of retinal fluid, particularly intraretinal fluid (IRF) and subretinal fluid (SRF), plays a critical role in the management of DME. Although optical coherence tomography (OCT) can be used for detection, the variable morphology of fluid accumulation and the blurred boundaries caused by noise interference still limit the accuracy of OCT's automatic segmentation. Methods: Retrospective model development and validation study. This study proposes a novel edge-guided dual-branch encoder-decoder network (EDU-Net) to achieve accurate and efficient automatic segmentation of OCT liquid lesions. The local feature extraction branch is based on the EfficientNet model, which precisely captures tiny lesions by leveraging its lightweight separable convolution and high-resolution feature preservation strategy. The global feature extraction branch is based on the large-kernel efficient convolution (LKEC) module and the downsampling layer design to enhance long-range dependencies and global semantics. EDU-Net applies a multi-category edge-guided attention module to fuse high-frequency boundary detail information to each resolution feature to optimize the boundary segmentation performance. Results: Extensive results on the in-house and public datasets demonstrate that EDU-Net achieves state-of-the-art DSC segmentation performance in terms of efficiency and robustness, especially in the segmentation of IRF lesions. Conclusions: EDU-Net integrates local details with global context and optimizes boundaries, achieving an improvement in the accuracy of automatic segmentation of retinal fluid.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

EDU-Net is a straightforward dual-branch tweak that pairs EfficientNet local features with large-kernel global context plus edge attention for OCT fluid segmentation, and the experiments on in-house plus public data support the performance claims without obvious contradictions.

read the letter

EDU-Net splits the encoder into a local EfficientNet branch that keeps high-resolution details for tiny intraretinal fluid pockets and a global branch built on large-kernel efficient convolutions to pull in longer-range context. A multi-category edge-guided attention module then fuses boundary information across scales before the decoder. This targets the exact problems the abstract flags: small lesions and noise-blurred edges in DME OCT scans. The design is internally consistent and the paper walks through why each piece is there without overclaiming theoretical novelty. They evaluate on both their own retrospective dataset and public OCT collections, reporting better DSC especially on IRF, which matches the practical goal of quantifying fluid for treatment decisions. The architecture description and loss choices line up with the stated aims, and no data-leakage red flags appear in the setup. The main soft spot is that the size of the gains and the ablation controls are not visible in the summary, so it is hard to judge how much the new modules actually move the needle versus careful tuning. Generalization to unseen scanners or protocols is also untested here, which is typical but still limits how far the SOTA label travels. This paper is aimed at medical imaging groups working on retinal OCT analysis or DME quantification tools. A reader already building or comparing segmentation networks for ophthalmology would find the module choices and dataset split useful to examine. It deserves a serious referee because the approach is grounded in the clinical problem and the evaluation plan is concrete enough to review. I would send it to peer review so the numbers, ablations, and any implementation details can be checked directly.

Referee Report

1 major / 2 minor

Summary. The paper proposes EDU-Net, an edge-guided dual-branch encoder-decoder network for automatic segmentation of intraretinal fluid (IRF) and subretinal fluid (SRF) in OCT images for diabetic macular edema. The local branch uses EfficientNet for high-resolution detail capture of small lesions, the global branch applies large-kernel efficient convolution (LKEC) and downsampling for long-range context, and a multi-category edge-guided attention module fuses boundary details across resolutions. The central claim is that extensive experiments demonstrate state-of-the-art DSC performance on in-house and public datasets, with particular gains in IRF segmentation efficiency and robustness.

Significance. If the reported DSC improvements are reproducible and supported by proper controls, the architecture offers a coherent way to balance local detail preservation with global semantics and boundary refinement, which could advance clinical tools for quantifying retinal fluid in DME and similar OCT segmentation tasks.

major comments (1)

[Results] Results section: the SOTA DSC claim is load-bearing for the paper's contribution, yet the abstract provides no numerical values, baseline comparisons, dataset sizes, or statistical tests; without these (and ablations isolating the LKEC and edge-guided attention contributions) in the full results, the robustness assertion cannot be verified.

minor comments (2)

[Abstract] Abstract: adding the specific DSC scores achieved and the sizes of the in-house and public test sets would make the performance claim immediately verifiable.
[Methods] Methods: the exact implementation of the multi-category edge-guided attention fusion (e.g., how high-frequency boundary information is injected at each resolution) would benefit from an equation or pseudocode for reproducibility.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for the constructive feedback on the presentation of our results. We agree that strengthening the abstract and adding explicit ablations will improve verifiability of the SOTA claims and will revise accordingly.

read point-by-point responses

Referee: [Results] Results section: the SOTA DSC claim is load-bearing for the paper's contribution, yet the abstract provides no numerical values, baseline comparisons, dataset sizes, or statistical tests; without these (and ablations isolating the LKEC and edge-guided attention contributions) in the full results, the robustness assertion cannot be verified.

Authors: We will revise the abstract to include key DSC values (e.g., overall and per-class), dataset sizes, and direct numerical comparisons to the strongest baselines. In the results section we already report baseline comparisons across in-house and public datasets; we will add a dedicated ablation subsection that isolates the LKEC module and the multi-category edge-guided attention module, reporting their incremental DSC gains and statistical significance (paired t-tests or Wilcoxon tests with p-values). These changes will make the robustness claims directly verifiable without altering the core architecture or experimental protocol. revision: yes

Circularity Check

0 steps flagged

No significant circularity

full rationale

The paper presents an empirical neural network architecture (EDU-Net) for OCT fluid segmentation, with no mathematical derivations, equations, or first-principles claims. The central assertions rest on training the described dual-branch encoder-decoder with edge-guided attention on in-house and public datasets, followed by reporting DSC metrics. No steps reduce a prediction to a fitted input by construction, invoke self-citations as load-bearing uniqueness theorems, or rename known results as novel derivations. The architecture description is self-contained and the performance claims are externally falsifiable via the reported test sets.

Axiom & Free-Parameter Ledger

2 free parameters · 1 axioms · 0 invented entities

The central claim depends on standard supervised deep-learning training assumptions and dataset-specific fitting; no new physical axioms or invented entities are introduced.

free parameters (2)

Network weights and biases
All model parameters are fitted to the training portions of the in-house and public OCT datasets.
Hyperparameters for training and architecture
Choices such as learning rate, batch size, and module dimensions are selected to optimize performance on the validation data.

axioms (1)

domain assumption The training and test distributions are sufficiently similar for generalization
Implicit in any supervised segmentation claim on finite datasets.

pith-pipeline@v0.9.0 · 5587 in / 1232 out tokens · 49916 ms · 2026-05-09T23:25:27.360735+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

35 extracted references · 2 canonical work pages · 2 internal anchors

[1]

Diabetic retinopathy The Lancet

Cheung Ning, Mitchell Paul, Wong Tien Yin. Diabetic retinopathy The Lancet. 2010;376:124-136

2010
[2]

Diabetic retinopathy: pathophysiology and treatments International journal of molecular sciences

Wang Wei, Lo Amy CY . Diabetic retinopathy: pathophysiology and treatments International journal of molecular sciences. 2018;19:1816

2018
[3]

Optical coherence tomography in the management of diabetic macular oedema Progress in retinal and eye research

Szeto Simon KH, Lai Timothy YY , Vujosevic Stela, et al. Optical coherence tomography in the management of diabetic macular oedema Progress in retinal and eye research. 2024;98:101220

2024
[4]

Optical coherence tomography science

Huang David, Swanson Eric A, Lin Charles P, et al. Optical coherence tomography science. 1991;254:1178 – 1181

1991
[5]

Higher -order assessment of OCT in diabetic macular edema from the VISTA study: ellipsoid zone dynamics and the retinal fluid index Ophthalmology Retina

Ehlers Justis P, Uchida Atsuro, Hu Ming, et al. Higher -order assessment of OCT in diabetic macular edema from the VISTA study: ellipsoid zone dynamics and the retinal fluid index Ophthalmology Retina. 2019;3:1056–1066

2019
[6]

Roberts Philipp K, V ogl Wolf-Dieter, Gerendas Bianca S, et al. Quantification of fluid resolution and visual acuity gain in patients with diabetic macular edema using deep learning: a post hoc analysis of a randomized clinical trial JAMA ophthalmology. 2020;138:945–953

2020
[7]

ImageNet classification with deep convolutional neural networks Communications of the ACM

Krizhevsky Alex, Sutskever Ilya, Hinton Geoffrey E. ImageNet classification with deep convolutional neural networks Communications of the ACM. 2017;60:84–90

2017
[8]

Multiple-object geometric deformable model for segmentation of macular OCT Biomedical optics express

Carass Aaron, Lang Andrew, Hauser Matthew, Calabresi Peter A, Ying Howard S, Prince Jerry L. Multiple-object geometric deformable model for segmentation of macular OCT Biomedical optics express. 2014;5:1062–1074

2014
[9]

Fractal -based analysis of optical coherence tomography data to quantify retinal tissue damage BMC bioinformatics

Somfai Gábor Márk, Tátrai Erika, Laurik Lenke, et al. Fractal -based analysis of optical coherence tomography data to quantify retinal tissue damage BMC bioinformatics. 2014;15:1–10

2014
[10]

Classification of SD -OCT V olumes Using Local Binary Patterns: Experimental Validation for DME Detection Journal of ophthalmology

Lemaître Guillaume, Rastgoo Mojdeh, Massich Joan, et al. Classification of SD -OCT V olumes Using Local Binary Patterns: Experimental Validation for DME Detection Journal of ophthalmology. 2016;2016:3298606

2016
[11]

Fully automated detection of diabetic macular edema and dry age -related macular degeneration from optical coherence tomography images Biomedical optics express

Srinivasan Pratul P, Kim Leo A, Mettu Priyatham S, et al. Fully automated detection of diabetic macular edema and dry age -related macular degeneration from optical coherence tomography images Biomedical optics express. 2014;5:3568–3577

2014
[12]

Albarrak Abdulrahman, Coenen Frans, Zheng Yalin, others . Age-related macular degeneration identification in volumetric optical coherence tomography using decomposition and local feature extraction in Proceedings of 2013 international conference on medical image, understanding and analysis:59–64 2013

2013
[13]

Automatic segmentation of microcystic macular edema in OCT Biomedical optics express

Lang Andrew, Carass Aaron, Swingle Emily K, et al. Automatic segmentation of microcystic macular edema in OCT Biomedical optics express. 2014;6:155–169

2014
[14]

Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema Biomedical optics express

Chiu Stephanie J, Allingham Michael J, Mettu Priyatham S, Cousins Scott W, Izatt Joseph A, Farsiu Sina. Kernel regression based segmentation of optical coherence tomography images with diabetic macular edema Biomedical optics express. 2015;6:1172–1194

2015
[15]

Automated segmentation of pathological cavities in optical coherence tomography scans Investigative ophthalmology & visual science

Pilch Matthäus, Stieger Knut, Wenner Yaroslava, et al. Automated segmentation of pathological cavities in optical coherence tomography scans Investigative ophthalmology & visual science. 2013;54:4385–4393

2013
[16]

ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks Biomedical optics express

Roy Abhijit Guha, Conjeti Sailesh, Karri Sri Phani Krishna, et al. ReLayNet: retinal layer and fluid segmentation of macular optical coherence tomography using fully convolutional networks Biomedical optics express. 2017;8:3627–3642

2017
[17]

Fully convolutional networks for semantic segmentation in Proceedings of the IEEE conference on computer vision and pattern recognition:3431–3440 2015

Long Jonathan, Shelhamer Evan, Darrell Trevor. Fully convolutional networks for semantic segmentation in Proceedings of the IEEE conference on computer vision and pattern recognition:3431–3440 2015

2015
[18]

Ronneberger Olaf, Fischer Philipp, Brox Thomas. U -net: Convolutional networks for biomedical image 20 segmentation in Medical image computing and computer -assisted intervention –MICCAI 2015: 18th international conference, Munich, Germany, October 5 -9, 2015, proceedings, part III 18:234 –241Springer 2015

2015
[19]

Segnet: A deep convolutional encoder -decoder architecture for image segmentation IEEE transactions on pattern analysis and machine intelligence

Badrinarayanan Vijay, Kendall Alex, Cipolla Roberto. Segnet: A deep convolutional encoder -decoder architecture for image segmentation IEEE transactions on pattern analysis and machine intelligence. 2017;39:2481–2495

2017
[20]

Encoder -decoder with atrous separable convolution for semantic image segmentation in Proceedings of the European conference on computer vision (ECCV):801–818 2018

Chen Liang-Chieh, Zhu Yukun, Papandreou George, Schroff Florian, Adam Hartwig. Encoder -decoder with atrous separable convolution for semantic image segmentation in Proceedings of the European conference on computer vision (ECCV):801–818 2018

2018
[21]

Unet++: Redesigning skip connections to exploit multiscale features in image segmentation IEEE transactions on medical imaging

Zhou Zongwei, Siddiquee Md Mahfuzur Rahman, Tajbakhsh Nima, Liang Jianming. Unet++: Redesigning skip connections to exploit multiscale features in image segmentation IEEE transactions on medical imaging. 2019;39:1856–1867

2019
[22]

MDAN -UNet: multi -scale and dual attention enhanced nested U -Net architecture for segmentation of optical coherence tomography images Algorithms

Liu Wen, Sun Yankui, Ji Qingge. MDAN -UNet: multi -scale and dual attention enhanced nested U -Net architecture for segmentation of optical coherence tomography images Algorithms. 2020;13:60

2020
[23]

CPFNet: Context pyramid fusion network for medical image segmentation IEEE transactions on medical imaging

Feng Shuanglang, Zhao Heming, Shi Fei, et al. CPFNet: Context pyramid fusion network for medical image segmentation IEEE transactions on medical imaging. 2020;39:3008–3018

2020
[24]

MsTGANet: Automatic drusen segmentation from retinal OCT images IEEE Transactions on Medical Imaging

Wang Meng, Zhu Weifang, Shi Fei, et al. MsTGANet: Automatic drusen segmentation from retinal OCT images IEEE Transactions on Medical Imaging. 2021;41:394–406

2021
[25]

RetiFluidNet: a self -adaptive and multi -attention deep convolutional network for retinal OCT fluid segmentation IEEE Transactions on Medical Imaging

Rasti Reza, Biglari Armin, Rezapourian Mohammad, Yang Ziyun, Farsiu Sina. RetiFluidNet: a self -adaptive and multi -attention deep convolutional network for retinal OCT fluid segmentation IEEE Transactions on Medical Imaging. 2022;42:1413–1423

2022
[26]

EANet: Multiscale autoencoder based edge attention network for fluid segmentation from SD-OCT images International Journal of Imaging Systems and Technology

Pappu Geetha Pavani, Tankala Sreekar, Talabhaktula Krishna, Biswal Birendra. EANet: Multiscale autoencoder based edge attention network for fluid segmentation from SD-OCT images International Journal of Imaging Systems and Technology. 2023;33:909–927

2023
[27]

Loss -balanced parallel decoding network for retinal fluid segmentation in OCT Computers in Biology and Medicine

Yu Xiaojun, Li Mingshuai, Ge Chenkun, et al. Loss -balanced parallel decoding network for retinal fluid segmentation in OCT Computers in Biology and Medicine. 2023;165:107319

2023
[28]

Efficientnet: Rethinking model scaling for convolutional neural networks in International conference on machine learning:6105–6114PMLR 2019

Tan Mingxing, Le Quoc. Efficientnet: Rethinking model scaling for convolutional neural networks in International conference on machine learning:6105–6114PMLR 2019

2019
[29]

Large dataset of labeled optical coherence tomography (oct) and chest x -ray images (No Title)

Kermany Daniel. Large dataset of labeled optical coherence tomography (oct) and chest x -ray images (No Title). 2018

2018
[30]

RETOUCH: The retinal OCT fluid detection and segmentation benchmark and challenge IEEE transactions on medical imaging

Bogunović Hrvoje, Venhuizen Freerk, Klimscha Sophie, et al. RETOUCH: The retinal OCT fluid detection and segmentation benchmark and challenge IEEE transactions on medical imaging. 2019;38:1858–1874

2019
[31]

Ce -net: Context encoder network for 2d medical image segmentation IEEE transactions on medical imaging

Gu Zaiwang, Cheng Jun, Fu Huazhu, et al. Ce -net: Context encoder network for 2d medical image segmentation IEEE transactions on medical imaging. 2019;38:2281–2292

2019
[32]

Farshad Azade, Yeganeh Yousef, Gehlbach Peter, Navab Nassir. Y -net: A spatiospectral dual -encoder network for medical image segmentation in International conference on medical image computing and computer-assisted intervention:582–592Springer 2022

2022
[33]

Grad-cam: Visual explanations from deep networks via gradient -based localization in Proceedings of the IEEE international conference on computer vision:618–626 2017

Selvaraju Ramprasaath R, Cogswell Michael, Das Abhishek, Vedantam Ramakrishna, Parikh Devi, Batra Dhruv. Grad-cam: Visual explanations from deep networks via gradient -based localization in Proceedings of the IEEE international conference on computer vision:618–626 2017

2017
[34]

Attention U-Net: Learning Where to Look for the Pancreas

Oktay Ozan, Schlemper Jo, Folgoc Loic Le, et al. Attention u -net: Learning where to look for the pancreas arXiv preprint arXiv:1804.03999.2018

work page internal anchor Pith review arXiv 2018
[35]

TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation

Chen Jieneng, Lu Yongyi, Yu Qihang, et al. Transunet : Transformers make strong encoders for medical image segmentation arXiv preprint arXiv:2102.04306. 2021. 21 Figures and Tables Figure 1. Characteristics of retinal fluid in DME. (a) Left: Posterior pole fundus photograph of a patient with DME; Right: OCT 3D imaging of the corresponding macular region. ...

work page internal anchor Pith review arXiv 2021