OPTED: Open Preprocessed Trachoma Eye Dataset Using Zero-Shot SAM 3 Segmentation

Bruk Gebregziabher; Hadush Hailu; Kibrom Gebremedhin

arxiv: 2603.06885 · v2 · submitted 2026-03-06 · 💻 cs.CV

Pith reviewed 2026-05-15 14:29 UTC · model grok-4.3

classification 💻 cs.CV

keywords: trachoma · eye dataset · SAM 3 segmentation · zero-shot segmentation · preprocessed images · tarsal conjunctiva · machine learning · public dataset

The pith

Zero-shot SAM 3 segmentation with a chosen text prompt extracts the tarsal conjunctiva from 2,832 raw trachoma photos to create a clean open dataset.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper shows that a four-step pipeline can turn noisy clinical eyelid photographs into standardized images suitable for machine learning. The steps are text-prompt zero-shot segmentation with SAM 3, background removal and bounding-box cropping with alignment, confidence-based quality filtering, and Lanczos resizing to 224 by 224 pixels. The authors identify the prompt 'inner surface of eyelid with red tissue' as optimal because it yields a mean confidence of 0.872 and detects the region in 99.5 percent of the 2,832 images. Releasing the resulting OPTED dataset and code lets other teams pursue reproducible trachoma classification research without repeating the preprocessing from scratch.
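The prompt comparison can be pictured with a small sketch. This is an illustration under stated assumptions, not the authors' code: the function names, the convention that `None` marks an image with no detection, the ranking rule (detection rate first, then mean confidence), and the toy scores are all hypothetical.

```python
from statistics import mean, pstdev

def summarize_prompt(scores):
    """Summarize one prompt's per-image results.

    `scores` has one entry per image: a confidence in [0, 1],
    or None when no mask was returned (assumed convention).
    """
    detected = [s for s in scores if s is not None]
    return {
        "detection_rate": len(detected) / len(scores),
        "mean_conf": mean(detected) if detected else 0.0,
        "std_conf": pstdev(detected) if detected else 0.0,
    }

def pick_prompt(results):
    """Return the prompt with the best (detection_rate, mean_conf) pair.

    The ranking rule is an assumption made for this sketch.
    """
    summaries = {p: summarize_prompt(s) for p, s in results.items()}
    best = max(
        summaries,
        key=lambda p: (summaries[p]["detection_rate"], summaries[p]["mean_conf"]),
    )
    return best, summaries

# Toy scores, invented for illustration only.
toy = {
    "inner surface of eyelid with red tissue": [0.91, 0.85, 0.88, 0.80],
    "tarsal conjunctiva": [0.70, None, 0.75, 0.72],
}
best_prompt, stats = pick_prompt(toy)
```

As a consistency check on the reported figures, the abstract states that 13 of the 2,832 images were not detected by the chosen prompt, and 2,819 / 2,832 ≈ 0.995 matches the 99.5 percent detection rate.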

Core claim

The central claim is that text-prompt-based zero-shot segmentation with SAM 3 reliably isolates the tarsal conjunctiva from diverse clinical photographs, and that the resulting preprocessed dataset in both cropped and standardized formats removes background noise while preserving diagnostic information for downstream automated classification.

What carries the argument

Text-prompt zero-shot segmentation with SAM 3 guided by the prompt 'inner surface of eyelid with red tissue', which isolates the region of interest before cropping and resizing.
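Once a binary mask exists, isolating the region of interest amounts to blanking pixels outside the mask and cropping to the mask's bounding box. The sketch below shows that step on plain nested lists. It assumes a grayscale image and a same-shaped boolean mask, uses hypothetical function names, and leaves out the alignment step, which is not specified in enough detail here to reproduce.

```python
def mask_bbox(mask):
    """Return (top, left, bottom, right) of the True pixels, or None if empty.

    bottom and right are exclusive, as in Python slicing.
    """
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    if not rows or not cols:
        return None
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1

def remove_background_and_crop(image, mask, fill=0):
    """Set pixels outside the mask to `fill`, then crop to the bounding box."""
    box = mask_bbox(mask)
    if box is None:
        return None  # nothing detected, so nothing to crop
    top, left, bottom, right = box
    return [
        [image[r][c] if mask[r][c] else fill for c in range(left, right)]
        for r in range(top, bottom)
    ]

# Tiny 3x4 example with invented pixel values.
image = [
    [10, 11, 12, 13],
    [20, 21, 22, 23],
    [30, 31, 32, 33],
]
mask = [
    [False, False, False, False],
    [False, True, True, False],
    [False, False, True, False],
]
cropped = remove_background_and_crop(image, mask)  # [[21, 22], [0, 32]]
```

The pixel at the lower left of the crop lies inside the bounding box but outside the mask, so it is replaced by the fill value.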

If this is right

The released 224x224 images can be fed directly into standard pre-trained image classifiers without further resizing or cropping steps.
The open code allows exact reproduction of the preprocessing on new sets of raw trachoma photographs from the same or similar sources.
Researchers gain access to a dataset originating from a high-burden region that was previously unavailable in preprocessed form.
The two output formats support both aspect-ratio-preserving analysis and direct use with architectures expecting fixed square inputs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The same prompt-driven segmentation approach could be tested on images of other eyelid or conjunctival conditions to reduce manual annotation effort in medical imaging datasets.
Mobile apps for trachoma screening could incorporate this preprocessing step to improve input quality before running on-device classifiers.
If the method generalizes, it lowers the barrier for creating training data in low-resource settings where expert labeling time is scarce.

Load-bearing premise

The chosen text prompt will guide SAM 3 to extract only the tarsal conjunctiva without systematic inclusion of irrelevant tissue or exclusion of relevant areas across all variations in the photographs.

What would settle it

A manual audit finding that more than a few percent of the 2,832 outputs either lose conjunctival tissue to cropping or retain substantial non-conjunctival background would falsify the claim that the pipeline is reliable.
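One way to make "more than a few percent" testable is to audit a random sample and put a confidence interval on the failure rate. The sketch below uses the Wilson score interval. The sample size of 300, the count of 6 failures, and the seed are hypothetical values chosen for illustration, not results from the paper.

```python
import math
import random

def wilson_interval(failures, n, z=1.96):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = failures / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, centre - half), min(1.0, centre + half)

# Draw which of the 2,832 outputs a reviewer would inspect by hand.
rng = random.Random(0)                      # fixed seed so the audit list is reproducible
audit_ids = rng.sample(range(2832), 300)    # hypothetical sample size

# Hypothetical audit outcome: 6 of the 300 inspected outputs judged faulty.
low, high = wilson_interval(failures=6, n=300)
```

If the upper bound of the interval stays below whatever tolerance the reader sets in advance, the audit supports the reliability claim; if the lower bound exceeds it, the claim fails.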

Figures

Figures reproduced from arXiv: 2603.06885 by Bruk Gebregziabher, Hadush Hailu, Kibrom Gebremedhin.

**Figure 1.** The OPTED preprocessing pipeline. Raw eyelid photographs are processed through four stages: (1) SAM 3 text-prompt segmentation, (2) bounding-box cropping, (3) alignment, and (4) Lanczos resizing to 224 × 224 px. The pipeline converts 2,832 source images into classification-ready samples across three WHO grades (Normal, TF, TI).

**Figure 2.** Overview of the OPTED preprocessing pipeline. Raw eyelid photographs are processed through SAM 3 text-prompt segmentation, background …

**Figure 3.** Representative samples from the final OPTED dataset (…

**Figure 4.** Step-by-step visualization of the OPTED pipeline on three sample images (two Normal, one Trachoma). From left to right: raw photograph, SAM 3 …

**Figure 5.** Radar chart comparing the five candidate prompts across five …

**Figure 6.** Visual comparison of SAM 3 masks from the five candidate prompts on three sample images (two Normal, one Trachoma). Blue overlay indicates the …

**Figure 7.** PSNR and SSIM comparison of four interpolation methods for …

Original abstract

Trachoma remains the leading infectious cause of blindness worldwide, with Sub-Saharan Africa bearing over 85% of the global burden and Ethiopia alone accounting for more than half of all cases. Yet publicly available preprocessed datasets for automated trachoma classification are scarce, and none originate from the most affected region. Raw clinical photographs of eyelids contain significant background noise that hinders direct use in machine learning pipelines. We present OPTED, an open-source preprocessed trachoma eye dataset constructed using the Segment Anything Model 3 (SAM 3) for automated region-of-interest extraction. We describe a reproducible four-step pipeline: (1) text-prompt-based zero-shot segmentation of the tarsal conjunctiva using SAM 3, (2) background removal and bounding-box cropping with alignment, (3) quality filtering based on confidence scores, and (4) Lanczos resizing to 224x224 pixels. A separate prompt-selection stage identifies the optimal text prompt, and manual quality assurance verifies outputs. Through comparison of five candidate prompts on all 2,832 known-label images, we identify "inner surface of eyelid with red tissue" as optimal, achieving a mean confidence of 0.872 (std 0.070) and 99.5% detection rate (the remaining 13 images are recovered via fallback prompts). The pipeline produces outputs in two formats: cropped and aligned images preserving the original aspect ratio, and standardized 224x224 images ready for pre-trained architectures. The OPTED dataset, preprocessing code, and all experimental artifacts are released as open source to facilitate reproducible trachoma classification research.
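For readers unfamiliar with step (4), the Lanczos filter weights each input sample by sinc(x) · sinc(x/a) for |x| < a and by zero elsewhere. The sketch below resamples a one-dimensional signal with that kernel. It is a minimal illustration and not the paper's implementation: the window size a = 3, the border clamping, and the function names are assumptions.

```python
import math

def lanczos_kernel(x, a=3):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, else 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)

def resample_1d(samples, out_len, a=3):
    """Resample a 1-D signal to out_len samples with normalized Lanczos weights."""
    in_len = len(samples)
    scale = in_len / out_len
    stretch = max(scale, 1.0)               # widen the kernel when downsampling
    out = []
    for i in range(out_len):
        centre = (i + 0.5) * scale - 0.5    # output sample centre in input coordinates
        lo = math.floor(centre - a * stretch)
        hi = math.ceil(centre + a * stretch)
        total = 0.0
        weight_sum = 0.0
        for j in range(lo, hi + 1):
            w = lanczos_kernel((j - centre) / stretch, a)
            jj = min(max(j, 0), in_len - 1)  # clamp indices at the borders (assumed policy)
            total += w * samples[jj]
            weight_sum += w
        out.append(total / weight_sum)       # normalize so flat regions stay flat
    return out

shrunk = resample_1d([float(v) for v in range(10)], 4)
```

Because the filter is separable, a 2-D resize applies the same operation along rows and then along columns. In practice a library routine would be used instead, for example Pillow's `Image.resize` with its `LANCZOS` filter; whether the released code uses Pillow is not stated here.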

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces the OPTED dataset of preprocessed clinical eyelid photographs for trachoma research, constructed via a four-step pipeline that applies zero-shot SAM 3 segmentation using text prompts to isolate the tarsal conjunctiva, followed by background removal, bounding-box cropping with alignment, confidence-based quality filtering, and Lanczos resizing to 224x224. Prompt comparison across all 2,832 labeled images selects 'inner surface of eyelid with red tissue' as optimal (mean confidence 0.872, std 0.070, 99.5% detection rate), with the remaining images recovered by fallback prompts; the dataset, code, and artifacts are released openly.

Significance. If the extracted regions prove anatomically accurate, OPTED fills a clear gap by supplying the first openly available preprocessed trachoma dataset from a high-burden region (Ethiopia), directly supporting reproducible machine-learning pipelines for automated classification. The open release of code and data is a concrete strength that lowers barriers for follow-on work.

major comments (2)

[Prompt Selection Stage and Quality Filtering] The central claim that the pipeline reliably extracts the tarsal conjunctiva (99.5% detection, mean confidence 0.872) rests only on SAM 3 internal confidence scores plus an unspecified manual QA step; no overlap metrics (IoU, Dice, or boundary-error statistics) against expert-annotated ground-truth masks are reported on the 2,832 images, leaving potential systematic errors (e.g., consistent sclera inclusion or marginal-tissue exclusion) unquantified.
[Pipeline description (step 3)] The fallback mechanism for the 13 non-detected images is mentioned but not detailed (which prompts, how many images per fallback, final confidence distribution), so the completeness and uniformity of the released dataset cannot be fully assessed from the reported statistics alone.
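For reference, the overlap metrics named in the first major comment are simple to compute once expert masks exist. The sketch below takes a predicted mask and a ground-truth mask as same-shaped nested lists of 0/1 values. The function name and the convention for two empty masks are assumptions, and the example masks are invented.

```python
def iou_and_dice(pred, truth):
    """Return (IoU, Dice) for two same-shaped binary masks."""
    inter = union = pred_sum = truth_sum = 0
    for prow, trow in zip(pred, truth):
        for p, t in zip(prow, trow):
            p, t = bool(p), bool(t)
            inter += p and t        # pixel in both masks
            union += p or t         # pixel in at least one mask
            pred_sum += p
            truth_sum += t
    if union == 0:
        # Both masks empty: treated as perfect agreement (a convention, not a rule).
        return 1.0, 1.0
    return inter / union, 2 * inter / (pred_sum + truth_sum)

# Invented 2x3 masks for illustration.
pred = [[1, 1, 0],
        [0, 1, 0]]
truth = [[1, 0, 0],
         [0, 1, 1]]
iou, dice = iou_and_dice(pred, truth)  # 2/4 and 4/6
```

Here the two masks share 2 pixels and cover 4 between them, so IoU is 0.5 and Dice is 4/6. A consistent bias such as sclera inclusion would show up as systematically low IoU with the predicted mask larger than the ground truth.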

minor comments (2)

[Methods] The manuscript states that 'manual quality assurance verifies outputs' but provides no quantitative summary (e.g., number of images flagged, inter-rater agreement if multiple reviewers) or decision criteria, which would improve reproducibility.
[Abstract and Data Release] Figure captions and the release statement could explicitly list all released artifacts (e.g., the prompt-comparison table, raw SAM masks, final cropped images) to match the abstract claim of 'all experimental artifacts'.
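The inter-rater agreement requested in the first minor comment is commonly summarized with Cohen's kappa, which corrects raw agreement for agreement expected by chance. The sketch below is a minimal version for two reviewers. The accept/reject labels and the ten example decisions are hypothetical and do not come from the paper.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must label the same non-empty set of items")
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    # Chance agreement: product of each rater's marginal label frequencies.
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)

# Hypothetical QA decisions on ten outputs.
reviewer_1 = ["ok", "ok", "ok", "bad", "ok", "bad", "ok", "ok", "bad", "ok"]
reviewer_2 = ["ok", "ok", "bad", "bad", "ok", "bad", "ok", "ok", "ok", "ok"]
kappa = cohens_kappa(reviewer_1, reviewer_2)
```

In this toy case the reviewers agree on 8 of 10 items (0.80) against a chance level of 0.58, giving a kappa of about 0.52.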

Simulated Authors' Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on our manuscript. We address each major comment below and have revised the manuscript to improve transparency where feasible.

Referee: [Prompt Selection Stage and Quality Filtering] The central claim that the pipeline reliably extracts the tarsal conjunctiva (99.5% detection, mean confidence 0.872) rests only on SAM 3 internal confidence scores plus an unspecified manual QA step; no overlap metrics (IoU, Dice, or boundary-error statistics) against expert-annotated ground-truth masks are reported on the 2,832 images, leaving potential systematic errors (e.g., consistent sclera inclusion or marginal-tissue exclusion) unquantified.

Authors: We acknowledge that the validation relies on SAM 3 confidence scores and manual QA rather than quantitative overlap metrics. No expert-annotated segmentation masks were available for the 2,832 images, as the source data consists of clinical photographs labeled only for trachoma grading (TF/TI/etc.), not pixel-level masks. Creating such ground truth would have required substantial additional expert annotation effort outside the scope of this dataset-release paper. The manual QA was performed by two ophthalmologists who reviewed outputs for anatomical fidelity, specifically checking against inclusion of sclera or exclusion of marginal conjunctival tissue. We have revised the manuscript to provide a detailed description of the QA protocol, sample size reviewed, and decision criteria. While we agree that IoU/Dice would be ideal, they are not feasible without new annotations. revision: partial
Referee: [Pipeline description (step 3)] The fallback mechanism for the 13 non-detected images is mentioned but not detailed (which prompts, how many images per fallback, final confidence distribution), so the completeness and uniformity of the released dataset cannot be fully assessed from the reported statistics alone.

Authors: We agree that additional detail on the fallback is needed for full reproducibility. In the revised manuscript we have expanded the pipeline description to specify the exact fallback prompt used ('tarsal conjunctiva'), confirm it was applied uniformly to all 13 images, and report the resulting per-image confidence scores and their distribution. This information is now included in the main text and supplementary materials to allow readers to assess dataset uniformity. revision: yes
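The fallback logic described in this simulated response can be sketched as a loop over an ordered prompt list. Everything model-related here is a stand-in: `segment` is a hypothetical callable that returns `(mask, confidence)` or `None`, `fake_segment` fakes it for demonstration, and the file names and confidence values are invented. The prompt strings are the ones quoted on this page.

```python
PROMPTS = [
    "inner surface of eyelid with red tissue",  # primary prompt
    "tarsal conjunctiva",                       # fallback named in the simulated rebuttal
]

def segment_with_fallback(image, segment, prompts, min_conf=0.0):
    """Try prompts in order; return (prompt, mask, confidence) for the first hit.

    `segment(image, prompt)` is assumed to return (mask, confidence) or None.
    """
    for prompt in prompts:
        result = segment(image, prompt)
        if result is None:
            continue
        mask, conf = result
        if conf >= min_conf:
            return prompt, mask, conf
    return None  # no prompt produced a usable mask

def fake_segment(image, prompt):
    """Stand-in for a real model call (hypothetical, for demonstration only)."""
    if prompt == PROMPTS[0] and image == "hard_case.jpg":
        return None  # simulate a missed detection on the primary prompt
    conf = 0.8 if prompt == PROMPTS[0] else 0.6
    return [[True]], conf

easy = segment_with_fallback("easy.jpg", fake_segment, PROMPTS)
hard = segment_with_fallback("hard_case.jpg", fake_segment, PROMPTS)
```

Recording which prompt produced each mask, as the returned tuple does, is what would let readers check dataset uniformity across the 13 recovered images.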

Circularity Check

0 steps flagged

No circularity: empirical pipeline using external SAM 3 with direct prompt evaluation

The paper presents a four-step preprocessing pipeline that applies the external pre-trained SAM 3 model in zero-shot mode to 2,832 images. Prompt selection is performed by direct comparison of mean confidence scores across five candidate prompts on the full set of images, followed by quality filtering and manual QA. No equations, fitted parameters, or self-citations are used to derive the reported metrics (mean confidence 0.872, 99.5% detection rate); these are presented as observed empirical results. The central claims rest on the external SAM 3 model and on direct prompt comparison over the full image set, not on any self-referential definition or prediction that reduces to the inputs by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The central claim rests on the generalization ability of the pre-trained SAM 3 model to clinical eyelid photographs and on the sufficiency of confidence-based filtering plus manual QA to ensure dataset quality.

axioms (1)

Domain assumption: SAM 3 performs reliable zero-shot text-prompt segmentation on clinical eye photographs.
Invoked in the description of the segmentation step without additional fine-tuning or domain-specific validation beyond the reported prompt comparison.

Reference graph

Works this paper leans on

17 extracted references · 17 canonical work pages · 2 internal anchors

[1] World Health Organization, “Trachoma: Key facts,” WHO, Nov. 2025. [Online]. Available: https://www.who.int/news-room/fact-sheets/detail/trachoma

[2] A. W. Solomon et al., “Trachoma,” Nat. Rev. Dis. Primers, vol. 8, no. 32, pp. 1–18, 2022

[3] A. R. Last et al., “Cluster randomised controlled trial of double-dose azithromycin mass drug administration, facial cleanliness and fly control measures for trachoma control in Oromia, Ethiopia: The Stronger SAFE trial protocol,” BMJ Open, vol. 14, no. 12, p. e084478, 2024

[4] M. J. Burton and D. C. W. Mabey, “The global burden of trachoma: a review,” PLoS Negl. Trop. Dis., vol. 3, no. 10, p. e460, 2009

[5] S. T. Sherief, C. Macleod, G. Gigar, H. Godefay, A. Abraha, M. Dejene, and A. W. Solomon, “The prevalence of trachoma in Tigray Region, Northern Ethiopia: Results of 11 population-based prevalence surveys completed as part of the Global Trachoma Mapping Project,” Ophthalmic Epidemiol., vol. 23, no. sup1, pp. 94–99, 2016

[6] World Health Organization, “Ending the neglect to attain the Sustainable Development Goals: A road map for neglected tropical diseases 2021–2030,” WHO, 2021

[7] M. C. Kim et al., “Sensitivity and specificity of computer vision classification of eyelid photographs for programmatic trachoma assessment,” PLoS ONE, vol. 14, no. 2, pp. 1–12, 2019

[8] D. Socia, C. J. Brady, S. K. West, and R. C. Cockrell, “Detection of trachoma using machine learning approaches,” PLoS Negl. Trop. Dis., vol. 16, pp. 1–15, 2022

[9] Y. Pan, W. Lan, and B. Xu, “Active trachoma: enhancing image classification using pretrained SOTA models and explainable AI,” Front. Bacteriol., vol. 3, p. 1333641, 2024

[10] M. S. Zewudie, S. Xiong, X. Yu, and X. Wu, “Feature map quantification: An efficient approach for active trachoma image classification,” Comput. Biol. Med., 2025

[11] A. S. Joye et al., “Computer vision identification of trachomatous inflammation–follicular using deep learning,” Cornea, vol. 44, no. 5, pp. 613–618, 2025

[12] S. L. Phung, A. Bouzerdoum, and D. Chai, “Skin segmentation using color pixel classification: analysis and comparison,” IEEE Trans. Pattern Anal. Mach. Intell., vol. 27, no. 1, pp. 148–154, 2005

[13] A. Kirillov et al., “Segment anything,” arXiv preprint arXiv:2304.02643, 2023

[14] N. Carion, L. Gustafson, Y.-T. Hu, S. Debnath, R. Hu, D. Surís, C. Ryali, K. V. Alwala, H. Khedr et al., “SAM 3: Segment Anything with Concepts,” arXiv preprint arXiv:2511.16719, 2025

[15] B. Yenegeta and Y. Assabie, “TrachomaNet: Detection and grading of trachoma using texture feature based deep convolutional neural network,” Multimed. Tools Appl., vol. 82, pp. 4209–4234, 2023

[16] K. Turkowski, “Filters for common resampling tasks,” in Graphics Gems, A. S. Glassner, Ed. Academic Press, 1990, pp. 147–165

[17] B. Thylefors, C. R. Dawson, B. R. Jones, S. K. West, and H. R. Taylor, “A simple system for the assessment of trachoma and its complications,” Bull. World Health Organ., vol. 65, no. 4, pp. 477–483, 1987
