2017 Robotic Instrument Segmentation Challenge

Alex Shvets; Danail Stoyanov; Huoling Luo; Iro Laina; Jian Yang; Lena Maier-Hein; Luis Herrera; Mahdi Azizian; Max Allan; Nicola Rieke

arxiv: 1902.06426 · v2 · pith:NT6FR7NUnew · submitted 2019-02-18 · 💻 cs.CV

Max Allan, Alex Shvets, Thomas Kurmann, Zichen Zhang, Rahul Duggal, Yun-Hsuan Su, Nicola Rieke, Iro Laina, Niveditha Kalavakonda, Sebastian Bodenstedt, Luis Herrera, Wenqi Li, Vladimir Iglovikov, Huoling Luo, Jian Yang, Danail Stoyanov, Lena Maier-Hein, Stefanie Speidel, Mahdi Azizian

keywords: robotic, segmentation, challenge, datasets, however, instrument, limited, type

In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison. However, this type of approach has had limited translation to problems in robotic-assisted surgery, as this field has never established the same level of common datasets and benchmarking methods. In 2015 a sub-challenge was introduced at the EndoVis workshop in which a set of robotic images was provided with annotations generated automatically from robot forward kinematics. However, this dataset suffered from limited background variation, a lack of complex motion and inaccuracies in the annotation. In this work we present the results of the 2017 challenge on robotic instrument segmentation, in which 10 teams participated in binary, parts-based and type-based segmentation of articulated da Vinci robotic instruments.

Forward citations

Cited by 12 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Unlocking Positive Transfer in Incrementally Learning Surgical Instruments: A Self-reflection Hierarchical Prompt Framework
cs.CV 2026-04 conditional novelty 7.0

A hierarchical prompt tree with self-reflection graph propagation enables positive forward and backward knowledge transfer in incremental surgical instrument segmentation, improving over baselines by more than 5% and ...
S2M-Net: Spectral-Spatial Mixing for Medical Image Segmentation with Morphology-Aware Adaptive Loss
cs.CV 2026-01 unverdicted novelty 7.0

S2M-Net achieves state-of-the-art Dice scores on 16 medical datasets across 8 modalities using a 4.7M-parameter spectral-spatial mixer and morphology-aware adaptive loss, outperforming transformers with 3.5-6x fewer p...
SurgiSR4K: A High-Resolution Endoscopic Video Dataset for Robotic-Assisted Minimally Invasive Procedures
eess.IV 2025-06 unverdicted novelty 7.0

Introduces the first publicly accessible native 4K resolution endoscopic video dataset for robotic-assisted minimally invasive procedures.
StereoMamba: Real-time and Robust Intraoperative Stereo Disparity Estimation via Long-range Spatial Dependencies
cs.CV 2025-04 unverdicted novelty 7.0

StereoMamba introduces a Mamba-based architecture with FE-Mamba and MFF modules for real-time stereo disparity estimation in RAMIS, reporting EPE of 2.64 px, depth MAE of 2.55 mm, and 21.28 FPS on the SCARED benchmark...
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
eess.IV 2024-01 unverdicted novelty 7.0

U-Mamba is a hybrid CNN-SSM architecture that outperforms prior CNN and Transformer networks on biomedical image segmentation tasks by efficiently modeling long-range dependencies.
Incorporating Temporal Prior from Motion Flow for Instrument Segmentation in Minimally Invasive Surgery Video
cs.CV 2019-07 unverdicted novelty 7.0

A temporal prior from inter-frame motion flow is injected as initialization into an attention pyramid network to guide coarse-to-fine instrument segmentation in MIS videos, exceeding prior results on the EndoVis datas...
RoboSurg-VQA: A Multimodal Benchmark for Surgical Segmentation-Aware Visual Question Answering
cs.CV 2026-05 unverdicted novelty 6.0

RoboSurg-VQA is a new segmentation-aware VQA benchmark created by repurposing public surgical datasets with fixed clinically motivated questions and closed answer sets.
USEMA: a Scalable Efficient Mamba Like Attention for Medical Image Segmentation
cs.CV 2026-05 unverdicted novelty 5.0

USEMA is a hybrid UNet architecture merging CNNs with scalable Mamba-like attention (SEMA) that achieves better efficiency than transformers and superior segmentation accuracy than pure CNN or Mamba models across medi...
Surgical Visual Understanding (SurgVU) Dataset
cs.CV 2025-01 unverdicted novelty 5.0

Releases the SurgVU dataset of surgical videos and labels to enable machine learning research in surgical data science.
SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
cs.CV 2024-07 accept novelty 5.0

SegSTRONG-C provides a new benchmark where top models reach 0.9394 DSC and 0.9301 NSD on corrupted surgical tool segmentation tests, showing conventional techniques help but calling for more innovative robustness methods.
Attention Is not Everything: Efficient Alternatives for Vision
cs.CV 2026-04 unverdicted novelty 3.0

A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.
Benchmarking CNN- and Transformer-Based Models for Surgical Instrument Segmentation in Robotic-Assisted Surgery
cs.CV 2026-04 unverdicted novelty 2.0

DeepLabV3 matches SegFormer performance in multi-class surgical instrument segmentation while convolutional baselines like UNet remain competitive on the SAR-RARP50 dataset.