2017 Robotic Instrument Segmentation Challenge
In mainstream computer vision and machine learning, public datasets such as ImageNet, COCO and KITTI have helped drive enormous improvements by enabling researchers to understand the strengths and limitations of different algorithms via performance comparison. However, this approach has had limited translation to problems in robot-assisted surgery, as the field has never established the same level of common datasets and benchmarking methods. In 2015 a sub-challenge was introduced at the EndoVis workshop in which a set of robotic images was provided with annotations generated automatically from robot forward kinematics. However, this dataset suffered from limited background variation, a lack of complex motion and inaccuracies in the annotation. In this work we present the results of the 2017 challenge on robotic instrument segmentation, in which 10 teams participated in binary, parts-based and type-based segmentation of articulated da Vinci robotic instruments.
Forward citations
Cited by 12 Pith papers
- Unlocking Positive Transfer in Incrementally Learning Surgical Instruments: A Self-reflection Hierarchical Prompt Framework
  A hierarchical prompt tree with self-reflection graph propagation enables positive forward and backward knowledge transfer in incremental surgical instrument segmentation, improving over baselines by more than 5% and ...
- S2M-Net: Spectral-Spatial Mixing for Medical Image Segmentation with Morphology-Aware Adaptive Loss
  S2M-Net achieves state-of-the-art Dice scores on 16 medical datasets across 8 modalities using a 4.7M-parameter spectral-spatial mixer and morphology-aware adaptive loss, outperforming transformers with 3.5-6x fewer p...
- SurgiSR4K: A High-Resolution Endoscopic Video Dataset for Robotic-Assisted Minimally Invasive Procedures
  Introduces the first publicly accessible native 4K resolution endoscopic video dataset for robotic-assisted minimally invasive procedures.
- StereoMamba: Real-time and Robust Intraoperative Stereo Disparity Estimation via Long-range Spatial Dependencies
  StereoMamba introduces a Mamba-based architecture with FE-Mamba and MFF modules for real-time stereo disparity estimation in RAMIS, reporting EPE of 2.64 px, depth MAE of 2.55 mm, and 21.28 FPS on the SCARED benchmark...
- U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation
  U-Mamba is a hybrid CNN-SSM architecture that outperforms prior CNN and Transformer networks on biomedical image segmentation tasks by efficiently modeling long-range dependencies.
- Incorporating Temporal Prior from Motion Flow for Instrument Segmentation in Minimally Invasive Surgery Video
  A temporal prior from inter-frame motion flow is injected as initialization into an attention pyramid network to guide coarse-to-fine instrument segmentation in MIS videos, exceeding prior results on the EndoVis datas...
- RoboSurg-VQA: A Multimodal Benchmark for Surgical Segmentation-Aware Visual Question Answering
  RoboSurg-VQA is a new segmentation-aware VQA benchmark created by repurposing public surgical datasets with fixed clinically motivated questions and closed answer sets.
- USEMA: a Scalable Efficient Mamba Like Attention for Medical Image Segmentation
  USEMA is a hybrid UNet architecture merging CNNs with scalable Mamba-like attention (SEMA) that achieves better efficiency than transformers and superior segmentation accuracy than pure CNN or Mamba models across medi...
- Surgical Visual Understanding (SurgVU) Dataset
  Releases the SurgVU dataset of surgical videos and labels to enable machine learning research in surgical data science.
- SegSTRONG-C: Segmenting Surgical Tools Robustly On Non-adversarial Generated Corruptions -- An EndoVis'24 Challenge
  SegSTRONG-C provides a new benchmark where top models reach 0.9394 DSC and 0.9301 NSD on corrupted surgical tool segmentation tests, showing conventional techniques help but calling for more innovative robustness methods.
- Attention Is not Everything: Efficient Alternatives for Vision
  A survey that taxonomizes non-Transformer vision models and evaluates their practical trade-offs across efficiency, scalability, and robustness.
- Benchmarking CNN- and Transformer-Based Models for Surgical Instrument Segmentation in Robotic-Assisted Surgery
  DeepLabV3 matches SegFormer performance in multi-class surgical instrument segmentation while convolutional baselines like UNet remain competitive on the SAR-RARP50 dataset.