Surgical Visual Understanding (SurgVU) Dataset
Pith reviewed 2026-05-23 05:55 UTC · model grok-4.3
The pith
A dataset of robotic surgery videos paired with labels is released to support machine learning work in surgical data science.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
We present a large dataset of surgical videos and their accompanying labels for foundational work in surgical data science. The videos come from robotic-assisted surgeries and carry labels suited to multiple tasks. A validation set for tool detection and a sample set of question-answer pairs for visual question answering are also supplied. The dataset is made available through public links so that it can serve as a shared resource for future research.
What carries the argument
The SurgVU dataset of surgical videos and labels carries the argument: it supplies the raw visual data and annotations needed to train and test models on surgical scenes.
If this is right
- Models can be trained to detect surgical tools directly from the labeled video frames.
- Question-answer pairs enable development of systems that answer queries about surgical scenes.
- The dataset supplies a common benchmark that different research groups can use to compare methods.
- Public release of both videos and labels allows reproduction and extension of experiments in surgical data science.
Where Pith is reading between the lines
- The same videos could be reused to study temporal patterns such as procedure phase recognition if additional time-stamped labels were added later.
- Combining SurgVU with non-surgical video datasets might test whether general video models transfer to the surgical domain.
- Hospitals could use the dataset to prototype privacy-preserving training pipelines before applying them to their own private recordings.
Load-bearing premise
The dataset, although curated for a particular set of scientific challenges, is general enough to be used for a broad range of machine learning questions.
What would settle it
A controlled test in which models trained on the SurgVU training split show no improvement over random baselines when evaluated on the provided public validation set for tool detection would indicate the labels do not support the intended tasks.
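A minimal sketch of how such a comparison might be scored, assuming boxes are given as (x_min, y_min, x_max, y_max) tuples and that per-frame F1 at an IoU threshold of 0.5 is an acceptable stand-in metric. Neither the box format nor the metric is taken from the paper, and the helper names `iou`, `frame_f1`, and `random_boxes` are hypothetical.

```python
import random
from typing import List, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def frame_f1(pred: List[Box], truth: List[Box], thresh: float = 0.5) -> float:
    """F1 for one frame, using greedy one-to-one matching at an IoU threshold."""
    unmatched = list(truth)
    true_pos = 0
    for p in pred:
        best = max(unmatched, key=lambda t: iou(p, t), default=None)
        if best is not None and iou(p, best) >= thresh:
            unmatched.remove(best)
            true_pos += 1
    false_pos = len(pred) - true_pos
    false_neg = len(unmatched)
    denom = 2 * true_pos + false_pos + false_neg
    # An empty prediction on an empty frame counts as perfect.
    return 2 * true_pos / denom if denom else 1.0


def random_boxes(n: int, width: float, height: float, rng: random.Random) -> List[Box]:
    """Random baseline: n boxes with uniformly sampled corners."""
    boxes = []
    for _ in range(n):
        x0, x1 = sorted(rng.uniform(0, width) for _ in range(2))
        y0, y1 = sorted(rng.uniform(0, height) for _ in range(2))
        boxes.append((x0, y0, x1, y1))
    return boxes
```

Averaging `frame_f1` over the validation frames, once for a trained model's predictions and once for `random_boxes` seeded with a fixed `random.Random`, gives the two numbers the test above compares.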
Original abstract
Owing to recent advances in machine learning and the ability to harvest large amounts of data during robotic-assisted surgeries, surgical data science is ripe for foundational work. We present a large dataset of surgical videos and their accompanying labels for this purpose. We describe how the data was collected and some of its unique attributes. Multiple example problems are outlined. Although the dataset was curated for a particular set of scientific challenges (in an accompanying paper), it is general enough to be used for a broad range machine learning questions. Our hope is that this dataset exposes the larger machine learning community to the challenging problems within surgical data science, and becomes a touch-stone for future research. The videos are available at https://storage.googleapis.com/isi-surgvu/surgvu24_videos_only.zip, the labels at https://storage.googleapis.com/isi-surgvu/surgvu24_labels_updated_v2.zip, a validation set for tool detection problem at https://storage.googleapis.com/isi-surgvu/cat1_test_set_public.zip, and a sample set of question & answer pairs dataset for surgical visual question answering at https://storage.googleapis.com/isi-surgvu/SURGVU25_cat_2_sample_set_public.zip.
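The four archives named in the abstract could be fetched with a short script along these lines. This is a sketch using only the Python standard library; the URLs are copied from the abstract, while the dictionary keys, the function names, and the local directory layout are illustrative assumptions.

```python
import urllib.request
import zipfile
from pathlib import Path
from urllib.parse import urlparse

# URLs copied from the abstract; the keys are illustrative names, not official ones.
SURGVU_ARCHIVES = {
    "videos": "https://storage.googleapis.com/isi-surgvu/surgvu24_videos_only.zip",
    "labels": "https://storage.googleapis.com/isi-surgvu/surgvu24_labels_updated_v2.zip",
    "tool_detection_val": "https://storage.googleapis.com/isi-surgvu/cat1_test_set_public.zip",
    "vqa_sample": "https://storage.googleapis.com/isi-surgvu/SURGVU25_cat_2_sample_set_public.zip",
}


def archive_name(url: str) -> str:
    """Return the file name at the end of a URL path."""
    return Path(urlparse(url).path).name


def fetch_and_extract(key: str, root: Path) -> Path:
    """Download one archive if it is not already present, then unzip it under root/key."""
    url = SURGVU_ARCHIVES[key]
    root.mkdir(parents=True, exist_ok=True)
    zip_path = root / archive_name(url)
    if not zip_path.exists():
        urllib.request.urlretrieve(url, str(zip_path))
    out_dir = root / key
    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(out_dir)
    return out_dir
```

For example, `fetch_and_extract("labels", Path("surgvu_data"))` would download and unpack only the labels. The video archive may be very large, so it is probably worth starting with the smaller subsets.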
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript announces the release of the Surgical Visual Understanding (SurgVU) Dataset consisting of surgical videos from robotic-assisted procedures together with accompanying labels. It describes the collection process and unique attributes of the data, outlines multiple example problems, and supplies public download links for the full videos, labels, a validation set for tool detection, and a sample set for surgical visual question answering. The authors note that the dataset was curated for specific challenges in an accompanying paper but assert that it remains general enough for a broad range of machine learning questions, with the goal of engaging the larger ML community in surgical data science.
Significance. If the dataset is released and documented as described, the contribution is a publicly accessible, large-scale labeled surgical video resource that can support benchmarking and foundational modeling in computer vision and surgical data science. The inclusion of task-specific subsets (tool detection validation and VQA samples) and direct Google Cloud links strengthens accessibility and reproducibility. This type of data descriptor can serve as a touchstone resource for the field.
minor comments (2)
- [Abstract] Abstract: the phrase 'a broad range machine learning questions' is missing 'of' and should read 'a broad range of machine learning questions'.
- [Abstract] Abstract: 'touch-stone' should be written as the single word 'touchstone'.
Simulated Author's Rebuttal
We thank the referee for the positive summary, significance assessment, and recommendation of minor revision for our manuscript describing the SurgVU dataset release. No specific major comments were listed in the report.
Circularity Check
No significant circularity: dataset release paper with no derivations or fitted claims
The paper is a data release announcement describing collection of surgical videos and labels, providing download links, and outlining example problems. No equations, predictions, parameters, or derivation chains exist. The generality statement is an assertion, not a tested result derived from the data. No self-citations are load-bearing for any mathematical claim. The central contribution (public data availability) is directly supported by external links and does not reduce to any internal construction or fit.
Forward citations
Cited by 1 Pith paper
- SurgiSR4K: A High-Resolution Endoscopic Video Dataset for Robotic-Assisted Minimally Invasive Procedures. Introduces the first publicly accessible native 4K resolution endoscopic video dataset for robotic-assisted minimally invasive procedures.