Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

Chenxi Tao; Seung-Kyum Choi

arxiv: 2605.30581 · v2 · pith:VCXY7RQNnew · submitted 2026-05-28 · cs.CV · cs.AI · cs.RO

Pith reviewed 2026-06-29 07:40 UTC · model grok-4.3

classification: cs.CV · cs.AI · cs.RO

keywords: sim-to-real transfer, industrial inspection, CAD models, domain gap, anomaly detection, pose estimation, visual inspection, prior availability

The pith

Industrial visual sim-to-real transfer should be organized by the availability of CAD and other priors rather than by task-specific leaderboards.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability instead of treating it as a uniform transfer from synthetic to real images. It distinguishes three regimes: CAD-available settings, where explicit object geometry supports rendering, calibration, pose estimation, segmentation, and test-time verification; CAD-unavailable settings, which rely on normal-reference appearance, feature distributions, teacher-student residuals, or foundation features; and boundary-prior cases, which retain only partial geometric information. This taxonomy connects the pose-estimation and detection literature with the anomaly and surface-inspection literature, which are usually reviewed separately. Empirical anchors on T-LESS/BOP, MVTec AD, and VisA show that CAD render count alone does not determine success; source-distribution design, detector capacity, and small real calibration often matter more. The review concludes that deployment decisions should be grounded in the specific priors available rather than in a single cross-task leaderboard.

Core claim

Organizing the domain gap by prior availability—CAD-available regimes where geometry enables multiple operations and verification channels, CAD-unavailable regimes where geometry is replaced by appearance or feature-based normality, and boundary-prior regimes that preserve only part of the CAD role—explains differences in how transfer is achieved and verified across industrial visual tasks.

What carries the argument

The taxonomy of CAD-available, CAD-unavailable, and boundary-prior regimes that organizes the domain-gap problem according to what geometric or appearance evidence is accessible before deployment.
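The three regimes can be read as a decision rule over the evidence accessible before deployment. The following is a minimal sketch of that reading, not the authors' formalization: the `Priors` fields, the `Regime` names, and the precedence order (exact CAD first, then partial geometry) are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Regime(Enum):
    CAD_AVAILABLE = "cad-available"
    BOUNDARY_PRIOR = "boundary-prior"
    CAD_UNAVAILABLE = "cad-unavailable"


@dataclass(frozen=True)
class Priors:
    """Evidence accessible before deployment (hypothetical field names)."""
    exact_cad: bool = False          # explicit object geometry
    approximate_model: bool = False  # approximate model or template
    reference_views: bool = False    # reference views or semantic correspondences
    normal_references: bool = False  # normal-reference images (appearance only)


def classify_regime(p: Priors) -> Regime:
    # Exact geometry supports rendering and test-time geometric checks.
    if p.exact_cad:
        return Regime.CAD_AVAILABLE
    # Partial geometric information preserves only part of the CAD role.
    if p.approximate_model or p.reference_views:
        return Regime.BOUNDARY_PRIOR
    # Otherwise geometry is replaced by appearance or feature-based normality.
    return Regime.CAD_UNAVAILABLE
```

For example, `classify_regime(Priors(normal_references=True))` returns `Regime.CAD_UNAVAILABLE`, since normal-reference images carry appearance but no geometry.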

Load-bearing premise

The distinctions between CAD-available, CAD-unavailable, and boundary-prior settings capture the essential differences that affect transfer performance in industrial settings.

What would settle it

An experiment that controls for render quality, data volume, and detector capacity and still finds transfer performance correlating more strongly with task category than with prior availability would undermine the proposed taxonomy.

Figures

Figures reproduced from arXiv: 2605.30581 by Chenxi Tao, Seung-Kyum Choi.

**Figure 2.** Mechanism view of the prior-availability distinction. CAD-available settings can use explicit …

**Figure 3.** Dataset regimes used to anchor Section 6. T-LESS/BOP …

**Figure 4.** CAD-as-renderer transfer on real T-LESS images …

**Figure 5.** Precision-recall diagnostics for representative CAD-as-renderer detector runs on T…

**Figure 6.** CAD-at-test-time geometry on T-LESS/BOP …

**Figure 7.** CAD-unavailable anomaly-detection anchors on MVTec AD …

**Figure 8.** Category-level CAD-unavailable diagnostics for MVTec AD …

**Figure 9.** Selected-category normal-reference budget curves for the CAD-unavailable branch. Patch…

**Figure 10.** YOLOv8 [103] CAD-as-renderer training diagnostics for B0–B3 on T-LESS/BOP [3, 4]. Each row corresponds to one detector configuration from …

**Figure 11.** YOLOv8 [103] CAD-as-renderer training diagnostics for B4–B6 on T-LESS/BOP [3, 4]. The same real-validation target is used across runs. B4 adds real background negatives to the randomized synthetic source, B5 increases detector capacity, and B6 combines the larger detector with small real calibration. The curves support the main-text interpretation that source design, model capacity, and limited real calib…

**Figure 12.** Representative real-validation prediction grids for the seven CAD-as-renderer detector …

**Figure 13.** Normalized confusion matrices for the YOLOv8 …

**Figure 14.** Training label distributions for the CAD-as-renderer detector runs on T-LESS/BOP …

**Figure 15.** Additional CAD-at-test-time qualitative diagnostics. The contact sheet shows Mega…

**Figure 16.** Additional CAD-unavailable qualitative diagnostics for the synthetic-anomaly probe.

Original abstract

Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions. A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under different sensors, lighting, materials, fixtures, calibration, production variation, and rare defect modes. This review reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability. We distinguish CAD-available settings, where explicit object geometry can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification; CAD-unavailable settings, where geometry is replaced by normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomaly assumptions, foundation features, or vision-language priors; and boundary-prior settings, where approximate models, templates, reference views, or semantic correspondences preserve only part of the CAD role. This framing connects CAD-based detection and 6D pose-estimation literature with industrial anomaly and surface-inspection literature that is usually reviewed separately. To make the taxonomy concrete, we use empirical anchors on T-LESS/BOP, MVTec AD, and VisA. The anchors show that CAD render count alone does not close transfer; source-distribution design, detector capacity, and small real calibration can matter more. They also show that CAD at test time creates a distinct verification channel through mask, pose, and depth consistency, whereas CAD-unavailable inspection relies on calibrated normality and feature deviation. The review therefore argues against a single cross-task leaderboard and instead asks what prior grounds the deployment decision.
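The verification channel that CAD at test time opens up (mask, pose, and depth consistency) can be illustrated with a small render-and-compare check. This is a sketch under stated assumptions, not the paper's pipeline: it assumes a renderer has already produced a binary mask and a depth map for the pose hypothesis, represents images as nested lists, and uses threshold values (`min_iou`, `min_depth_ratio`, `tol_mm`) that are hypothetical placeholders.

```python
def mask_iou(rendered_mask, observed_mask):
    """Intersection over union of two same-sized binary masks (nested lists)."""
    inter = union = 0
    for r_row, o_row in zip(rendered_mask, observed_mask):
        for r, o in zip(r_row, o_row):
            inter += int(bool(r) and bool(o))
            union += int(bool(r) or bool(o))
    return inter / union if union else 0.0


def depth_inlier_ratio(rendered_depth, observed_depth, rendered_mask, tol_mm=5.0):
    """Fraction of rendered-object pixels whose observed depth agrees within tol_mm."""
    inliers = total = 0
    for rd_row, od_row, m_row in zip(rendered_depth, observed_depth, rendered_mask):
        for rd, od, m in zip(rd_row, od_row, m_row):
            if m:
                total += 1
                inliers += int(abs(rd - od) <= tol_mm)
    return inliers / total if total else 0.0


def verify_pose(rendered_mask, observed_mask, rendered_depth, observed_depth,
                min_iou=0.7, min_depth_ratio=0.7, tol_mm=5.0):
    """Accept a pose hypothesis only if both consistency checks pass.

    All thresholds are hypothetical placeholders, not values from the paper.
    """
    iou = mask_iou(rendered_mask, observed_mask)
    ratio = depth_inlier_ratio(rendered_depth, observed_depth, rendered_mask, tol_mm)
    return iou >= min_iou and ratio >= min_depth_ratio
```

The point of the sketch is that both checks need the CAD model to produce the rendered mask and depth; an appearance-only method has nothing comparable to render, which is why the abstract treats this as a distinct channel.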

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

A review that usefully links CAD pose work with anomaly detection via a prior-availability taxonomy, but its anchors mix mismatched tasks so the claimed distinctions are suggestive rather than isolated.

This paper reviews industrial visual sim-to-real by organizing it around prior availability: CAD-available cases where geometry supports rendering and test-time checks, CAD-unavailable cases that rely on normal references or feature deviations, and boundary cases with partial templates. It pulls together 6D pose and anomaly detection literatures that are usually kept separate.

It does a solid job showing that CAD render volume by itself does not close the gap. The anchors on T-LESS/BOP, MVTec AD, and VisA illustrate that source design, detector capacity, and limited real calibration often matter more. It also makes clear that CAD at test time supplies an extra verification route through mask, pose, and depth consistency that pure appearance methods lack.

The main limitation is that the supporting examples come from disjoint tasks with different metrics and object sets. There is no controlled within-task comparison that holds the detector, training, and test distribution fixed while toggling only the presence of CAD. That leaves open whether the taxonomy axis drives the differences or whether task semantics do.
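The missing control can be stated as an experiment grid in which every run with CAD has a twin that differs only in that one flag. The sketch below is illustrative only: `RunConfig`, its field names, and the example values are hypothetical and do not correspond to configurations reported in the paper.

```python
from dataclasses import dataclass, replace
from itertools import product


@dataclass(frozen=True)
class RunConfig:
    detector: str        # held fixed within a pair
    train_recipe: str    # held fixed within a pair
    test_split: str      # held fixed within a pair
    cad_available: bool  # the only factor toggled within a pair


def matched_pairs(detectors, train_recipes, test_splits):
    """Return (with_cad, without_cad) pairs that differ only in cad_available."""
    pairs = []
    for det, recipe, split in product(detectors, train_recipes, test_splits):
        with_cad = RunConfig(det, recipe, split, cad_available=True)
        without_cad = replace(with_cad, cad_available=False)
        pairs.append((with_cad, without_cad))
    return pairs
```

Comparing results within each pair, rather than across datasets, is what would separate the effect of the prior from the effect of task semantics.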

The work is aimed at practitioners choosing methods for factory inspection or robotics who need to match their available priors to deployment constraints. It is worth sending to peer review because the framing is practical and the cross-literature connections are reasonable, even if the evidence remains illustrative. A referee could usefully ask for tighter comparisons on matched tasks.

Referee Report

1 major / 2 minor

Summary. The paper is a literature review that reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability. It distinguishes CAD-available regimes (explicit geometry supporting rendering, calibration, pose estimation, and geometric verification), CAD-unavailable regimes (relying on normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomalies, foundation features, or vision-language priors), and boundary-prior regimes (approximate models or partial correspondences). The review connects 6D pose estimation and anomaly detection literatures, uses empirical anchors from T-LESS/BOP, MVTec AD, and VisA to illustrate that CAD render count alone does not close transfer gaps while source-distribution design, detector capacity, and small real calibration can matter more, and argues against a single cross-task leaderboard in favor of deployment decisions grounded in available priors.

Significance. If the taxonomy holds, the review supplies a useful organizational framework that bridges two subfields typically reviewed in isolation and shifts evaluation emphasis from generic leaderboards toward prior-specific verification channels (geometric consistency vs. calibrated normality). The observation that render count is not decisive and that small real calibration can be more impactful offers practical guidance for industrial deployment.
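To make "calibrated normality" concrete, the sketch below scores a test feature by its distance to the nearest normal-reference feature and sets the decision threshold from held-out normal samples. It is a deliberately simplified stand-in for the feature-deviation methods the review surveys: the plain nearest-neighbor distance, the quantile rule, and all function names are assumptions made for illustration.

```python
import math


def anomaly_score(feature, normal_bank):
    """Euclidean distance from a feature to its nearest normal-reference feature."""
    return min(math.dist(feature, ref) for ref in normal_bank)


def calibrate_threshold(held_out_normal, normal_bank, quantile=0.95):
    """Pick the score at the given quantile of held-out normal samples.

    The quantile value is a hypothetical choice, not one taken from the paper.
    """
    scores = sorted(anomaly_score(f, normal_bank) for f in held_out_normal)
    idx = min(len(scores) - 1, math.ceil(quantile * len(scores)) - 1)
    return scores[max(idx, 0)]


def is_anomalous(feature, normal_bank, threshold):
    """Flag a feature whose deviation from normality exceeds the threshold."""
    return anomaly_score(feature, normal_bank) > threshold
```

Nothing here refers to object geometry: the decision rests entirely on how far a feature lies from the calibrated normal set, which is the contrast with the geometric-consistency channel drawn in the paragraph above.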

major comments (1)

[Empirical anchors discussion (referenced in Abstract)] Empirical anchors (T-LESS/BOP for 6D pose vs. MVTec AD/VisA for anomaly detection): these datasets belong to disjoint problem classes with non-overlapping metrics, object sets, and protocols. The manuscript states that the anchors 'show that CAD render count alone does not close transfer' and that 'prior type drives performance,' yet no within-task controlled comparison is described that holds detector, training regime, and test distribution fixed while varying only prior availability. Consequently the observed differences could be attributable to task semantics rather than the taxonomy axis, weakening support for the central claim that the CAD-available / CAD-unavailable distinction precludes a single cross-task leaderboard.

minor comments (2)

[Taxonomy definition] The boundary-prior category is introduced but its operational distinction from CAD-unavailable settings is illustrated only briefly; additional concrete examples (e.g., specific reference-view or template methods) would sharpen the taxonomy.
[Introduction / Related work] The review cites standard datasets but does not report the search strategy, inclusion criteria, or total number of papers surveyed; adding a methods paragraph would improve reproducibility of the literature mapping.

Simulated Authors' Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comments. We address the major comment on the empirical anchors below.

Referee: Empirical anchors (T-LESS/BOP for 6D pose vs. MVTec AD/VisA for anomaly detection): these datasets belong to disjoint problem classes with non-overlapping metrics, object sets, and protocols. The manuscript states that the anchors 'show that CAD render count alone does not close transfer' and that 'prior type drives performance,' yet no within-task controlled comparison is described that holds detector, training regime, and test distribution fixed while varying only prior availability. Consequently the observed differences could be attributable to task semantics rather than the taxonomy axis, weakening support for the central claim that the CAD-available / CAD-unavailable distinction precludes a single cross-task leaderboard.

Authors: We acknowledge that a controlled experiment holding all factors fixed except prior availability would strengthen the causal interpretation. The manuscript uses the anchors to illustrate that in practice, across established benchmarks, factors like source distribution design and small real calibration appear more impactful than render count alone, and that the verification mechanisms differ fundamentally (geometric vs. normality-based). The argument against a single leaderboard rests primarily on these differing verification channels rather than on cross-task performance comparisons. We will add a clarification in the revised manuscript that the anchors are meant as illustrative examples from the respective literatures and do not constitute a controlled ablation across tasks.

revision: partial

Circularity Check

0 steps flagged

No circularity: literature review with external anchors

full rationale

The paper is a literature review proposing a taxonomy (CAD-available / CAD-unavailable / boundary-prior) to organize existing sim-to-real work. It contains no equations, fitted parameters, predictions, or derivations. The central claim rests on cited empirical anchors (T-LESS/BOP, MVTec AD, VisA) drawn from the external literature rather than self-generated results or self-citation chains. No step reduces by construction to the paper's own inputs; the framework is an organizational lens applied to independent prior work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is a review and introduces no new free parameters, axioms beyond standard domain knowledge, or invented entities. The taxonomy itself is the contribution.

axioms (1)

domain assumption: CAD models provide explicit object geometry that can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification. Stated in the abstract as the basis for CAD-available settings.

Reference graph

Works this paper leans on

103 extracted references · 17 canonical work pages

[1] Fabio Muratore, Christian Eilers, Michael Gienger, and Jan Peters. Robot learning from randomized simulations: A review. Frontiers in Robotics and AI, 9:799893, 2022. doi: 10.3389/frobt.2022.799893

[2] Wenshuai Zhao, Jorge Pena Queralta, and Tomi Westerlund. Sim-to-real transfer in deep reinforcement learning for robotics: A survey, 2020

[3] Tomas Hodan, Pavel Haluza, Stepan Obdrzalek, Jiri Matas, Manolis Lourakis, and Xenophon Zabulis. T-less: An rgb-d dataset for 6d pose estimation of texture-less objects. In IEEE Winter Conference on Applications of Computer Vision, 2017

[4] Tomas Hodan, Frank Michel, Eric Brachmann, Wadim Kehl, Anders Glent Buch, Dirk Kraft, Bertram Drost, Joel Vidal, Stefan Ihrke, Xenophon Zabulis, Caner Sahin, Fabian Manhardt, Federico Tombari, Tae-Kyun Kim, Jiri Matas, and Carsten Rother. Bop: Benchmark for 6d object pose estimation. In European Conference on Computer Vision Workshops, 2018

[5] Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. Mvtec ad: A comprehensive real-world dataset for unsupervised anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019

[6] Yang Zou, Jongheon Jeong, Latha Pemula, Dongqing Zhang, and Onkar Dabeer. Spot-the-difference self-supervised pre-training for anomaly detection and segmentation, 2022. Introduces the VisA dataset

[7] Elias N. Malamas, Euripides G. M. Petrakis, Michalis Zervakis, Laurent Petit, and Jean-Didier Legat. A survey on industrial vision systems, applications and tools. Image and Vision Computing, 21(2):171–188, 2003

[8] Jan Diers and Christian Pigorsch. A survey of methods for automated quality control based on images. International Journal of Computer Vision, 131:2553–2581, 2023. doi: 10.1007/s11263-023-01822-w

[9] Milica Babic, Mojtaba A. Farahani, and Thorsten Wuest. Image based quality inspection in smart manufacturing systems: A literature review. Procedia CIRP, 103:262–267, 2021. doi: 10.1016/j.procir.2021.10.042

[10] Tamas Czimmermann, Gastone Ciuti, Matteo Milazzo, Matteo Chiurazzi, Stefano Roccella, Calogero M. Oddo, and Paolo Dario. Visual-based defect detection and classification approaches for industrial applications–a survey. Sensors, 20(5):1459, 2020. doi: 10.3390/s20051459

[11] Hafiz Mughees Ahmad and Afshin Rahimi. Deep learning methods for object detection in smart manufacturing: A survey. Journal of Manufacturing Systems, 64:181–196, 2022. doi: 10.1016/j.jmsy.2022.06.011

[12] Luis Perez, Inigo Rodriguez, Nuria Rodriguez, Ruben Usamentiaga, and Daniel F. Garcia. Robot guidance using machine vision techniques in industrial environments: A comparative review. Sensors, 16(3):335, 2016. doi: 10.3390/s16030335

[13] Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, and Pieter Abbeel. Domain randomization for transferring deep neural networks from simulation to the real world. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 23–30, 2017. doi: 10.1109/IROS.2017.8202133

[14] Maximilian Denninger, Dominik Winkelbauer, Martin Sundermeyer, Wout Boerdijk, Markus Knauer, Klaus H. Strobl, Matthias Humt, and Rudolph Triebel. Blenderproc2: A procedural pipeline for photorealistic rendering. Journal of Open Source Software, 8(82):4901, 2023. doi: 10.21105/joss.04901

[15] Bertram Drost, Markus Ulrich, Nassir Navab, and Slobodan Ilic. Model globally, match locally: Efficient and robust 3d object recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 998–1005, 2010

[16] Stefan Hinterstoisser, Vincent Lepetit, Slobodan Ilic, Stefan Holzer, Gary Bradski, Kurt Konolige, and Nassir Navab. Model based training, detection and pose estimation of texture-less 3d objects in heavily cluttered scenes. In Asian Conference on Computer Vision, pages 548–562, 2012

[17] Yann Labbe, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, and Josef Sivic. Megapose: 6d pose estimation of novel objects via render and compare. In Conference on Robot Learning, 2023

[18] Bowen Wen, Wei Yang, Jan Kautz, and Stan Birchfield. Foundationpose: Unified 6d pose estimation and tracking of novel objects. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

[19] Evin Pinar Ornek, Yann Labbe, Bugra Tekin, Lingni Ma, Cem Keskin, Christian Forster, and Tomas Hodan. Foundpose: Unseen object pose estimation with foundation features, 2024. ECCV 2024

[20] Jiehong Lin, Lihua Liu, Dekun Lu, and Kui Jia. Sam-6d: Segment anything model meets zero-shot 6d object pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

[21] Van Nguyen Nguyen, Thibault Groueix, Mathieu Salzmann, and Vincent Lepetit. Gigapose: Fast and robust novel object pose estimation via one correspondence. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9903–9913, 2024

[22] Karsten Roth, Latha Pemula, Joaquin Zepeda, Bernhard Scholkopf, Thomas Brox, and Peter Gehler. Towards total recall in industrial anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

[23] Kilian Batzner, Lars Heckler, and Rebecca Konig. Efficientad: Accurate visual anomaly detection at millisecond-level latencies. In IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

[24] Vitjan Zavrtanik, Matej Kristan, and Danijel Skocaj. Draem: A discriminatively trained reconstruction embedding for surface anomaly detection. In IEEE/CVF International Conference on Computer Vision, 2021

[25] Zhikang Liu, Yiming Zhou, Yuansheng Xu, and Zilei Wang. Simplenet: A simple network for image anomaly detection and localization. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

[26] Blaz Rolih, Matic Fucka, and Danijel Skocaj. Supersimplenet: Unifying unsupervised and supervised learning for fast and reliable surface defect detection, 2024

[27] Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, and Onkar Dabeer. Winclip: Zero-/few-shot anomaly classification and segmentation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

[28] Maxime Oquab, Timothee Darcet, Theo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. Dinov2: Learning robust visual features without supervision, 2023

[29] Simon Damm, Mike Laszkiewicz, Johannes Lederer, and Asja Fischer. Anomalydino: Boosting patch-based few-shot anomaly detection with dinov2, 2025. WACV 2025

[30] Fereshteh Sadeghi and Sergey Levine. Cad2rl: Real single-image flight without a single real image. In Robotics: Science and Systems, 2017

[31] Jonathan Tremblay, Thang To, Balakumar Sundaralingam, Yu Xiang, Dieter Fox, and Stan Birchfield. Deep object pose estimation for semantic robotic grasping of household objects. In Conference on Robot Learning, pages 306–316, 2018

[32] Jonathan Tremblay, Aayush Prakash, David Acuna, Mark Brophy, Varun Jampani, Cem Anil, Thang To, Eric Cameracci, Shaad Boochoon, and Stan Birchfield. Falling things: A synthetic dataset for 3d object detection and pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018

[33] Wadim Kehl, Fabian Manhardt, Federico Tombari, Slobodan Ilic, and Nassir Navab. Ssd-6d: Making rgb-based 3d detection and 6d pose estimation great again. In IEEE International Conference on Computer Vision, pages 1530–1538, 2017

[34] Yu Xiang, Tanner Schmidt, Venkatraman Narayanan, and Dieter Fox. Posecnn: A convolutional neural network for 6d object pose estimation in cluttered scenes. In Robotics: Science and Systems, 2018

[35] Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martin-Martin, Cewu Lu, Li Fei-Fei, and Silvio Savarese. Densefusion: 6d object pose estimation by iterative dense fusion. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3343–3352, 2019

[36] Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, and Hujun Bao. Pvnet: Pixel-wise voting network for 6dof pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4561–4570, 2019

[37] Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, and Jian Sun. Pvn3d: A deep point-wise 3d keypoints voting network for 6dof pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11632–11641, 2020

[38] Tomas Hodan, Daniel Barath, and Jiri Matas. Epos: Estimating 6d pose of objects with symmetries. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11703–11712, 2020

[39] Rasmus Laurvig Haugaard and Anders Glent Buch. Surfemb: Dense and continuous correspondence distributions for object pose estimation with learnt surface embeddings. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6749–6758, 2022

[40] Yongzhi Su, Mohamed Saleh, Thomas Fetzer, Jason Rambach, Nassir Navab, Benjamin Busam, Didier Stricker, and Federico Tombari. Zebrapose: Coarse to fine surface encoding for 6dof object pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6738–6748, 2022

[41] Van Nguyen Nguyen, Thibault Groueix, Georgy Ponimatkin, Vincent Lepetit, and Tomas Hodan. Cnos: A strong baseline for cad-based novel object segmentation. In IEEE/CVF International Conference on Computer Vision Workshops, 2023

[42] Sungmin Cho, Sungbum Park, and Insoo Oh. Muse: Model-based uncertainty-aware similarity estimation for zero-shot 2d object detection and segmentation, 2025

[43] Van Nguyen Nguyen, Stephen Tyree, Andrew Guo, Mederic Fourmy, Anas Gouda, Taeyeop Lee, Sungphill Moon, Hyeontae Son, Lukas Ranftl, Jonathan Tremblay, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Stan Birchfield, Jiri Matas, Yann Labbe, Martin Sundermeyer, and Tomas Hodan. Bop challenge 2024 on model-based and model-free 6d object pose estimation, 2025

[44] Andrea Caraffa, Davide Boscaini, and Fabio Poiesi. Accurate and efficient zero-shot 6d pose estimation with frozen foundation models, 2025. FreeZeV2

[45] Weijian Deng, Dylan Campbell, Chunyi Sun, Jiahao Zhang, Shubham Kanitkar, Matthew E. Shaffer, and Stephen Gould. Pos3r: 6d pose estimation for unseen objects made easy. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16818–16828, 2025

[46] Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, and Russell Webb. Learning from simulated and unsupervised images through adversarial training. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2107–2116, 2017

[47] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Francois Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016

[48] Eric Tzeng, Judy Hoffman, Kate Saenko, and Trevor Darrell. Adversarial discriminative domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition, pages 7167–7176, 2017

[49] Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei A. Efros, and Trevor Darrell. Cycada: Cycle-consistent adversarial domain adaptation. In International Conference on Machine Learning, pages 1989–1998, 2018

[50] Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, and Pieter Abbeel. Sim-to-real transfer of robotic control with dynamics randomization. In IEEE International Conference on Robotics and Automation, pages 3803–3810, 2018

[51] Stephen James, Paul Wohlhart, Mrinal Kalakrishnan, Dmitry Kalashnikov, Alex Irpan, Julian Ibarz, Sergey Levine, Raia Hadsell, and Konstantinos Bousmalis. Sim-to-real via sim-to-sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12627–12637, 2019

[52] Martin Sundermeyer, Tomas Hodan, Yann Labbe, Gu Wang, Eric Brachmann, Bertram Drost, Carsten Rother, and Jiri Matas. Bop challenge 2022 on detection, segmentation and pose estimation of specific rigid objects. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023

[53] Zaixing He, Wuxi Feng, Xinyue Zhao, and Yongfeng Lv. 6d pose estimation of objects: Recent technologies and challenges. Applied Sciences, 11(1):228, 2021. doi: 10.3390/app11010228

[54] Jian Guan, Yingming Hao, Qingxiao Wu, Sicong Li, and Yingjian Fang. A survey of 6dof object pose estimation methods for different application scenarios. Sensors, 24(4):1076, 2024. doi: 10.3390/s24041076

[55] Sergey Zakharov, Ivan Shugurov, and Slobodan Ilic. Dpod: 6d pose object detector and refiner. In IEEE/CVF International Conference on Computer Vision, pages 1941–1950, 2019

[56] Zhigang Li, Gu Wang, and Xiangyang Ji. Cdpn: Coordinates-based disentangled pose network for real-time rgb-based 6-dof object pose estimation. In IEEE/CVF International Conference on Computer Vision, pages 7678–7687, 2019

[57] Yann Labbe, Justin Carpentier, Mathieu Aubry, and Josef Sivic. Cosypose: Consistent multi-view multi-object 6d pose estimation. In European Conference on Computer Vision, pages 574–591, 2020
[58]

Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation

Gu Wang, Fabian Manhardt, Federico Tombari, and Xiangyang Ji. Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16611–16621, 2021

2021
[59]

Berg, Wan-Yen Lo, Piotr Dollar, and Ross Girshick

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollar, and Ross Girshick. Segment anything. InIEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023

2023
[60]

Anomaly Detection: A Survey,

Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey.ACM Computing Surveys, 41(3):1–58, 2009. doi: 10.1145/1541880.1541882

work page doi:10.1145/1541880.1541882 2009
[61]

Deep learning for anomaly detection: A review.ACM Computing Surveys, 54(2):1–38, 2021

Guansong Pang, Chunhua Shen, Longbing Cao, and Anton van den Hengel. Deep learning for anomaly detection: A review.ACM Computing Surveys, 54(2):1–38, 2021. doi: 10.1145/ 3439950

2021
[62]

IEEE 109, 5 (May 2021), 756–795

Lukas Ruff, Jacob R. Kauffmann, Robert A. Vandermeulen, Gregoire Montavon, Wojciech Samek, Marius Kloft, Thomas G. Dietterich, and Klaus-Robert Muller. A unifying review of deep and shallow anomaly detection.Proceedings of the IEEE, 109(5):756–795, 2021. doi: 10.1109/JPROC.2021.3052449

work page doi:10.1109/jproc.2021.3052449 2021
[63]

A survey of surface defect detection of industrial products based on a small number of labeled data, 2022

Qifan Jin and Li Chen. A survey of surface defect detection of industrial products based on a small number of labeled data, 2022

2022
[64]

Deep learning for unsupervised anomaly localization in industrial images: A survey.IEEE Transactions on Instrumentation and Measurement, 71:1–21, 2022

Xian Tao, Xinyi Gong, Xin Zhang, Shaohua Yan, and Chandranath Adak. Deep learning for unsupervised anomaly localization in industrial images: A survey.IEEE Transactions on Instrumentation and Measurement, 71:1–21, 2022. doi: 10.1109/TIM.2022.3196436. 42

work page doi:10.1109/tim.2022.3196436 2022
[65]

Deep industrial image anomaly detection: A survey.Machine Intelligence Research, 21: 104–135, 2024

Jiaqi Liu, Guoyang Xie, Jinbao Wang, Shangnian Li, Chengjie Wang, Feng Zheng, and Yaochu Jin. Deep industrial image anomaly detection: A survey.Machine Intelligence Research, 21: 104–135, 2024. doi: 10.1007/s11633-023-1459-z

work page doi:10.1007/s11633-023-1459-z 2024
[66]

Deep one-class classification

Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Muller, and Marius Kloft. Deep one-class classification. In International Conference on Machine Learning, pages 4393–4402, 2018

2018
[67]

Sub-image anomaly detection with deep pyramid correspon- dences, 2020

Niv Cohen and Yedid Hoshen. Sub-image anomaly detection with deep pyramid correspon- dences, 2020

2020
[68]

Padim: A patch distribution modeling framework for anomaly detection and localization

Thomas Defard, Aleksandr Setkov, Angelique Loesch, and Romaric Audigier. Padim: A patch distribution modeling framework for anomaly detection and localization. InInternational Conference on Pattern Recognition Workshops, pages 475–489, 2021

2021
[69]

Patch svdd: Patch-level svdd for anomaly detection and segmentation

Jihun Yi and Sungroh Yoon. Patch svdd: Patch-level svdd for anomaly detection and segmentation. InAsian Conference on Computer Vision, pages 375–390, 2021

2021
[70]

Same same but differnet: Semi- supervised defect detection with normalizing flows

Marco Rudolph, Bastian Wandt, and Bodo Rosenhahn. Same same but differnet: Semi- supervised defect detection with normalizing flows. InIEEE/CVF Winter Conference on Applications of Computer Vision, pages 1907–1916, 2021

1907
[71]

Fastflow: Unsupervised anomaly detection and localization via 2d normalizing flows, 2021

Jiawei Yu, Ye Zheng, Xiang Wang, Wei Li, Yushuang Wu, Rui Zhao, and Liwei Wu. Fastflow: Unsupervised anomaly detection and localization via 2d normalizing flows, 2021

2021
[72]

Cflow-ad: Real-time unsupervised anomaly detection with localization via conditional normalizing flows

Denis Gudovskiy, Shun Ishizaka, and Kazuki Kozuka. Cflow-ad: Real-time unsupervised anomaly detection with localization via conditional normalizing flows. InIEEE/CVF Winter Conference on Applications of Computer Vision, pages 98–107, 2022

2022
[73]

Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings

Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4183–4192, 2020

2020
[74]

Student-teacher feature pyramid matching for anomaly detection

Guodong Wang, Shumin Han, Errui Ding, and Di Huang. Student-teacher feature pyramid matching for anomaly detection. InBritish Machine Vision Conference, 2021

2021
[75]

Rohban, and Hamid R

Mohammadreza Salehi, Niousha Sadjadi, Soroosh Baselizadeh, Mohammad H. Rohban, and Hamid R. Rabiee. Multiresolution knowledge distillation for anomaly detection. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14902–14912, 2021

2021
[76]

Anomaly detection via reverse distillation from one-class embedding

Hanqiu Deng and Xingyu Li. Anomaly detection via reverse distillation from one-class embedding. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9737–9746, 2022

2022
[77]

Cutpaste: Self-supervised learning for anomaly detection and localization

Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, and Tomas Pfister. Cutpaste: Self-supervised learning for anomaly detection and localization. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9664–9674, 2021

2021
[78]

Schlueter, Jeremy Tan, Benjamin Hou, and Bernhard Kainz

Hannah M. Schlueter, Jeremy Tan, Benjamin Hou, and Bernhard Kainz. Natural synthetic anomalies for self-supervised anomaly detection and localization, 2021

2021
[79]

Destseg: Segmenta- tion guided denoising student-teacher for anomaly detection

Xuan Zhang, Shiyu Li, Xi Li, Ping Huang, Jiulong Shan, and Ting Chen. Destseg: Segmenta- tion guided denoising student-teacher for anomaly detection. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3914–3923, 2023. 43

2023
[80]

Learning transferable visual models from natural language supervision

Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763, 2021

2021


[46] [46]

Learning from simulated and unsupervised images through adversarial training

Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, and Russell Webb. Learning from simulated and unsupervised images through adversarial training. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2107–2116, 2017

2017

[47] [47]

Domain-adversarial training of neural networks.Journal of Machine Learning Research, 17(59):1–35, 2016

Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Francois Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks.Journal of Machine Learning Research, 17(59):1–35, 2016

2016

[48] [48]

Adversarial discriminative domain adaptation

Eric Tzeng, Judy Hoffman, Kate Saenko, and Trevor Darrell. Adversarial discriminative domain adaptation. InIEEE Conference on Computer Vision and Pattern Recognition, pages 7167–7176, 2017

2017

[49] [49]

Efros, and Trevor Darrell

Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei A. Efros, and Trevor Darrell. Cycada: Cycle-consistent adversarial domain adaptation. In International Conference on Machine Learning, pages 1989–1998, 2018

1989

[50] [50]

Sim-to-real transfer of robotic control with dynamics randomization

Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, and Pieter Abbeel. Sim-to-real transfer of robotic control with dynamics randomization. InIEEE International Conference on Robotics and Automation, pages 3803–3810, 2018

2018

[51] [51]

Sim-to-real via sim-to- sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks

Stephen James, Paul Wohlhart, Mrinal Kalakrishnan, Dmitry Kalashnikov, Alex Irpan, Julian Ibarz, Sergey Levine, Raia Hadsell, and Konstantinos Bousmalis. Sim-to-real via sim-to- sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12627–12637, 2019. 41

2019

[52] [52]

Bop challenge 2022 on detection, segmentation and pose estimation of specific rigid objects

Martin Sundermeyer, Tomas Hodan, Yann Labbe, Gu Wang, Eric Brachmann, Bertram Drost, Carsten Rother, and Jiri Matas. Bop challenge 2022 on detection, segmentation and pose estimation of specific rigid objects. InIEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023

2022

[53] [53]

6d pose estimation of objects: Recent technologies and challenges.Applied Sciences, 11(1):228, 2021

Zaixing He, Wuxi Feng, Xinyue Zhao, and Yongfeng Lv. 6d pose estimation of objects: Recent technologies and challenges.Applied Sciences, 11(1):228, 2021. doi: 10.3390/app11010228

work page doi:10.3390/app11010228 2021

[54] [54]

A survey of 6dof object pose estimation methods for different application scenarios.Sensors, 24(4):1076, 2024

Jian Guan, Yingming Hao, Qingxiao Wu, Sicong Li, and Yingjian Fang. A survey of 6dof object pose estimation methods for different application scenarios.Sensors, 24(4):1076, 2024. doi: 10.3390/s24041076

work page doi:10.3390/s24041076 2024

[55] [55]

Dpod: 6d pose object detector and refiner

Sergey Zakharov, Ivan Shugurov, and Slobodan Ilic. Dpod: 6d pose object detector and refiner. InIEEE/CVF International Conference on Computer Vision, pages 1941–1950, 2019

1941

[56] [56]

Cdpn: Coordinates-based disentangled pose network for real-time rgb-based 6-dof object pose estimation

Zhigang Li, Gu Wang, and Xiangyang Ji. Cdpn: Coordinates-based disentangled pose network for real-time rgb-based 6-dof object pose estimation. InIEEE/CVF International Conference on Computer Vision, pages 7678–7687, 2019

2019

[57] [57]

Cosypose: Consistent multi-view multi-object 6d pose estimation

Yann Labbe, Justin Carpentier, Mathieu Aubry, and Josef Sivic. Cosypose: Consistent multi-view multi-object 6d pose estimation. InEuropean Conference on Computer Vision, pages 574–591, 2020

2020

[58] [58]

Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation

Gu Wang, Fabian Manhardt, Federico Tombari, and Xiangyang Ji. Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation. InIEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16611–16621, 2021

2021

[59] [59]

Berg, Wan-Yen Lo, Piotr Dollar, and Ross Girshick

Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollar, and Ross Girshick. Segment anything. InIEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023

2023

[60] [60]

Anomaly Detection: A Survey,

Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey.ACM Computing Surveys, 41(3):1–58, 2009. doi: 10.1145/1541880.1541882

work page doi:10.1145/1541880.1541882 2009

[61] [61]

Deep learning for anomaly detection: A review.ACM Computing Surveys, 54(2):1–38, 2021

Guansong Pang, Chunhua Shen, Longbing Cao, and Anton van den Hengel. Deep learning for anomaly detection: A review.ACM Computing Surveys, 54(2):1–38, 2021. doi: 10.1145/ 3439950

2021

[62] [62]

IEEE 109, 5 (May 2021), 756–795

Lukas Ruff, Jacob R. Kauffmann, Robert A. Vandermeulen, Gregoire Montavon, Wojciech Samek, Marius Kloft, Thomas G. Dietterich, and Klaus-Robert Muller. A unifying review of deep and shallow anomaly detection.Proceedings of the IEEE, 109(5):756–795, 2021. doi: 10.1109/JPROC.2021.3052449

work page doi:10.1109/jproc.2021.3052449 2021

[63] [63]

A survey of surface defect detection of industrial products based on a small number of labeled data, 2022

Qifan Jin and Li Chen. A survey of surface defect detection of industrial products based on a small number of labeled data, 2022

2022

[64] [64]

Deep learning for unsupervised anomaly localization in industrial images: A survey.IEEE Transactions on Instrumentation and Measurement, 71:1–21, 2022

Xian Tao, Xinyi Gong, Xin Zhang, Shaohua Yan, and Chandranath Adak. Deep learning for unsupervised anomaly localization in industrial images: A survey.IEEE Transactions on Instrumentation and Measurement, 71:1–21, 2022. doi: 10.1109/TIM.2022.3196436. 42

work page doi:10.1109/tim.2022.3196436 2022

[65] [65]

Deep industrial image anomaly detection: A survey.Machine Intelligence Research, 21: 104–135, 2024

Jiaqi Liu, Guoyang Xie, Jinbao Wang, Shangnian Li, Chengjie Wang, Feng Zheng, and Yaochu Jin. Deep industrial image anomaly detection: A survey.Machine Intelligence Research, 21: 104–135, 2024. doi: 10.1007/s11633-023-1459-z

work page doi:10.1007/s11633-023-1459-z 2024

[66] [66]

Deep one-class classification

Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Muller, and Marius Kloft. Deep one-class classification. In International Conference on Machine Learning, pages 4393–4402, 2018

2018

[67] [67]

Sub-image anomaly detection with deep pyramid correspon- dences, 2020

Niv Cohen and Yedid Hoshen. Sub-image anomaly detection with deep pyramid correspon- dences, 2020

2020

[68] [68]

Padim: A patch distribution modeling framework for anomaly detection and localization

Thomas Defard, Aleksandr Setkov, Angelique Loesch, and Romaric Audigier. Padim: A patch distribution modeling framework for anomaly detection and localization. InInternational Conference on Pattern Recognition Workshops, pages 475–489, 2021

2021

[69] [69]

Patch svdd: Patch-level svdd for anomaly detection and segmentation

Jihun Yi and Sungroh Yoon. Patch svdd: Patch-level svdd for anomaly detection and segmentation. InAsian Conference on Computer Vision, pages 375–390, 2021

2021

[70] [70]

Same same but differnet: Semi- supervised defect detection with normalizing flows

Marco Rudolph, Bastian Wandt, and Bodo Rosenhahn. Same same but differnet: Semi- supervised defect detection with normalizing flows. InIEEE/CVF Winter Conference on Applications of Computer Vision, pages 1907–1916, 2021

1907

[71] Jiawei Yu, Ye Zheng, Xiang Wang, Wei Li, Yushuang Wu, Rui Zhao, and Liwei Wu. FastFlow: Unsupervised anomaly detection and localization via 2D normalizing flows, 2021

[72] Denis Gudovskiy, Shun Ishizaka, and Kazuki Kozuka. CFLOW-AD: Real-time unsupervised anomaly detection with localization via conditional normalizing flows. In IEEE/CVF Winter Conference on Applications of Computer Vision, pages 98–107, 2022

[73] Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4183–4192, 2020

[74] Guodong Wang, Shumin Han, Errui Ding, and Di Huang. Student-teacher feature pyramid matching for anomaly detection. In British Machine Vision Conference, 2021

[75] Mohammadreza Salehi, Niousha Sadjadi, Soroosh Baselizadeh, Mohammad H. Rohban, and Hamid R. Rabiee. Multiresolution knowledge distillation for anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14902–14912, 2021

[76] Hanqiu Deng and Xingyu Li. Anomaly detection via reverse distillation from one-class embedding. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9737–9746, 2022

[77] Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, and Tomas Pfister. CutPaste: Self-supervised learning for anomaly detection and localization. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9664–9674, 2021

[78] Hannah M. Schlueter, Jeremy Tan, Benjamin Hou, and Bernhard Kainz. Natural synthetic anomalies for self-supervised anomaly detection and localization, 2021

[79] Xuan Zhang, Shiyu Li, Xi Li, Ping Huang, Jiulong Shan, and Ting Chen. DeSTSeg: Segmentation guided denoising student-teacher for anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3914–3923, 2023

[80] Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763, 2021