
arxiv: 2605.30581 · v2 · pith:VCXY7RQNnew · submitted 2026-05-28 · 💻 cs.CV · cs.AI · cs.RO

Prior Availability in Industrial Visual Sim-to-Real: A Review of CAD-Guided and CAD-Unavailable Regimes

Pith reviewed 2026-06-29 07:40 UTC · model grok-4.3

classification 💻 cs.CV · cs.AI · cs.RO
keywords sim-to-real transfer · industrial inspection · CAD models · domain gap · anomaly detection · pose estimation · visual inspection · prior availability

The pith

Industrial visual sim-to-real transfer should be organized by the availability of CAD and other priors rather than by task-specific leaderboards.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability instead of treating it as a uniform transfer from synthetic to real images. It distinguishes CAD-available settings, where explicit object geometry supports rendering, calibration, pose estimation, segmentation, and test-time verification, from CAD-unavailable settings that rely on normal-reference appearance, feature distributions, teacher-student residuals, or foundation features, plus boundary-prior cases that retain only partial geometric information. This taxonomy connects the pose-estimation and detection literature with anomaly and surface-inspection work, two bodies of research that are usually reviewed separately. Empirical anchors on T-LESS/BOP, MVTec AD, and VisA show that CAD render count alone does not close the transfer gap; source-distribution design, detector capacity, and small real calibration can matter more. The review concludes that deployment decisions should be grounded in the specific priors available rather than a single cross-task leaderboard.

Core claim

Organizing the domain gap by prior availability—CAD-available regimes where geometry enables multiple operations and verification channels, CAD-unavailable regimes where geometry is replaced by appearance or feature-based normality, and boundary-prior regimes that preserve only part of the CAD role—explains differences in how transfer is achieved and verified across industrial visual tasks.

What carries the argument

The taxonomy of CAD-available, CAD-unavailable, and boundary-prior regimes that organizes the domain-gap problem according to what geometric or appearance evidence is accessible before deployment.
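
As a rough illustration of how the three regimes partition a deployment by the evidence at hand, the taxonomy can be sketched as a small lookup. This is only a sketch: the `Priors` field names and the precedence rule (exact CAD outranks partial geometry, which outranks appearance-only evidence) are assumptions made here, not definitions taken from the paper.

```python
from dataclasses import dataclass
from enum import Enum


class Regime(Enum):
    CAD_AVAILABLE = "CAD-available"
    BOUNDARY_PRIOR = "boundary-prior"
    CAD_UNAVAILABLE = "CAD-unavailable"


@dataclass(frozen=True)
class Priors:
    """Evidence accessible before deployment (hypothetical field names)."""
    exact_cad: bool = False          # explicit object geometry
    approximate_model: bool = False  # approximate model or template
    reference_views: bool = False    # reference views or semantic correspondences
    normal_references: bool = False  # defect-free reference images


def classify(p: Priors) -> Regime:
    """Map available priors to a regime; the precedence order is an assumption."""
    if p.exact_cad:
        return Regime.CAD_AVAILABLE
    if p.approximate_model or p.reference_views:
        # Only part of the CAD role is preserved.
        return Regime.BOUNDARY_PRIOR
    # Geometry is replaced by appearance or feature-based normality.
    return Regime.CAD_UNAVAILABLE


print(classify(Priors(reference_views=True, normal_references=True)).value)
```

The point of the sketch is that the regime is decided by what is known before deployment, not by the downstream task label.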

Load-bearing premise

The distinctions between CAD-available, CAD-unavailable, and boundary-prior settings capture the essential differences that affect transfer performance in industrial settings.

What would settle it

An experiment that controls for render quality, data volume, and detector capacity and still finds transfer performance correlating more strongly with task category than with prior availability would undermine the proposed taxonomy.
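
One way such a comparison could be scored, assuming a fully crossed design in which every task is run under every prior setting, is to compare how much of the variance in a transfer score each factor explains. The sketch below uses the correlation ratio (eta squared) as that measure; the choice of statistic, the function name, and the toy numbers are all assumptions made for illustration and are not results or methods from the paper.

```python
from collections import defaultdict
from statistics import fmean


def eta_squared(scores, labels):
    """Fraction of score variance explained by a categorical grouping (0 to 1)."""
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equally long")
    grand = fmean(scores)
    total = sum((s - grand) ** 2 for s in scores)
    if total == 0:
        return 0.0  # no variance to explain
    groups = defaultdict(list)
    for s, g in zip(scores, labels):
        groups[g].append(s)
    between = sum(len(v) * (fmean(v) - grand) ** 2 for v in groups.values())
    return between / total


# Placeholder runs, NOT results from the paper: (score, task, prior).
runs = [
    (0.9, "detection", "cad"),
    (0.9, "anomaly", "cad"),
    (0.5, "detection", "none"),
    (0.5, "anomaly", "none"),
]
scores = [r[0] for r in runs]
by_task = eta_squared(scores, [r[1] for r in runs])
by_prior = eta_squared(scores, [r[2] for r in runs])
print(f"explained by task: {by_task:.2f}, by prior: {by_prior:.2f}")
```

Under this reading, the taxonomy would be undermined if the task grouping consistently explained more variance than the prior grouping once render quality, data volume, and detector capacity were held fixed.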

Figures

Figures reproduced from arXiv: 2605.30581 by Chenxi Tao, Seung-Kyum Choi.

Figure 1: Prior-availability view of industrial visual sim-to-real. Source evidence may include …
Figure 2: Mechanism view of the prior-availability distinction. CAD-available settings can use explicit …
Figure 3: Dataset regimes used to anchor Section 6. T-LESS/BOP […
Figure 4: CAD-as-renderer transfer on real T-LESS images […
Figure 5: Precision-recall diagnostics for representative CAD-as-renderer detector runs on T…
Figure 6: CAD-at-test-time geometry on T-LESS/BOP […
Figure 7: CAD-unavailable anomaly-detection anchors on MVTec AD […
Figure 8: Category-level CAD-unavailable diagnostics for MVTec AD […
Figure 9: Selected-category normal-reference budget curves for the CAD-unavailable branch. Patch …
Figure 10: YOLOv8 [103] CAD-as-renderer training diagnostics for B0–B3 on T-LESS/BOP [3, 4]. Each row corresponds to one detector configuration from …
Figure 11: YOLOv8 [103] CAD-as-renderer training diagnostics for B4–B6 on T-LESS/BOP [3, 4]. The same real-validation target is used across runs. B4 adds real background negatives to the randomized synthetic source, B5 increases detector capacity, and B6 combines the larger detector with small real calibration. The curves support the main-text interpretation that source design, model capacity, and limited real calib…
Figure 12: Representative real-validation prediction grids for the seven CAD-as-renderer detector …
Figure 13: Normalized confusion matrices for the YOLOv8 […
Figure 14: Training label distributions for the CAD-as-renderer detector runs on T-LESS/BOP […
Figure 15: Additional CAD-at-test-time qualitative diagnostics. The contact sheet shows Mega…
Figure 16: Additional CAD-unavailable qualitative diagnostics for the synthetic-anomaly probe.
Original abstract

Industrial visual sim-to-real is often described as transferring from synthetic images to real images, but industrial deployment usually involves a broader mismatch between available evidence and required decisions. A system may be built from CAD renderings, simulated RGB-D observations, normal reference images, synthetic defects, pretrained feature spaces, or language prompts, yet deployed under different sensors, lighting, materials, fixtures, calibration, production variation, and rare defect modes. This review reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability. We distinguish CAD-available settings, where explicit object geometry can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification; CAD-unavailable settings, where geometry is replaced by normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomaly assumptions, foundation features, or vision-language priors; and boundary-prior settings, where approximate models, templates, reference views, or semantic correspondences preserve only part of the CAD role. This framing connects CAD-based detection and 6D pose-estimation literature with industrial anomaly and surface-inspection literature that is usually reviewed separately. To make the taxonomy concrete, we use empirical anchors on T-LESS/BOP, MVTec AD, and VisA. The anchors show that CAD render count alone does not close transfer; source-distribution design, detector capacity, and small real calibration can matter more. They also show that CAD at test time creates a distinct verification channel through mask, pose, and depth consistency, whereas CAD-unavailable inspection relies on calibrated normality and feature deviation. The review therefore argues against a single cross-task leaderboard and instead asks what prior grounds the deployment decision.
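
The two verification channels contrasted in the abstract can be sketched side by side. The first function checks depth consistency between a CAD rendering at an estimated pose and the observed depth map; the second scores deviation from normality as the distance to the nearest normal-reference feature. Both are simplified stand-ins written for illustration: the function names, the 5 mm tolerance, and the convention that a depth of 0 means "no measurement" are assumptions, and neither function reproduces a specific method from the paper or from the cited literature.

```python
import math


def depth_consistency(rendered, observed, tol=0.005):
    """Fraction of CAD-rendered pixels whose observed depth agrees within tol.

    rendered and observed are equally sized 2D lists of depths in metres.
    A rendered depth of 0 means the CAD model does not cover that pixel;
    an observed depth of 0 means the sensor returned no measurement.
    """
    hits = total = 0
    for r_row, o_row in zip(rendered, observed):
        for r, o in zip(r_row, o_row):
            if r <= 0:  # outside the rendered object mask
                continue
            total += 1
            if o > 0 and abs(r - o) <= tol:
                hits += 1
    return hits / total if total else 0.0


def normality_deviation(feature, normal_bank):
    """Euclidean distance from a test feature to its nearest normal reference."""
    return min(math.dist(feature, ref) for ref in normal_bank)


# Toy inputs for illustration only.
rendered = [[0.0, 0.500], [0.500, 0.500]]
observed = [[0.300, 0.501], [0.500, 0.600]]
geometry_score = depth_consistency(rendered, observed)
feature_score = normality_deviation([0.0, 0.0], [[3.0, 4.0], [1.0, 0.0]])
print(geometry_score, feature_score)
```

The first score is only computable when geometry is available at test time; the second needs nothing but a bank of normal references, which is why the two regimes cannot share one verification protocol.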

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

1 major / 2 minor

Summary. The paper is a literature review that reframes industrial visual sim-to-real as a domain-gap problem organized by prior availability. It distinguishes CAD-available regimes (explicit geometry supporting rendering, calibration, pose estimation, and geometric verification), CAD-unavailable regimes (relying on normal-reference appearance, feature distributions, teacher-student residuals, synthetic anomalies, foundation features, or vision-language priors), and boundary-prior regimes (approximate models or partial correspondences). The review connects 6D pose estimation and anomaly detection literatures, uses empirical anchors from T-LESS/BOP, MVTec AD, and VisA to illustrate that CAD render count alone does not close transfer gaps while source-distribution design, detector capacity, and small real calibration can matter more, and argues against a single cross-task leaderboard in favor of deployment decisions grounded in available priors.

Significance. If the taxonomy holds, the review supplies a useful organizational framework that bridges two subfields typically reviewed in isolation and shifts evaluation emphasis from generic leaderboards toward prior-specific verification channels (geometric consistency vs. calibrated normality). The observation that render count is not decisive and that small real calibration can be more impactful offers practical guidance for industrial deployment.

major comments (1)
  1. [Empirical anchors discussion (referenced in Abstract)] Empirical anchors (T-LESS/BOP for 6D pose vs. MVTec AD/VisA for anomaly detection): these datasets belong to disjoint problem classes with non-overlapping metrics, object sets, and protocols. The manuscript states that the anchors 'show that CAD render count alone does not close transfer' and that 'prior type drives performance,' yet no within-task controlled comparison is described that holds detector, training regime, and test distribution fixed while varying only prior availability. Consequently the observed differences could be attributable to task semantics rather than the taxonomy axis, weakening support for the central claim that the CAD-available / CAD-unavailable distinction precludes a single cross-task leaderboard.
minor comments (2)
  1. [Taxonomy definition] The boundary-prior category is introduced but its operational distinction from CAD-unavailable settings is illustrated only briefly; additional concrete examples (e.g., specific reference-view or template methods) would sharpen the taxonomy.
  2. [Introduction / Related work] The review cites standard datasets but does not report the search strategy, inclusion criteria, or total number of papers surveyed; adding a methods paragraph would improve reproducibility of the literature mapping.

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for the constructive comments. We address the major comment on the empirical anchors below.

Point-by-point responses
  1. Referee: Empirical anchors (T-LESS/BOP for 6D pose vs. MVTec AD/VisA for anomaly detection): these datasets belong to disjoint problem classes with non-overlapping metrics, object sets, and protocols. The manuscript states that the anchors 'show that CAD render count alone does not close transfer' and that 'prior type drives performance,' yet no within-task controlled comparison is described that holds detector, training regime, and test distribution fixed while varying only prior availability. Consequently the observed differences could be attributable to task semantics rather than the taxonomy axis, weakening support for the central claim that the CAD-available / CAD-unavailable distinction precludes a single cross-task leaderboard.

    Authors: We acknowledge that a controlled experiment holding all factors fixed except prior availability would strengthen the causal interpretation. The manuscript uses the anchors to illustrate that in practice, across established benchmarks, factors like source distribution design and small real calibration appear more impactful than render count alone, and that the verification mechanisms differ fundamentally (geometric vs. normality-based). The argument against a single leaderboard rests primarily on these differing verification channels rather than on cross-task performance comparisons. We will add a clarification in the revised manuscript that the anchors are meant as illustrative examples from the respective literatures and do not constitute a controlled ablation across tasks.
    revision: partial

Circularity Check

0 steps flagged

No circularity: literature review with external anchors

Full rationale

The paper is a literature review proposing a taxonomy (CAD-available / CAD-unavailable / boundary-prior) to organize existing sim-to-real work. It contains no equations, fitted parameters, predictions, or derivations. The central claim rests on cited empirical anchors (T-LESS/BOP, MVTec AD, VisA) drawn from the external literature rather than self-generated results or self-citation chains. No step reduces by construction to the paper's own inputs; the framework is an organizational lens applied to independent prior work.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The paper is a review and introduces no new free parameters, axioms beyond standard domain knowledge, or invented entities. The taxonomy itself is the contribution.

axioms (1)
  • domain assumption: CAD models provide explicit object geometry that can support rendering, calibration, pose estimation, segmentation, and test-time geometric verification.
    Stated in the abstract as the basis for CAD-available settings.

pith-pipeline@v0.9.1-grok · 5840 in / 1111 out tokens · 26943 ms · 2026-06-29T07:40:11.491920+00:00 · methodology


Reference graph

Works this paper leans on

103 extracted references · 17 canonical work pages

  1. [1] Fabio Muratore, Christian Eilers, Michael Gienger, and Jan Peters. Robot learning from randomized simulations: A review. Frontiers in Robotics and AI, 9:799893, 2022. doi: 10.3389/frobt.2022.799893

  2. [2] Wenshuai Zhao, Jorge Pena Queralta, and Tomi Westerlund. Sim-to-real transfer in deep reinforcement learning for robotics: A survey, 2020

  3. [3] Tomas Hodan, Pavel Haluza, Stepan Obdrzalek, Jiri Matas, Manolis Lourakis, and Xenophon Zabulis. T-LESS: An RGB-D dataset for 6D pose estimation of texture-less objects. In IEEE Winter Conference on Applications of Computer Vision, 2017

  4. [4] Tomas Hodan, Frank Michel, Eric Brachmann, Wadim Kehl, Anders Glent Buch, Dirk Kraft, Bertram Drost, Joel Vidal, Stefan Ihrke, Xenophon Zabulis, Caner Sahin, Fabian Manhardt, Federico Tombari, Tae-Kyun Kim, Jiri Matas, and Carsten Rother. BOP: Benchmark for 6D object pose estimation. In European Conference on Computer Vision Workshops, 2018

  5. [5] Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. MVTec AD: A comprehensive real-world dataset for unsupervised anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019

  6. [6] Yang Zou, Jongheon Jeong, Latha Pemula, Dongqing Zhang, and Onkar Dabeer. Spot-the-difference self-supervised pre-training for anomaly detection and segmentation, 2022. Introduces the VisA dataset

  7. [7] Elias N. Malamas, Euripides G. M. Petrakis, Michalis Zervakis, Laurent Petit, and Jean-Didier Legat. A survey on industrial vision systems, applications and tools. Image and Vision Computing, 21(2):171–188, 2003

  8. [8] Jan Diers and Christian Pigorsch. A survey of methods for automated quality control based on images. International Journal of Computer Vision, 131:2553–2581, 2023. doi: 10.1007/s11263-023-01822-w

  9. [9] Milica Babic, Mojtaba A. Farahani, and Thorsten Wuest. Image based quality inspection in smart manufacturing systems: A literature review. Procedia CIRP, 103:262–267, 2021. doi: 10.1016/j.procir.2021.10.042

  10. [10] Tamas Czimmermann, Gastone Ciuti, Matteo Milazzo, Matteo Chiurazzi, Stefano Roccella, Calogero M. Oddo, and Paolo Dario. Visual-based defect detection and classification approaches for industrial applications–a survey. Sensors, 20(5):1459, 2020. doi: 10.3390/s20051459

  11. [11] Hafiz Mughees Ahmad and Afshin Rahimi. Deep learning methods for object detection in smart manufacturing: A survey. Journal of Manufacturing Systems, 64:181–196, 2022. doi: 10.1016/j.jmsy.2022.06.011

  12. [12] Luis Perez, Inigo Rodriguez, Nuria Rodriguez, Ruben Usamentiaga, and Daniel F. Garcia. Robot guidance using machine vision techniques in industrial environments: A comparative review. Sensors, 16(3):335, 2016. doi: 10.3390/s16030335

  13. [13] Josh Tobin, Rachel Fong, Alex Ray, Jonas Schneider, Wojciech Zaremba, and Pieter Abbeel. Domain randomization for transferring deep neural networks from simulation to the real world. In IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 23–30, 2017. doi: 10.1109/IROS.2017.8202133

  14. [14] Maximilian Denninger, Dominik Winkelbauer, Martin Sundermeyer, Wout Boerdijk, Markus Knauer, Klaus H. Strobl, Matthias Humt, and Rudolph Triebel. BlenderProc2: A procedural pipeline for photorealistic rendering. Journal of Open Source Software, 8(82):4901, 2023. doi: 10.21105/joss.04901

  15. [15] Bertram Drost, Markus Ulrich, Nassir Navab, and Slobodan Ilic. Model globally, match locally: Efficient and robust 3D object recognition. In IEEE Conference on Computer Vision and Pattern Recognition, pages 998–1005, 2010

  16. [16] Stefan Hinterstoisser, Vincent Lepetit, Slobodan Ilic, Stefan Holzer, Gary Bradski, Kurt Konolige, and Nassir Navab. Model based training, detection and pose estimation of texture-less 3D objects in heavily cluttered scenes. In Asian Conference on Computer Vision, pages 548–562, 2012

  17. [17] Yann Labbe, Lucas Manuelli, Arsalan Mousavian, Stephen Tyree, Stan Birchfield, Jonathan Tremblay, Justin Carpentier, Mathieu Aubry, Dieter Fox, and Josef Sivic. MegaPose: 6D pose estimation of novel objects via render and compare. In Conference on Robot Learning, 2023

  18. [18] Bowen Wen, Wei Yang, Jan Kautz, and Stan Birchfield. FoundationPose: Unified 6D pose estimation and tracking of novel objects. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

  19. [19] Evin Pinar Ornek, Yann Labbe, Bugra Tekin, Lingni Ma, Cem Keskin, Christian Forster, and Tomas Hodan. FoundPose: Unseen object pose estimation with foundation features, 2024. ECCV 2024

  20. [20] Jiehong Lin, Lihua Liu, Dekun Lu, and Kui Jia. SAM-6D: Segment anything model meets zero-shot 6D object pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024

  21. [21] Van Nguyen Nguyen, Thibault Groueix, Mathieu Salzmann, and Vincent Lepetit. GigaPose: Fast and robust novel object pose estimation via one correspondence. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9903–9913, 2024

  22. [22] Karsten Roth, Latha Pemula, Joaquin Zepeda, Bernhard Scholkopf, Thomas Brox, and Peter Gehler. Towards total recall in industrial anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2022

  23. [23] Kilian Batzner, Lars Heckler, and Rebecca Konig. EfficientAD: Accurate visual anomaly detection at millisecond-level latencies. In IEEE/CVF Winter Conference on Applications of Computer Vision, 2024

  24. [24] Vitjan Zavrtanik, Matej Kristan, and Danijel Skocaj. DRAEM: A discriminatively trained reconstruction embedding for surface anomaly detection. In IEEE/CVF International Conference on Computer Vision, 2021

  25. [25] Zhikang Liu, Yiming Zhou, Yuansheng Xu, and Zilei Wang. SimpleNet: A simple network for image anomaly detection and localization. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

  26. [26] Blaz Rolih, Matic Fucka, and Danijel Skocaj. SuperSimpleNet: Unifying unsupervised and supervised learning for fast and reliable surface defect detection, 2024

  27. [27] Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, and Onkar Dabeer. WinCLIP: Zero-/few-shot anomaly classification and segmentation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2023

  28. [28] Maxime Oquab, Timothee Darcet, Theo Moutakanni, Huy Vo, Marc Szafraniec, Vasil Khalidov, Pierre Fernandez, Daniel Haziza, Francisco Massa, Alaaeldin El-Nouby, et al. DINOv2: Learning robust visual features without supervision, 2023

  29. [29] Simon Damm, Mike Laszkiewicz, Johannes Lederer, and Asja Fischer. AnomalyDINO: Boosting patch-based few-shot anomaly detection with DINOv2, 2025. WACV 2025

  30. [30] Fereshteh Sadeghi and Sergey Levine. CAD2RL: Real single-image flight without a single real image. In Robotics: Science and Systems, 2017

  31. [31] Jonathan Tremblay, Thang To, Balakumar Sundaralingam, Yu Xiang, Dieter Fox, and Stan Birchfield. Deep object pose estimation for semantic robotic grasping of household objects. In Conference on Robot Learning, pages 306–316, 2018

  32. [32] Jonathan Tremblay, Aayush Prakash, David Acuna, Mark Brophy, Varun Jampani, Cem Anil, Thang To, Eric Cameracci, Shaad Boochoon, and Stan Birchfield. Falling things: A synthetic dataset for 3D object detection and pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2018

  33. [33] Wadim Kehl, Fabian Manhardt, Federico Tombari, Slobodan Ilic, and Nassir Navab. SSD-6D: Making RGB-based 3D detection and 6D pose estimation great again. In IEEE International Conference on Computer Vision, pages 1530–1538, 2017

  34. [34] Yu Xiang, Tanner Schmidt, Venkatraman Narayanan, and Dieter Fox. PoseCNN: A convolutional neural network for 6D object pose estimation in cluttered scenes. In Robotics: Science and Systems, 2018

  35. [35] Chen Wang, Danfei Xu, Yuke Zhu, Roberto Martin-Martin, Cewu Lu, Li Fei-Fei, and Silvio Savarese. DenseFusion: 6D object pose estimation by iterative dense fusion. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3343–3352, 2019

  36. [36] Sida Peng, Yuan Liu, Qixing Huang, Xiaowei Zhou, and Hujun Bao. PVNet: Pixel-wise voting network for 6DoF pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4561–4570, 2019

  37. [37] Yisheng He, Wei Sun, Haibin Huang, Jianran Liu, Haoqiang Fan, and Jian Sun. PVN3D: A deep point-wise 3D keypoints voting network for 6DoF pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11632–11641, 2020

  38. [38] Tomas Hodan, Daniel Barath, and Jiri Matas. EPOS: Estimating 6D pose of objects with symmetries. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11703–11712, 2020

  39. [39] Rasmus Laurvig Haugaard and Anders Glent Buch. SurfEmb: Dense and continuous correspondence distributions for object pose estimation with learnt surface embeddings. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6749–6758, 2022

  40. [40] Yongzhi Su, Mohamed Saleh, Thomas Fetzer, Jason Rambach, Nassir Navab, Benjamin Busam, Didier Stricker, and Federico Tombari. ZebraPose: Coarse to fine surface encoding for 6DoF object pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 6738–6748, 2022

  41. [41] Van Nguyen Nguyen, Thibault Groueix, Georgy Ponimatkin, Vincent Lepetit, and Tomas Hodan. CNOS: A strong baseline for CAD-based novel object segmentation. In IEEE/CVF International Conference on Computer Vision Workshops, 2023

  42. [42] Sungmin Cho, Sungbum Park, and Insoo Oh. MUSE: Model-based uncertainty-aware similarity estimation for zero-shot 2D object detection and segmentation, 2025

  43. [43] Van Nguyen Nguyen, Stephen Tyree, Andrew Guo, Mederic Fourmy, Anas Gouda, Taeyeop Lee, Sungphill Moon, Hyeontae Son, Lukas Ranftl, Jonathan Tremblay, Eric Brachmann, Bertram Drost, Vincent Lepetit, Carsten Rother, Stan Birchfield, Jiri Matas, Yann Labbe, Martin Sundermeyer, and Tomas Hodan. BOP challenge 2024 on model-based and model-free 6D object pose estimation, 2025

  44. [44] Andrea Caraffa, Davide Boscaini, and Fabio Poiesi. Accurate and efficient zero-shot 6D pose estimation with frozen foundation models, 2025. FreeZeV2

  45. [45] Weijian Deng, Dylan Campbell, Chunyi Sun, Jiahao Zhang, Shubham Kanitkar, Matthew E. Shaffer, and Stephen Gould. Pos3R: 6D pose estimation for unseen objects made easy. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16818–16828, 2025

  46. [46] Ashish Shrivastava, Tomas Pfister, Oncel Tuzel, Josh Susskind, Wenda Wang, and Russell Webb. Learning from simulated and unsupervised images through adversarial training. In IEEE Conference on Computer Vision and Pattern Recognition, pages 2107–2116, 2017

  47. [47] Yaroslav Ganin, Evgeniya Ustinova, Hana Ajakan, Pascal Germain, Hugo Larochelle, Francois Laviolette, Mario Marchand, and Victor Lempitsky. Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(59):1–35, 2016

  48. [48] Eric Tzeng, Judy Hoffman, Kate Saenko, and Trevor Darrell. Adversarial discriminative domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition, pages 7167–7176, 2017

  49. [49] Judy Hoffman, Eric Tzeng, Taesung Park, Jun-Yan Zhu, Phillip Isola, Kate Saenko, Alexei A. Efros, and Trevor Darrell. CyCADA: Cycle-consistent adversarial domain adaptation. In International Conference on Machine Learning, pages 1989–1998, 2018

  50. [50] Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, and Pieter Abbeel. Sim-to-real transfer of robotic control with dynamics randomization. In IEEE International Conference on Robotics and Automation, pages 3803–3810, 2018

  51. [51] Stephen James, Paul Wohlhart, Mrinal Kalakrishnan, Dmitry Kalashnikov, Alex Irpan, Julian Ibarz, Sergey Levine, Raia Hadsell, and Konstantinos Bousmalis. Sim-to-real via sim-to-sim: Data-efficient robotic grasping via randomized-to-canonical adaptation networks. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 12627–12637, 2019

  52. [52]

    Bop challenge 2022 on detection, segmentation and pose estimation of specific rigid objects

    Martin Sundermeyer, Tomas Hodan, Yann Labbe, Gu Wang, Eric Brachmann, Bertram Drost, Carsten Rother, and Jiri Matas. Bop challenge 2022 on detection, segmentation and pose estimation of specific rigid objects. InIEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2023

  53. [53]

    6d pose estimation of objects: Recent technologies and challenges.Applied Sciences, 11(1):228, 2021

    Zaixing He, Wuxi Feng, Xinyue Zhao, and Yongfeng Lv. 6d pose estimation of objects: Recent technologies and challenges.Applied Sciences, 11(1):228, 2021. doi: 10.3390/app11010228

  54. [54]

    A survey of 6dof object pose estimation methods for different application scenarios.Sensors, 24(4):1076, 2024

    Jian Guan, Yingming Hao, Qingxiao Wu, Sicong Li, and Yingjian Fang. A survey of 6dof object pose estimation methods for different application scenarios.Sensors, 24(4):1076, 2024. doi: 10.3390/s24041076

  55. [55]

    Dpod: 6d pose object detector and refiner

    Sergey Zakharov, Ivan Shugurov, and Slobodan Ilic. Dpod: 6d pose object detector and refiner. InIEEE/CVF International Conference on Computer Vision, pages 1941–1950, 2019

  56. [56]

    Cdpn: Coordinates-based disentangled pose network for real-time rgb-based 6-dof object pose estimation

    Zhigang Li, Gu Wang, and Xiangyang Ji. Cdpn: Coordinates-based disentangled pose network for real-time rgb-based 6-dof object pose estimation. In IEEE/CVF International Conference on Computer Vision, pages 7678–7687, 2019

  57. [57]

    Cosypose: Consistent multi-view multi-object 6d pose estimation

    Yann Labbe, Justin Carpentier, Mathieu Aubry, and Josef Sivic. Cosypose: Consistent multi-view multi-object 6d pose estimation. In European Conference on Computer Vision, pages 574–591, 2020

  58. [58]

    Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation

    Gu Wang, Fabian Manhardt, Federico Tombari, and Xiangyang Ji. Gdr-net: Geometry-guided direct regression network for monocular 6d object pose estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 16611–16621, 2021

  59. [59]

    Segment anything

    Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollar, and Ross Girshick. Segment anything. In IEEE/CVF International Conference on Computer Vision, pages 4015–4026, 2023

  60. [60]

    Anomaly detection: A survey

    Varun Chandola, Arindam Banerjee, and Vipin Kumar. Anomaly detection: A survey. ACM Computing Surveys, 41(3):1–58, 2009. doi: 10.1145/1541880.1541882

  61. [61]

    Deep learning for anomaly detection: A review. ACM Computing Surveys, 54(2):1–38, 2021

    Guansong Pang, Chunhua Shen, Longbing Cao, and Anton van den Hengel. Deep learning for anomaly detection: A review. ACM Computing Surveys, 54(2):1–38, 2021. doi: 10.1145/3439950

  62. [62]

    A unifying review of deep and shallow anomaly detection. Proceedings of the IEEE, 109(5):756–795, 2021

    Lukas Ruff, Jacob R. Kauffmann, Robert A. Vandermeulen, Gregoire Montavon, Wojciech Samek, Marius Kloft, Thomas G. Dietterich, and Klaus-Robert Muller. A unifying review of deep and shallow anomaly detection. Proceedings of the IEEE, 109(5):756–795, 2021. doi: 10.1109/JPROC.2021.3052449

  63. [63]

    A survey of surface defect detection of industrial products based on a small number of labeled data, 2022

    Qifan Jin and Li Chen. A survey of surface defect detection of industrial products based on a small number of labeled data, 2022

  64. [64]

    Deep learning for unsupervised anomaly localization in industrial images: A survey. IEEE Transactions on Instrumentation and Measurement, 71:1–21, 2022

    Xian Tao, Xinyi Gong, Xin Zhang, Shaohua Yan, and Chandranath Adak. Deep learning for unsupervised anomaly localization in industrial images: A survey. IEEE Transactions on Instrumentation and Measurement, 71:1–21, 2022. doi: 10.1109/TIM.2022.3196436

  65. [65]

    Deep industrial image anomaly detection: A survey. Machine Intelligence Research, 21:104–135, 2024

    Jiaqi Liu, Guoyang Xie, Jinbao Wang, Shangnian Li, Chengjie Wang, Feng Zheng, and Yaochu Jin. Deep industrial image anomaly detection: A survey. Machine Intelligence Research, 21:104–135, 2024. doi: 10.1007/s11633-023-1459-z

  66. [66]

    Deep one-class classification

    Lukas Ruff, Robert Vandermeulen, Nico Goernitz, Lucas Deecke, Shoaib Ahmed Siddiqui, Alexander Binder, Emmanuel Muller, and Marius Kloft. Deep one-class classification. In International Conference on Machine Learning, pages 4393–4402, 2018

  67. [67]

    Sub-image anomaly detection with deep pyramid correspondences, 2020

    Niv Cohen and Yedid Hoshen. Sub-image anomaly detection with deep pyramid correspondences, 2020

  68. [68]

    Padim: A patch distribution modeling framework for anomaly detection and localization

    Thomas Defard, Aleksandr Setkov, Angelique Loesch, and Romaric Audigier. Padim: A patch distribution modeling framework for anomaly detection and localization. In International Conference on Pattern Recognition Workshops, pages 475–489, 2021

  69. [69]

    Patch svdd: Patch-level svdd for anomaly detection and segmentation

    Jihun Yi and Sungroh Yoon. Patch svdd: Patch-level svdd for anomaly detection and segmentation. In Asian Conference on Computer Vision, pages 375–390, 2021

  70. [70]

    Same same but differnet: Semi-supervised defect detection with normalizing flows

    Marco Rudolph, Bastian Wandt, and Bodo Rosenhahn. Same same but differnet: Semi-supervised defect detection with normalizing flows. In IEEE/CVF Winter Conference on Applications of Computer Vision, pages 1907–1916, 2021

  71. [71]

    Fastflow: Unsupervised anomaly detection and localization via 2d normalizing flows, 2021

    Jiawei Yu, Ye Zheng, Xiang Wang, Wei Li, Yushuang Wu, Rui Zhao, and Liwei Wu. Fastflow: Unsupervised anomaly detection and localization via 2d normalizing flows, 2021

  72. [72]

    Cflow-ad: Real-time unsupervised anomaly detection with localization via conditional normalizing flows

    Denis Gudovskiy, Shun Ishizaka, and Kazuki Kozuka. Cflow-ad: Real-time unsupervised anomaly detection with localization via conditional normalizing flows. In IEEE/CVF Winter Conference on Applications of Computer Vision, pages 98–107, 2022

  73. [73]

    Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings

    Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4183–4192, 2020

  74. [74]

    Student-teacher feature pyramid matching for anomaly detection

    Guodong Wang, Shumin Han, Errui Ding, and Di Huang. Student-teacher feature pyramid matching for anomaly detection. In British Machine Vision Conference, 2021

  75. [75]

    Multiresolution knowledge distillation for anomaly detection

    Mohammadreza Salehi, Niousha Sadjadi, Soroosh Baselizadeh, Mohammad H. Rohban, and Hamid R. Rabiee. Multiresolution knowledge distillation for anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14902–14912, 2021

  76. [76]

    Anomaly detection via reverse distillation from one-class embedding

    Hanqiu Deng and Xingyu Li. Anomaly detection via reverse distillation from one-class embedding. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9737–9746, 2022

  77. [77]

    Cutpaste: Self-supervised learning for anomaly detection and localization

    Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, and Tomas Pfister. Cutpaste: Self-supervised learning for anomaly detection and localization. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9664–9674, 2021

  78. [78]

    Natural synthetic anomalies for self-supervised anomaly detection and localization, 2021

    Hannah M. Schlueter, Jeremy Tan, Benjamin Hou, and Bernhard Kainz. Natural synthetic anomalies for self-supervised anomaly detection and localization, 2021

  79. [79]

    Destseg: Segmentation guided denoising student-teacher for anomaly detection

    Xuan Zhang, Shiyu Li, Xi Li, Ping Huang, Jiulong Shan, and Ting Chen. Destseg: Segmentation guided denoising student-teacher for anomaly detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3914–3923, 2023

  80. [80]

    Learning transferable visual models from natural language supervision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, Gretchen Krueger, and Ilya Sutskever. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763, 2021

Showing first 80 references.