pith. machine review for the scientific record.

arxiv: 2605.05850 · v1 · submitted 2026-05-07 · 💻 cs.CV


Align3D-AD: Cross-Modal Feature Alignment and Dual-Prompt Learning for Zero-shot 3D Anomaly Detection


Pith reviewed 2026-05-09 16:30 UTC · model grok-4.3

classification 💻 cs.CV
keywords zero-shot 3D anomaly detection · cross-modal feature alignment · dual-prompt learning · anomaly detection · 3D vision · prompt alignment · multi-modal learning

The pith

Align3D-AD bridges the domain gap in zero-shot 3D anomaly detection by mapping rendering features to RGB semantics via auxiliary categories and applying dual-prompt contrastive alignment.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Zero-shot 3D anomaly detection must spot defects in objects from categories never seen during training. Standard pipelines project 3D scans into multi-view images and feed them to encoders pretrained on ordinary photos, yet the resulting renderings lack realistic visual semantics, leaving a persistent domain gap between the encoder and its inputs. The paper shows that RGB images from auxiliary categories can serve as explicit guidance to pull the 3D rendering features into the same semantic space, with an added reweighting step that emphasizes regions of high holistic semantic consistency. A second stage then learns separate prompts for the aligned RGB-like features and the raw rendering features, aligning them contrastively so that each modality supplies complementary cues. If this holds, systems could inspect new 3D parts using only general RGB knowledge and related auxiliary examples, cutting the need for category-specific 3D training sets.
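As a concrete mental model, the first stage could look like the sketch below. This is a hedged reconstruction: the projection head (`align_head`), the patch pairing between renderings and auxiliary RGB images, and the softmax weighting are assumptions made for illustration, not the paper's specification.

```python
import torch
import torch.nn.functional as F

def stage1_alignment_loss(render_feats, rgb_feats, align_head):
    """Sketch of cross-modal feature alignment with semantic consistency
    reweighting. render_feats: (B, P, D) patch features of rendered views;
    rgb_feats: (B, P, D) patch features of auxiliary-category RGB images
    (assumed paired with the renderings for illustration); align_head: a
    learnable projection, e.g. a small MLP."""
    aligned = align_head(render_feats)                               # (B, P, D)

    # Holistic semantic consistency: similarity of each local patch to an
    # image-level RGB embedding (mean pooling is a stand-in choice).
    rgb_global = F.normalize(rgb_feats.mean(dim=1), dim=-1)          # (B, D)
    consistency = torch.einsum(
        "bpd,bd->bp", F.normalize(aligned, dim=-1), rgb_global)      # (B, P)
    weights = torch.softmax(consistency, dim=-1)                     # emphasize consistent regions

    # Pull aligned rendering patches toward RGB patches, weighted so that
    # regions with high holistic consistency dominate the alignment.
    per_patch = 1.0 - F.cosine_similarity(aligned, rgb_feats, dim=-1)  # (B, P)
    return (weights * per_patch).sum(dim=-1).mean()
```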

Core claim

Align3D-AD is a two-stage framework that first maps 3D rendering features into the RGB semantic space using auxiliary-category RGB observations and a semantic consistency reweighting strategy, then applies modality-aware prompt learning with dual-prompt contrastive alignment to capture complementary semantics and improve discriminability. The approach requires no training data from the target 3D categories and performs direct semantic transfer from RGB observations rather than relying implicitly on pretrained encoders.

What carries the argument

Cross-modal feature alignment that transfers semantics from auxiliary RGB data into 3D rendering features, combined with modality-aware dual-prompt contrastive alignment.
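The second ingredient can be sketched the same way. In the paper the prompts are presumably learnable token sequences in a CLIP-style encoder; the additive prompt vectors and symmetric InfoNCE loss below are a simplified stand-in, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DualPromptContrast(nn.Module):
    """Independent prompt per modality, aligned with a symmetric InfoNCE loss."""

    def __init__(self, dim, tau=0.07):
        super().__init__()
        self.prompt_rgb = nn.Parameter(0.02 * torch.randn(dim))     # prompt for RGB-aligned features
        self.prompt_render = nn.Parameter(0.02 * torch.randn(dim))  # prompt for raw rendering features
        self.tau = tau

    def forward(self, rgb_aligned, rendering):
        # Fuse each modality with its own prompt; additive fusion is a
        # stand-in for whatever conditioning the paper actually uses.
        a = F.normalize(rgb_aligned + self.prompt_rgb, dim=-1)   # (B, D)
        r = F.normalize(rendering + self.prompt_render, dim=-1)  # (B, D)
        logits = a @ r.t() / self.tau                            # (B, B)
        labels = torch.arange(a.size(0), device=a.device)
        # Matched cross-modal pairs attract; mismatched pairs repel.
        return 0.5 * (F.cross_entropy(logits, labels)
                      + F.cross_entropy(logits.t(), labels))
```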

Load-bearing premise

Auxiliary RGB categories supply enough semantic overlap with the targets to map 3D rendering features into the pretrained RGB encoder's space, and the dual-prompt stage can exploit the aligned features without overfitting to the auxiliary distribution.

What would settle it

If Align3D-AD shows no gain over baselines on a new 3D anomaly set whose auxiliary categories share no visual or semantic similarity with the targets, the cross-modal mapping claim would be refuted.

Figures

Figures reproduced from arXiv: 2605.05850 by Chengyu Tao, Juan Du, Letian Bai, Xuanming Cao.

Figure 1. Motivation of Align3D-AD. (a) An overview of the existing framework for zero-shot 3D … (view at source ↗)
Figure 2. Overall framework of Align3D-AD. Our model consists of two stages: Cross-Modal Feature … (view at source ↗)
Figure 3. Qualitative visualization on MVTec3D-AD and Eyecandies. (view at source ↗)
Figure 4. Visualization of the rendered multi-view observations on MVTec3D-AD, including both … (view at source ↗)
Figure 5. Visualization of the rendered multi-view observations on Eyecandies, including both … (view at source ↗)
Figure 6. Visualization of semantic consistency weights on rendering images. Higher weights are … (view at source ↗)
Figure 7. Impact of number of views on anomaly detection performance. (view at source ↗)
Figure 9. Failure cases of Align3D-AD on MVTec3D-AD. (view at source ↗)
Figure 10. Visualization of anomaly score maps on MVTec3D-AD. (view at source ↗)
Figure 11. Visualization of anomaly score maps on Eyecandies. (view at source ↗)
read the original abstract

Zero-shot 3D anomaly detection aims to identify anomalies without access to training data from target categories. However, existing methods mainly rely on projecting 3D observations into multi-view representations that primarily capture geometric cues rather than realistic visual semantics and process them with vision encoders pretrained on RGB data, leading to a significant domain gap between the encoder and the projected representations. To address this issue, we propose Align3D-AD, a unified two-stage framework that leverages the RGB modality from auxiliary categories as cross-modal guidance for zero-shot 3D anomaly detection. First, we introduce a cross-modal feature alignment paradigm that maps rendering features into the RGB semantic space. Unlike prior works that implicitly rely on pretrained encoders, our method enables direct semantic transfer from RGB observations. A semantic consistency reweighting strategy is further introduced to refine feature alignment by reweighting local regions according to holistic semantic consistency. Second, we propose a modality-aware prompt learning framework with dual-prompt contrastive alignment. By assigning independent prompts to RGB-aligned and rendering features, our method captures complementary semantics across modalities, while the contrastive alignment further enhances prompt representations to improve discriminability. Extensive experiments on MVTec3D-AD, Eyecandies, and Real3D-AD demonstrate that Align3D-AD consistently outperforms existing zero-shot methods under both one-vs-rest and cross-dataset settings, highlighting its generalization capability and robustness. Code and the dataset will be made available once our paper is accepted.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Align3D-AD, a two-stage framework for zero-shot 3D anomaly detection. The first stage performs cross-modal feature alignment by mapping 3D rendering features into RGB semantic space using auxiliary RGB categories from non-target classes, incorporating a semantic consistency reweighting strategy to refine local regions. The second stage employs a modality-aware prompt learning framework with dual-prompt contrastive alignment to capture complementary semantics between RGB-aligned and rendering features. The authors report that this approach consistently outperforms existing zero-shot methods on MVTec3D-AD, Eyecandies, and Real3D-AD under both one-vs-rest and cross-dataset protocols.

Significance. If the reported gains hold under rigorous verification, the work could advance zero-shot 3D anomaly detection by providing an explicit mechanism to bridge the domain gap between geometric renderings and RGB-pretrained encoders via auxiliary data. The dual-prompt contrastive alignment offers a structured way to exploit complementary modality information, and the commitment to release code supports reproducibility.

major comments (2)
  1. [Method (cross-modal feature alignment and semantic consistency reweighting)] The cross-modal feature alignment stage (described in the abstract and method overview) maps rendering features using auxiliary-category RGB data followed by semantic consistency reweighting of local regions according to holistic semantic consistency. This reweighting risks suppressing local anomaly cues that deviate from the auxiliary RGB distribution, potentially turning the alignment into semantic smoothing that benefits normal samples more than anomalies. This is load-bearing for the central claim that the pipeline enables true cross-modal transfer of anomaly-discriminative semantics; targeted ablations measuring the reweighting's effect on anomaly versus normal patch discriminability (e.g., via feature distance or detection AUC breakdowns) are required to substantiate the domain-bridging benefit.
  2. [Experiments and results] The abstract claims consistent outperformance over existing zero-shot methods in one-vs-rest and cross-dataset settings on three datasets, but the provided description does not detail the precise baselines, hyperparameter selection protocol, or statistical tests for the reported gains. Without these, it is unclear whether the improvements stem from the proposed alignment and dual-prompt components or from implementation choices that could be replicated by prompt learning alone.
minor comments (1)
  1. The abstract states that code and the dataset will be made available upon acceptance; confirm that the released auxiliary-category RGB data and rendering pipelines are fully documented to enable exact reproduction of the alignment stage.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed feedback on our manuscript. We address each major comment point by point below, providing clarifications and committing to specific revisions to strengthen the presentation and empirical support.

read point-by-point responses
  1. Referee: [Method (cross-modal feature alignment and semantic consistency reweighting)] The cross-modal feature alignment stage (described in the abstract and method overview) maps rendering features using auxiliary-category RGB data followed by semantic consistency reweighting of local regions according to holistic semantic consistency. This reweighting risks suppressing local anomaly cues that deviate from the auxiliary RGB distribution, potentially turning the alignment into semantic smoothing that benefits normal samples more than anomalies. This is load-bearing for the central claim that the pipeline enables true cross-modal transfer of anomaly-discriminative semantics; targeted ablations measuring the reweighting's effect on anomaly versus normal patch discriminability (e.g., via feature distance or detection AUC breakdowns) are required to substantiate the domain-bridging benefit.

    Authors: We appreciate the referee's concern that the semantic consistency reweighting could inadvertently suppress anomaly cues. The reweighting is computed from the agreement between local rendering features and the global RGB semantic embedding derived from auxiliary categories; normal regions receive higher weights to tighten alignment, while anomalous regions, by construction, exhibit lower consistency scores and thus retain relatively higher residual deviation after alignment. This design aims to transfer anomaly-discriminative semantics rather than smooth them away. To directly substantiate the claim, we will add targeted ablations in the revised manuscript: (i) cosine-distance histograms between aligned and RGB features for anomaly versus normal patches (sketched after these responses), and (ii) per-class AUC breakdowns with and without the reweighting module. These results will be reported in a new table and discussed in Section 4. revision: yes

  2. Referee: [Experiments and results] The abstract claims consistent outperformance over existing zero-shot methods in one-vs-rest and cross-dataset settings on three datasets, but the provided description does not detail the precise baselines, hyperparameter selection protocol, or statistical tests for the reported gains. Without these, it is unclear whether the improvements stem from the proposed alignment and dual-prompt components or from implementation choices that could be replicated by prompt learning alone.

    Authors: We apologize for insufficient detail in the initial submission. The baselines are exactly the zero-shot 3D anomaly detection methods listed in Section 4.1 (with citations and implementation references). Hyperparameters for all methods, including our dual-prompt learning, were chosen via a fixed protocol: a 20% validation split from the training set of each dataset, with grid search over learning rate, prompt length, and temperature; the selected values are reported in the supplementary material. To address the concern about attribution, we will expand Section 4.2 with (i) an explicit table of all hyperparameter values, (ii) a description of the baseline re-implementations, and (iii) statistical significance results (paired Wilcoxon tests across 5 random seeds; sketched below) comparing Align3D-AD to the strongest baseline. These additions will clarify that the reported gains arise from the cross-modal alignment and dual-prompt contrastive objectives rather than generic prompt tuning. revision: yes
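The cosine-distance diagnostic promised in response 1 is straightforward to set up. A minimal sketch, with all array names hypothetical and no claim to match the authors' implementation: it splits aligned-feature distances by the ground-truth anomaly mask, so one can check whether reweighting tightens normal patches more than it erodes anomalous ones.

```python
import numpy as np

def distance_split(aligned, rgb_ref, anomaly_mask):
    """aligned: (P, D) aligned patch features; rgb_ref: (D,) reference RGB
    embedding; anomaly_mask: (P,) boolean ground truth per patch."""
    a = aligned / np.linalg.norm(aligned, axis=-1, keepdims=True)
    r = rgb_ref / np.linalg.norm(rgb_ref)
    dist = 1.0 - a @ r  # cosine distance of each patch to the RGB reference
    return dist[anomaly_mask], dist[~anomaly_mask]

# The rebuttal's claim predicts that, with reweighting on, the normal-patch
# distance population shifts left (tighter alignment) while the anomalous
# population keeps a higher residual deviation, i.e. the histograms separate.
```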
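The significance protocol in response 2 is also standard machinery. A sketch with scipy; the per-seed AUROC values below are invented placeholders for illustration, not numbers from the paper.

```python
import numpy as np
from scipy.stats import wilcoxon

# Placeholder per-seed AUROC values -- illustrative only, not reported results.
align3d_ad = np.array([0.962, 0.958, 0.965, 0.960, 0.961])
baseline   = np.array([0.948, 0.951, 0.946, 0.950, 0.947])

# Paired one-sided Wilcoxon signed-rank test across the 5 seeds.
stat, p = wilcoxon(align3d_ad, baseline, alternative="greater")
print(f"W = {stat:.1f}, one-sided p = {p:.4f}")  # n=5 caps the minimum p at 1/32
```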

Circularity Check

0 steps flagged

No circularity: empirical method without self-referential derivations

full rationale

The paper introduces Align3D-AD as a two-stage framework: cross-modal feature alignment (mapping rendering features to RGB space via auxiliary categories plus semantic consistency reweighting) followed by dual-prompt contrastive alignment for zero-shot 3D anomaly detection. The provided text contains no equations, no fitted parameters presented as predictions, no uniqueness theorems, and no self-citations that bear the central claim. Performance claims rest on external experiments across MVTec3D-AD, Eyecandies, and Real3D-AD under one-vs-rest and cross-dataset protocols, rather than on any quantity that reduces to the method's inputs by construction. The approach applies standard contrastive and prompt-learning techniques to a new setting without self-definitional loops or load-bearing author citations.

Axiom & Free-Parameter Ledger

0 free parameters · 2 axioms · 0 invented entities

The framework assumes that pretrained RGB vision encoders provide a reliable semantic space for alignment and that auxiliary categories share sufficient semantic overlap with target categories to enable effective transfer. No explicit free parameters are detailed in the abstract, but the prompt learning involves learned parameters.

axioms (2)
  • domain assumption Pretrained vision encoders on RGB data capture transferable semantic information.
    The method relies on this to map rendering features into RGB semantic space.
  • domain assumption Auxiliary categories provide useful cross-modal guidance for target categories.
    Used in the cross-modal feature alignment paradigm.

pith-pipeline@v0.9.0 · 5574 in / 1376 out tokens · 34495 ms · 2026-05-09T16:30:31.731794+00:00 · methodology


Reference graph

Works this paper leans on

34 extracted references · 6 canonical work pages · 1 internal anchor

  1. [1]

    Anomaly detection in 3d point clouds using deep geometric descriptors

    Paul Bergmann and David Sattlegger. Anomaly detection in 3d point clouds using deep geometric descriptors. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 2613–2623, 2023

  2. [2]

    Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings

    Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 4183–4192, 2020

  3. [3]

    The mvtec 3d-ad dataset for unsupervised 3d anomaly detection and localization

    Paul Bergmann, Xin Jin, David Sattlegger, and Carsten Steger. The mvtec 3d-ad dataset for unsupervised 3d anomaly detection and localization. arXiv preprint arXiv:2112.09045, 2021

  4. [4]

    The eyecandies dataset for unsupervised multimodal anomaly detection and localization

    Luca Bonfiglioli, Marco Toschi, Davide Silvestri, Nicola Fioraio, and Daniele De Gregorio. The eyecandies dataset for unsupervised multimodal anomaly detection and localization. In Proceedings of the Asian Conference on Computer Vision, pages 3586–3602, 2022

  5. [5]

    Anomaly detection under distribution shift

    Tri Cao, Jiawen Zhu, and Guansong Pang. Anomaly detection under distribution shift. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 6511–6523, 2023

  6. [6]

    Iaenet: An importance-aware ensemble model for 3d point cloud-based anomaly detection

    Xuanming Cao, Chengyu Tao, Yifeng Cheng, and Juan Du. Iaenet: An importance-aware ensemble model for 3d point cloud-based anomaly detection. Information Fusion, page 104097, 2025

  7. [7]

    Complementary pseudo multimodal feature for point cloud anomaly detection

    Yunkang Cao, Xiaohao Xu, and Weiming Shen. Complementary pseudo multimodal feature for point cloud anomaly detection. Pattern Recognition, 156:110761, 2024

  8. [8]

    Adaclip: Adapting clip with hybrid learnable prompts for zero-shot anomaly detection

    Yunkang Cao, Jiangning Zhang, Luca Frittoli, Yuqi Cheng, Weiming Shen, and Giacomo Boracchi. Adaclip: Adapting clip with hybrid learnable prompts for zero-shot anomaly detection. In European Conference on Computer Vision, pages 55–72. Springer, 2024

  9. [9]

    Toward zero-shot point cloud anomaly detection: a multiview projection framework

    Yuqi Cheng, Yunkang Cao, Guoyang Xie, Zhichao Lu, and Weiming Shen. Toward zero-shot point cloud anomaly detection: a multiview projection framework. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 56(3):1747–1760, 2026. doi: 10.1109/TSMC.2025.3648581

  10. [10]

    Shape-guided dual-memory learning for 3d anomaly detection

    Yu-Min Chu, Chieh Liu, Ting-I Hsieh, Hwann-Tzong Chen, and Tyng-Luh Liu. Shape-guided dual-memory learning for 3d anomaly detection. In Proceedings of the 40th International Conference on Machine Learning, pages 6185–6194, 2023

  11. [11]

    Sub-image anomaly detection with deep pyramid correspondences

    Niv Cohen and Yedid Hoshen. Sub-image anomaly detection with deep pyramid correspondences. arXiv preprint arXiv:2005.02357, 2020

  12. [12]

    Gs-clip: Zero-shot 3d anomaly detection by geometry-aware prompt and synergistic view representation learning

    Zehao Deng, An Liu, and Yan Wang. Gs-clip: Zero-shot 3d anomaly detection by geometry-aware prompt and synergistic view representation learning. arXiv preprint arXiv:2602.19206, 2026

  13. [13]

    3d vision-based anomaly detection in manufacturing: A survey

    Juan Du, Chengyu Tao, Xuanming Cao, and Fugee Tsung. 3d vision-based anomaly detection in manufacturing: A survey. Frontiers of Engineering Management, 12(2):343–360, 2025

  14. [14]

    Filo: Zero-shot anomaly detection by fine-grained description and high-quality localization

    Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Hao Li, Ming Tang, and Jinqiao Wang. Filo: Zero-shot anomaly detection by fine-grained description and high-quality localization. In Proceedings of the 32nd ACM International Conference on Multimedia, pages 2041–2049, 2024

  15. [15]

    Anomalygpt: Detecting industrial anomalies using large vision-language models

    Zhaopeng Gu, Bingke Zhu, Guibo Zhu, Yingying Chen, Ming Tang, and Jinqiao Wang. Anomalygpt: Detecting industrial anomalies using large vision-language models. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 38, pages 1932–1940, 2024

  16. [16]

    Gaussian Error Linear Units (GELUs)

    Dan Hendrycks and Kevin Gimpel. Gaussian error linear units (gelus). arXiv preprint arXiv:1606.08415, 2016

  17. [17]

    Back to the feature: classical 3d features are (almost) all you need for 3d anomaly detection

    Eliahu Horwitz and Yedid Hoshen. Back to the feature: classical 3d features are (almost) all you need for 3d anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 2968–2977, 2023

  18. [18]

    Winclip: Zero-/few-shot anomaly classification and segmentation

    Jongheon Jeong, Yang Zou, Taewan Kim, Dongqing Zhang, Avinash Ravichandran, and Onkar Dabeer. Winclip: Zero-/few-shot anomaly classification and segmentation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 19606–19616, 2023

  19. [19]

    Towards scalable 3d anomaly detection and localization: A benchmark via 3d anomaly synthesis and a self-supervised learning network

    Wenqiao Li, Xiaohao Xu, Yao Gu, Bozhong Zheng, Shenghua Gao, and Yingna Wu. Towards scalable 3d anomaly detection and localization: A benchmark via 3d anomaly synthesis and a self-supervised learning network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22207–22216, 2024

  20. [20]

    Focal loss for dense object detection

    Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, pages 2980–2988, 2017

  21. [21]

    Real3d-ad: A dataset of point cloud anomaly detection

    Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, and Feng Zheng. Real3d-ad: A dataset of point cloud anomaly detection. Advances in Neural Information Processing Systems, 36:30402–30415, 2023

  22. [22]

    Aa-clip: Enhancing zero-shot anomaly detection via anomaly-aware clip

    Wenxin Ma, Xu Zhang, Qingsong Yao, Fenghe Tang, Chenxu Wu, Yingtai Li, Rui Yan, Zihang Jiang, and S Kevin Zhou. Aa-clip: Enhancing zero-shot anomaly detection via anomaly-aware clip. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 4744–4754, 2025

  23. [23]

    Zuma: Training-free zero-shot unified multimodal anomaly detection

    Yunfeng Ma, Min Liu, Shuai Jiang, Jingyu Zhou, Yuan Bian, Xueping Wang, and Yaonan Wang. Zuma: Training-free zero-shot unified multimodal anomaly detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, pages 1–14, 2026. doi: 10.1109/TPAMI.2026.3658856

  24. [24]

    V-net: Fully convolutional neural networks for volumetric medical image segmentation

    Fausto Milletari, Nassir Navab, and Seyed-Ahmad Ahmadi. V-net: Fully convolutional neural networks for volumetric medical image segmentation. In 2016 Fourth International Conference on 3D Vision (3DV), pages 565–571. IEEE, 2016

  25. [25]

    Bayesian prompt flow learning for zero-shot anomaly detection

    Zhen Qu, Xian Tao, Xinyi Gong, Shichen Qu, Qiyu Chen, Zhengtao Zhang, Xingang Wang, and Guiguang Ding. Bayesian prompt flow learning for zero-shot anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 30398–30408, 2025

  26. [26]

    Learning transferable visual models from natural language supervision

    Alec Radford, Jong Wook Kim, Chris Hallacy, Aditya Ramesh, Gabriel Goh, Sandhini Agarwal, Girish Sastry, Amanda Askell, Pamela Mishkin, Jack Clark, et al. Learning transferable visual models from natural language supervision. In International Conference on Machine Learning, pages 8748–8763. PMLR, 2021

  27. [27]

    Towards total recall in industrial anomaly detection

    Karsten Roth, Latha Pemula, Joaquin Zepeda, Bernhard Schölkopf, Thomas Brox, and Peter Gehler. Towards total recall in industrial anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14318–14328, 2022

  28. [28]

    Asymmetric student-teacher networks for industrial anomaly detection

    Marco Rudolph, Tom Wehrbein, Bodo Rosenhahn, and Bastian Wandt. Asymmetric student-teacher networks for industrial anomaly detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 2592–2602, 2023

  29. [29]

    Pointsgrade: Sparse learning with graph representation for anomaly detection by using unstructured 3d point cloud data

    Chengyu Tao and Juan Du. Pointsgrade: Sparse learning with graph representation for anomaly detection by using unstructured 3d point cloud data. IISE Transactions, 57(2):131–144, 2025

  30. [30]

    G2sf: Geometry-guided score fusion for multimodal industrial anomaly detection

    Chengyu Tao, Xuanming Cao, and Juan Du. G2sf: Geometry-guided score fusion for multimodal industrial anomaly detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 20551–20560, 2025

  31. [31]

    Multimodal industrial anomaly detection via hybrid fusion

    Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, and Chengjie Wang. Multimodal industrial anomaly detection via hybrid fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8032–8041, 2023

  32. [32]

    Cheating depth: Enhancing 3d surface anomaly detection via depth simulation

    Vitjan Zavrtanik, Matej Kristan, and Danijel Skočaj. Cheating depth: Enhancing 3d surface anomaly detection via depth simulation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 2164–2172, 2024

  33. [33]

    Anomalyclip: Object-agnostic prompt learning for zero-shot anomaly detection

    Qihang Zhou, Guansong Pang, Yu Tian, Shibo He, and Jiming Chen. Anomalyclip: Object-agnostic prompt learning for zero-shot anomaly detection. In The Twelfth International Conference on Learning Representations, 2023

  34. [34]

    Pointad: Comprehending 3d anomalies from points and pixels for zero-shot 3d anomaly detection

    Qihang Zhou, Jiangtao Yan, Shibo He, Wenchao Meng, and Jiming Chen. Pointad: Comprehending 3d anomalies from points and pixels for zero-shot 3d anomaly detection. Advances in Neural Information Processing Systems, 37:84866–84896, 2024