Towards Zero-shot Point Cloud Anomaly Detection: A Multi-View Projection Framework

Guoyang Xie; Weiming Shen; Yunkang Cao; Yuqi Cheng; Zhichao Lu

arxiv: 2409.13162 · v1 · pith:HWPYBEBSnew · submitted 2024-09-20 · 💻 cs.CV

classification 💻 cs.CV

keywords detection · anomaly · point · cloud · vlms · zero-shot · anomalies · framework

Detecting anomalies within point clouds is crucial for various industrial applications, but traditional unsupervised methods face challenges due to data acquisition costs, early-stage production constraints, and limited generalization across product categories. To overcome these challenges, we introduce the Multi-View Projection (MVP) framework, leveraging pre-trained Vision-Language Models (VLMs) to detect anomalies. Specifically, MVP projects point cloud data into multi-view depth images, thereby translating point cloud anomaly detection into image anomaly detection. Following zero-shot image anomaly detection methods, pre-trained VLMs are utilized to detect anomalies on these depth images. Given that pre-trained VLMs are not inherently tailored for zero-shot point cloud anomaly detection and may lack specificity, we propose the integration of learnable visual and adaptive text prompting techniques to fine-tune these VLMs, thereby enhancing their detection performance. Extensive experiments on the MVTec 3D-AD and Real3D-AD datasets demonstrate the superior zero-shot anomaly detection performance of the proposed MVP framework and the effectiveness of the prompting techniques. Real-world evaluations on automotive plastic part inspection further show that the proposed method generalizes to practical unseen scenarios. The code is available at https://github.com/hustCYQ/MVP-PCLIP.
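
The projection step described in the abstract, turning a point cloud into several depth images, can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the orthographic camera, the rotation about the y-axis, the image size, and the function names `rotation_y`, `project_to_depth`, and `multi_view_depth` are all assumptions made for illustration. The VLM-based detection and the prompting techniques are not shown.

```python
import numpy as np


def rotation_y(angle_rad: float) -> np.ndarray:
    """Rotation matrix about the y-axis (an assumed choice of viewing axis)."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])


def project_to_depth(points: np.ndarray, rotation: np.ndarray, size: int = 64) -> np.ndarray:
    """Orthographically project an (N, 3) point cloud into a size x size depth image.

    Empty pixels are 0. Occupied pixels hold a value in [0.01, 1],
    where a larger value means the surface is closer to the camera.
    """
    # Centre the cloud and scale it into the unit ball so every view fits the image.
    centred = points - points.mean(axis=0)
    radius = np.linalg.norm(centred, axis=1).max()
    if radius > 0:
        centred = centred / radius
    rotated = centred @ rotation.T  # all coordinates now lie in [-1, 1]

    # Map x and y from [-1, 1] to integer pixel indices (y is flipped: row 0 is the top).
    cols = np.clip(((rotated[:, 0] + 1) / 2 * (size - 1)).round().astype(int), 0, size - 1)
    rows = np.clip(((1 - rotated[:, 1]) / 2 * (size - 1)).round().astype(int), 0, size - 1)

    # The assumed camera looks along -z, so a larger z is closer to the camera.
    closeness = (rotated[:, 2] + 1) / 2 * 0.99 + 0.01

    # Keep only the closest point per pixel (a simple z-buffer).
    depth = np.zeros((size, size), dtype=np.float64)
    np.maximum.at(depth, (rows, cols), closeness)
    return depth


def multi_view_depth(points: np.ndarray, num_views: int = 4, size: int = 64) -> np.ndarray:
    """Render num_views depth images from viewpoints evenly spaced around the y-axis."""
    angles = np.linspace(0.0, 2.0 * np.pi, num_views, endpoint=False)
    return np.stack([project_to_depth(points, rotation_y(a), size) for a in angles])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    cloud = rng.normal(size=(2000, 3))  # synthetic stand-in for a scanned part
    views = multi_view_depth(cloud, num_views=4, size=64)
    print(views.shape)
```

Each resulting depth image could then be passed to an image anomaly detector, and the per-view results mapped back to the points that produced each pixel. How the paper renders views and fuses the per-view scores is not stated in the abstract, so that part is left out here.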

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

VT-3DAD: Cross-Category 3D Anomaly Detection via Visual-Text Normal Space Alignment
cs.CV 2026-06 unverdicted novelty 6.0

VT-3DAD fuses visual deviation from few-shot references and semantic deviation from textual normal space to achieve SOTA cross-category 3D anomaly detection on ShapeNetPart.