Recognition: no theorem link
Real-IAD MVN: A Multi-View Normal Vector Dataset and Benchmark for High-Fidelity Industrial Anomaly Detection
Pith reviewed 2026-05-11 01:12 UTC · model grok-4.3
The pith
High-fidelity multi-view surface normal maps improve detection of subtle geometric defects over sparse 3D point clouds.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Built on an upgraded acquisition system, Real-IAD-MVN replaces sparse 3D point clouds with dense multi-view pseudo-3D surface normal maps captured from five viewpoints. This representation makes previously invisible micro-defects explicitly detectable. A reconstruction-based baseline that extracts cross-modal unified prototypes from the image and normal-map streams surpasses existing state-of-the-art multimodal fusion methods on the new dataset.
What carries the argument
The multi-view normal vector dataset that supplies dense geometric surface information from five angles, paired with a reconstruction baseline that learns cross-modal unified prototypes from image and normal streams.
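The prototype machinery can be illustrated with a minimal sketch. Everything below is hypothetical: random features stand in for encoder activations, and a few rounds of k-means stand in for the paper's learned prototype extraction. It shows only the shape of the idea: pair patch features from both streams, summarize normal training data as a small set of unified prototypes, and score test patches by distance to the nearest prototype.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical patch features from the two streams (RGB and normal map),
# standing in for pooled activations of a frozen pretrained encoder.
# Shapes: (num_normal_training_patches, feature_dim); paired by location.
rgb_feats = rng.normal(size=(500, 64))
nrm_feats = rng.normal(size=(500, 64))

def kmeans(x, k=8, iters=10):
    """A few rounds of k-means; an illustrative stand-in for the
    paper's learned prototype extraction."""
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        assign = np.linalg.norm(x[:, None] - centers[None], axis=-1).argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = x[assign == j].mean(axis=0)
    return centers

# Cross-modal "unified" prototypes over the concatenated streams.
prototypes = kmeans(np.concatenate([rgb_feats, nrm_feats], axis=1))

def anomaly_score(rgb_f, nrm_f):
    """Distance of a test patch to its nearest unified prototype."""
    f = np.concatenate([rgb_f, nrm_f])
    return np.linalg.norm(prototypes - f, axis=1).min()

normal_score = anomaly_score(rgb_feats[0], nrm_feats[0])
defect_score = anomaly_score(rgb_feats[0] + 5.0, nrm_feats[0] - 5.0)
assert defect_score > normal_score
```

A patch far from every prototype in the joint feature space scores high regardless of which modality carries the deviation, which is the appeal of a unified rather than per-modality prototype bank.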
If this is right
- Side-wall and occluded micro-defects such as scratches and pits become detectable where they were previously missed.
- Dense multi-view normals produce higher detection performance than sparse 3D point clouds.
- Unified prototypes extracted from image and normal streams outperform prior multimodal fusion techniques.
- The dataset supplies a concrete benchmark for advancing geometric anomaly detection methods.
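The second point can be made concrete with a toy example (a synthetic surface, not the paper's data): a shallow micro-pit produces a clear local deviation in a dense normal map, while a sparse depth sampling straddles it and captures only a fraction of its depth.

```python
import numpy as np

# Toy height map (arbitrary units) of a flat surface with a tiny pit:
# roughly 2 pixels wide and 0.05 units deep.
H = W = 200
yy, xx = np.mgrid[0:H, 0:W]
z = -0.05 * np.exp(-(((yy - 100) ** 2 + (xx - 100) ** 2) / (2 * 2.0 ** 2)))

# Dense surface normals from the height map: n is proportional
# to (-dz/dx, -dz/dy, 1), then normalized to unit length.
dz_dy, dz_dx = np.gradient(z)
n = np.dstack([-dz_dx, -dz_dy, np.ones_like(z)])
n /= np.linalg.norm(n, axis=2, keepdims=True)

# Angular deviation of each normal from the flat reference (0, 0, 1):
# the pit produces a clear local signal in the normal map.
dev_deg = np.degrees(np.arccos(np.clip(n[..., 2], -1.0, 1.0)))

# A sparse point cloud sampling every 8th pixel straddles the pit and
# sees only a fraction of its 0.05-unit depth.
sparse_depth = np.abs(z[::8, ::8])
print(dev_deg.max(), sparse_depth.max())
```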
Where Pith is reading between the lines
- Existing 2D inspection lines could add normal-map channels to catch more defects without switching to full 3D scanners.
- The same multi-view normal capture strategy might transfer to anomaly detection in non-industrial domains that involve fine surface geometry.
- Tests on object categories outside the current collection would reveal how far the performance advantage extends.
Load-bearing premise
The upgraded system must generate artifact-free normals that faithfully capture real micro-defects, and the reported gains must generalize beyond the chosen dataset splits and baseline code.
What would settle it
Apply the baseline to a fresh set of industrial parts containing independently measured micro-defects and check whether detection rates match the numbers reported on the original splits.
Original abstract
Industrial Anomaly Detection (IAD) is critical for quality control, but existing methods struggle with subtle, geometric defects. Standard 2D (RGB) images are sensitive to texture and lighting but often miss fine geometric anomalies. While 3D point clouds capture macro-shape, they are typically too sparse to detect micro-defects like scratches or pits. We address this fundamental data limitation by introducing Real-IAD-MVN (Multi-View Normal), a large-scale industrial dataset. By upgrading our acquisition system, Real-IAD-MVN captures high-fidelity surface normal maps from five distinct viewpoints, replacing sparse 3D data entirely. This provides a comprehensive geometric representation at a micro-detail level, making previously invisible side-wall and occluded defects explicitly detectable. Our experiments, conducted on this new dataset, first provide evidence that incorporating dense, multi-view pseudo-3D (surface normals) yields significantly better detection performance than using sparse 3D point cloud data. To further validate the dataset and provide a strong benchmark, we introduce a baseline method based on reconstruction, which learns to extract cross-modal unified prototypes from the image and normal map streams. We demonstrate that this unified prototype approach surpasses existing state-of-the-art multimodal fusion methods, highlighting the rich potential of our new dataset for advancing geometric anomaly detection.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper introduces Real-IAD-MVN, a large-scale industrial anomaly detection dataset that upgrades an acquisition system to capture high-fidelity multi-view surface normal maps from five viewpoints, replacing sparse 3D point clouds. It claims this dense pseudo-3D geometric representation enables detection of previously invisible micro-defects (e.g., side-wall scratches and pits), provides experimental evidence that multi-view normals outperform point clouds, and introduces a reconstruction-based baseline that learns unified cross-modal prototypes from RGB and normal streams, surpassing existing SOTA multimodal fusion methods.
Significance. If the performance claims hold after proper validation, the dataset would address a key limitation in IAD by supplying dense micro-scale geometric cues absent from standard 2D RGB or sparse point-cloud modalities, potentially improving detection of subtle manufacturing defects. The unified-prototype baseline offers a concrete starting point for cross-modal learning research on this new data type.
major comments (3)
- [Abstract] The central claim that 'incorporating dense, multi-view pseudo-3D (surface normals) yields significantly better detection performance than using sparse 3D point cloud data' is asserted without any reported metrics (AUROC, F1, etc.), ablation tables, dataset statistics, or controls, leaving the superiority statement unverifiable from the manuscript.
- [Experimental section] Neither the experiments nor the methods provide a quantitative fidelity check (e.g., mean angular error against calibrated reference geometry) or an artifact analysis for the captured normal maps. This check is load-bearing for the claim that the upgraded system produces artifact-free, high-fidelity normals that reveal real industrial micro-defects rather than reconstruction artifacts.
- [Experiments] The assertion that the unified-prototype method 'surpasses existing state-of-the-art multimodal fusion methods' lacks details on baseline re-implementations, hyper-parameter controls, statistical significance tests, or cross-validation splits, preventing assessment of whether the gains stem from the new modality or from implementation differences.
minor comments (1)
- [Abstract] The term 'pseudo-3D' is used for surface normals without a clear definition or comparison to true 3D geometry in the text.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where revisions will be made to improve clarity and verifiability.
Point-by-point responses
-
Referee: [Abstract] The central claim that 'incorporating dense, multi-view pseudo-3D (surface normals) yields significantly better detection performance than using sparse 3D point cloud data' is asserted without any reported metrics (AUROC, F1, etc.), ablation tables, dataset statistics, or controls, leaving the superiority statement unverifiable from the manuscript.
Authors: The full experimental section contains the supporting AUROC/F1 metrics, ablation tables, dataset statistics, and controls that underpin the abstract claim. The abstract is intentionally concise as a summary. To make the superiority statement directly verifiable at the abstract level, we will revise it to include brief quantitative highlights (e.g., specific AUROC improvements over point-cloud baselines). revision: yes
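For reference, the AUROC that the rebuttal promises to surface is threshold-free and reduces to a ranking statistic. A minimal self-contained version (generic metric code, not the authors'):

```python
import numpy as np

def auroc(scores_normal, scores_anomalous):
    """AUROC as the probability that a random anomalous sample scores
    higher than a random normal one (ties count half), i.e. the
    normalized Mann-Whitney U statistic."""
    s = np.concatenate([scores_normal, scores_anomalous])
    order = s.argsort()
    ranks = np.empty(len(s))
    ranks[order] = np.arange(1, len(s) + 1)
    # average ranks within tied groups
    for v in np.unique(s):
        tied = s == v
        ranks[tied] = ranks[tied].mean()
    n0, n1 = len(scores_normal), len(scores_anomalous)
    u = ranks[n0:].sum() - n1 * (n1 + 1) / 2
    return u / (n0 * n1)

# Three normal and two anomalous scores; 5 of the 6 normal/anomalous
# pairs are ordered correctly, so AUROC = 5/6.
print(auroc(np.array([0.1, 0.2, 0.3]), np.array([0.25, 0.9])))  # 0.8333...
```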
-
Referee: [Experimental section] Neither the experiments nor the methods provide a quantitative fidelity check (e.g., mean angular error against calibrated reference geometry) or an artifact analysis for the captured normal maps. This check is load-bearing for the claim that the upgraded system produces artifact-free, high-fidelity normals that reveal real industrial micro-defects rather than reconstruction artifacts.
Authors: We agree this quantitative validation would strengthen the fidelity claims. The current manuscript supports normal-map quality via qualitative inspection and downstream anomaly-detection gains. In revision we will add a dedicated fidelity subsection reporting mean angular error against calibrated reference geometry plus artifact analysis. revision: yes
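The fidelity metric the authors commit to, mean angular error against calibrated reference geometry, has a simple definition. A minimal sketch, assuming unit-normal maps of shape (H, W, 3):

```python
import numpy as np

def mean_angular_error_deg(n_est, n_ref, eps=1e-8):
    """Mean per-pixel angle (degrees) between estimated and reference
    normal maps of shape (H, W, 3); both are re-normalized first."""
    n_est = n_est / (np.linalg.norm(n_est, axis=-1, keepdims=True) + eps)
    n_ref = n_ref / (np.linalg.norm(n_ref, axis=-1, keepdims=True) + eps)
    cos = np.clip((n_est * n_ref).sum(axis=-1), -1.0, 1.0)
    return np.degrees(np.arccos(cos)).mean()

# Sanity check: a map tilted 5 degrees everywhere against a flat
# reference should give a mean angular error of 5 degrees.
flat = np.zeros((4, 4, 3)); flat[..., 2] = 1.0
t = np.radians(5.0)
tilted = np.zeros((4, 4, 3))
tilted[..., 1] = np.sin(t); tilted[..., 2] = np.cos(t)
print(mean_angular_error_deg(tilted, flat))  # ≈ 5.0
```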
-
Referee: [Experiments] The assertion that the unified-prototype method 'surpasses existing state-of-the-art multimodal fusion methods' lacks details on baseline re-implementations, hyper-parameter controls, statistical significance tests, or cross-validation splits, preventing assessment of whether the gains stem from the new modality or from implementation differences.
Authors: We will expand the experimental section with explicit re-implementation details (including pseudocode or repository links), full hyper-parameter tables, statistical significance tests (e.g., paired t-tests with p-values), and clarification of the train/validation/test splits used. This will allow readers to attribute performance differences to the modality rather than implementation variance. revision: yes
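One concrete way to run the promised paired significance check is shown below on synthetic per-category AUROCs. Note the swap: this is a nonparametric sign-flip permutation test rather than the paired t-test the authors name, chosen here to stay library-free; all numbers are made up.

```python
import numpy as np

def paired_signflip_pvalue(a, b, n_perm=10000, seed=0):
    """Two-sided p-value for mean(a - b) != 0 under random sign flips
    of the paired differences (a nonparametric paired test)."""
    rng = np.random.default_rng(seed)
    d = np.asarray(a) - np.asarray(b)
    observed = abs(d.mean())
    signs = rng.choice([-1.0, 1.0], size=(n_perm, len(d)))
    null = np.abs((signs * d).mean(axis=1))
    # +1 correction keeps the p-value strictly positive
    return (1 + (null >= observed).sum()) / (n_perm + 1)

# Synthetic per-category AUROCs: method A vs. baseline B on 10 categories.
a = np.array([0.97, 0.95, 0.96, 0.98, 0.94, 0.97, 0.96, 0.95, 0.99, 0.96])
b = np.array([0.93, 0.94, 0.92, 0.95, 0.93, 0.94, 0.93, 0.92, 0.96, 0.94])
p = paired_signflip_pvalue(a, b)
print(p)
```

Because A beats B on every category here, the observed mean difference is matched or exceeded only by near-unanimous sign patterns, so the p-value lands well below 0.01.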
Circularity Check
No circularity: empirical dataset and benchmark paper with no derivations
full rationale
The paper is a dataset contribution introducing Real-IAD-MVN with multi-view normal maps, accompanied by empirical benchmarks comparing detection performance against point clouds and multimodal baselines. No equations, parameter fittings, uniqueness theorems, or derivation chains are present in the abstract or described content. All claims rest on reported experimental metrics from the new data splits, which are independent of any self-referential constructs or self-citations for core premises. This is a standard, self-contained empirical work; no load-bearing claim reduces to an assumption built into its own inputs.
Axiom & Free-Parameter Ledger
Reference graph
Works this paper leans on
-
[1] Paul Bergmann, Sindy Löwe, Michael Fauser, David Sattlegger, and Carsten Steger. Improving unsupervised defect segmentation by applying structural similarity to autoencoders. arXiv preprint arXiv:1807.02011, 2018.
-
[2] Paul Bergmann, Michael Fauser, David Sattlegger, and Carsten Steger. MVTec AD – a comprehensive real-world dataset for unsupervised anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9592–9600, 2019.
-
[3] Paul Bergmann, Xin Jin, David Sattlegger, and Carsten Steger. The MVTec 3D-AD dataset for unsupervised 3D anomaly detection and localization. arXiv preprint arXiv:2112.09045, 2021.
-
[4] Thomas Defard, Aleksandr Setkov, Angelique Loesch, and Romaric Audigier. PaDiM: a patch distribution modeling framework for anomaly detection and localization. In International Conference on Pattern Recognition, pages 475–489. Springer, 2021.
-
[5] Bin-Bin Gao. Learning to detect multi-class anomalies with just one normal image prompt. In European Conference on Computer Vision, pages 454–470. Springer, 2024.
-
[6] Denis Gudovskiy et al. CFlow-AD: Real-time unsupervised anomaly detection with compact neural flow. IEEE Transactions on Artificial Intelligence, 2022.
-
[7] Jia Guo, Shuai Lu, Weihang Zhang, Fang Chen, Hongen Liao, and Huiqi Li. Dinomaly: The less is more philosophy in multi-class unsupervised anomaly detection. arXiv preprint arXiv:2405.14325, 2024.
-
[8] Jaemin Lee, Dong-ju Lee, Jihun Lee, Hyunjin Song, Jong-heon Kim, Sung-eui Park, and Chan-hyung Choi. INP-Former: A new paradigm for training-free anomaly detection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 562–572, 2024.
-
[9] Chun-Liang Li, Kihyuk Sohn, Jinsung Yoon, and Tomas Pfister. CutPaste: Self-supervised learning for anomaly detection and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 9664–9674, 2021.
-
[10] Hong Li, Houyuan Chen, Chongjie Ye, Zhaoxi Chen, Bohan Li, Shaocong Xu, Xianda Guo, Xuhui Liu, Yikai Wang, Baochang Zhang, et al. Light of normals: Unified feature representation for universal photometric stereo. arXiv preprint arXiv:2506.18882, 2025.
-
[11] Jiaqi Liu, Guoyang Xie, Ruitao Chen, Xinpeng Li, Jinbao Wang, Yong Liu, Chengjie Wang, and Feng Zheng. Real3D-AD: A dataset of point cloud anomaly detection. Advances in Neural Information Processing Systems, 36, 2024.
-
[12] Zhikang Liu, Yiming Zhou, Yuansheng Xu, and Zilei Wang. SimpleNet: A simple network for image anomaly detection and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20402–20411, 2023.
-
[13] Wei Luo, Yunkang Cao, Haiming Yao, Xiaotian Zhang, Jianan Lou, Yuqi Cheng, Weiming Shen, and Wenyong Yu. Exploring intrinsic normal prototypes within a single image for universal anomaly detection. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 9974–9983, 2025.
-
[14] Karsten Roth, Latha Pemula, Joaquin Zepeda, Bernhard Schölkopf, Thomas Brox, and Peter Gehler. Towards total recall in industrial anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 14318–14328, 2022.
-
[15] Thomas Schlegl, Philipp Seeböck, Sebastian M Waldstein, Ursula Schmidt-Erfurth, and Georg Langs. Unsupervised anomaly detection with generative adversarial networks to guide marker discovery. In International Conference on Information Processing in Medical Imaging, pages 146–157. Springer, 2017.
-
[16] Chengjie Wang, Wenbing Zhu, Bin-Bin Gao, Zhenye Gan, Jiangning Zhang, Zhihao Gu, Shuguang Qian, Mingang Chen, and Lizhuang Ma. Real-IAD: A real-world multi-view dataset for benchmarking versatile industrial anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 22883–22892, 2024.
-
[17] Chengjie Wang, Haokun Zhu, Jinlong Peng, Yue Wang, Ran Yi, Yunsheng Wu, Lizhuang Ma, and Jiangning Zhang. M3DM-NR: RGB-3D noisy-resistant industrial anomaly detection via multimodal denoising. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025.
-
[18] Yue Wang, Jinlong Peng, Jiangning Zhang, Ran Yi, Yabiao Wang, and Chengjie Wang. Multimodal industrial anomaly detection via hybrid fusion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 8032–8041, 2023.
-
[19] Robert J Woodham. Photometric method for determining surface orientation from multiple images. Optical Engineering, 19(1):139–144, 1980.
-
[20] Zhiyuan You, Lei Cui, Yujun Shen, Kai Yang, Xin Lu, Yu Zheng, and Xinyi Le. A unified model for multi-class anomaly detection. Advances in Neural Information Processing Systems, 35:4571–4584, 2022.
-
[21] Vitjan Zavrtanik, Matej Kristan, and Danijel Skočaj. DRAEM – a discriminatively trained reconstruction embedding for surface anomaly detection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 8330–8339, 2021.
-
[22] Wenbing Zhu, Lidong Wang, Ziqing Zhou, Chengjie Wang, Yurui Pan, Ruoyi Zhang, Zhuhao Chen, Linjie Cheng, Bin-Bin Gao, Jiangning Zhang, et al. Real-IAD D3: A real-world 2D/pseudo-3D/3D dataset for industrial anomaly detection. In Proceedings of the Computer Vision and Pattern Recognition Conference, pages 15214–15223, 2025.
-
[23] Zongyi Zou, Qiang Qiu, and Weiming Shen. Spot-the-difference: A novel benchmark for image anomaly detection in industrial inspection. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 2608–2616, 2022.
-
Appendix A fragment ("Generative Potential of Real-IAD-MVN"): "To showcase the richness of the learned representation in our new dataset and to ..."
-
[24] Appendix A fragment (diffusion-based latent generation): "We adopt a conditional denoising diffusion probabilistic model (DDPM) to generate a latent representation of the geometry. A forward process gradually adds Gaussian noise to a ground-truth latent vector, and a Denoising UNet is trained to reverse this process. The reverse process is conditioned on the fused features fro..."
-
[25] Appendix A fragment (adversarial dense generation): "The denoised latent representation from the UNet is passed to a final Decoder network, which generates the high-resolution, single-..." Figure A1 caption: qualitative comparison of anomaly segmentation from the generative module's output on the Real-IAD D³ dataset; the right panel (+Pseudo-3D) shows that generating geometry from normal ...
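The diffusion fragment above follows the standard DDPM recipe, in which the forward process has the closed form x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps and the denoising network is trained to predict eps. A minimal sketch of that generic forward process (standard DDPM math, not the paper's conditional model):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear noise schedule beta_1..beta_T and cumulative alpha-bar.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)

def q_sample(x0, t):
    """Sample x_t ~ q(x_t | x_0) in closed form; the denoising
    network's training target is the injected noise eps."""
    eps = rng.normal(size=x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return xt, eps

x0 = rng.normal(size=(2000,))     # a toy ground-truth latent vector
xt_early, _ = q_sample(x0, t=10)
xt_late, _ = q_sample(x0, t=T - 1)

# Early steps stay close to x0; late steps are nearly pure noise.
print(np.corrcoef(x0, xt_early)[0, 1], np.corrcoef(x0, xt_late)[0, 1])
```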