pith. machine review for the scientific record.

arxiv: 2506.05405 · v1 · submitted 2025-06-04 · 💻 cs.CV

Recognition: unknown

A VLM-based Method for Visual Anomaly Detection in Robotic Scientific Laboratories

keywords: detection, anomaly, scientific, visual, workflows, approach, effectiveness, experimental

In robotic scientific laboratories, visual anomaly detection is important for the timely identification and resolution of potential faults or deviations, and it has become a key factor in ensuring the stability and safety of experimental processes. To address this need, this paper proposes a VLM-based visual reasoning approach that supports different levels of supervision through four progressively informative prompt configurations. To systematically evaluate its effectiveness, we construct a visual benchmark tailored for process anomaly detection in scientific workflows. Experiments on two representative vision-language models show that detection accuracy improves as more contextual information is provided, confirming the effectiveness and adaptability of the proposed reasoning approach for process anomaly detection in scientific workflows. Furthermore, real-world validations at selected experimental steps confirm that first-person visual observation can effectively identify process-level anomalies. This work provides both a data-driven foundation and an evaluation framework for visual anomaly detection in scientific experiment workflows.
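The abstract does not specify what the four prompt configurations contain, so the following is only a minimal sketch of the general idea of progressively informative prompts. It assumes each configuration adds one more piece of context to a fixed question. The field names (`step`, `expected`, `known_anomalies`), the question wording, and the helper names `build_prompt` and `accuracy` are all hypothetical and are not taken from the paper.

```python
# Hypothetical context fields, ordered from least to most informative.
# The paper's actual four configurations may differ; this is an assumption.
CONTEXT_FIELDS = (
    ("step", "Current experimental step: {step}"),
    ("expected", "Expected normal state: {expected}"),
    ("known_anomalies", "Known anomaly types: {known_anomalies}"),
)

# Hypothetical question text sent to the vision-language model with the image.
BASE_QUESTION = "Does the image show a process anomaly? Answer 'yes' or 'no'."


def build_prompt(level: int, context: dict) -> str:
    """Build prompt configuration `level` (1..4).

    Level 1 is the bare question; each higher level prepends one more
    context line, so level 4 contains all three hypothetical fields.
    """
    if not 1 <= level <= len(CONTEXT_FIELDS) + 1:
        raise ValueError("level must be between 1 and 4")
    lines = [template.format(**context) for _, template in CONTEXT_FIELDS[: level - 1]]
    lines.append(BASE_QUESTION)
    return "\n".join(lines)


def accuracy(predictions: list, labels: list) -> float:
    """Fraction of predictions that match the ground-truth labels."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equal in length")
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


if __name__ == "__main__":
    # Illustrative context only; not data from the benchmark.
    ctx = {
        "step": "pipetting",
        "expected": "tip attached to pipette",
        "known_anomalies": "missing tip, spilled liquid",
    }
    for level in range(1, 5):
        print(f"--- configuration {level} ---")
        print(build_prompt(level, ctx))
```

Under these assumptions, comparing `accuracy` across the four levels for the same set of images would mirror the kind of comparison the abstract describes, where accuracy is reported to improve as more context is provided.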

This paper has not been read by Pith yet.


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios

    cs.CV · 2026-04 · conditional novelty 7.0

    FORGE benchmark shows domain-specific knowledge, not visual grounding, is the main bottleneck for MLLMs in manufacturing, with SFT on a 3B model delivering up to 90.8% relative accuracy improvement on held-out scenarios.