Can AI perceive physical danger and intervene?

2025 · arXiv 2509.21651

3 Pith papers cite this work. Polarity classification is still indexing.

representative citing papers

RoboJailBench: Benchmarking Adversarial Attacks and Defenses in Embodied Robotic Agents

cs.CR · 2026-05-19 · unverdicted · novelty 7.0

RoboJailBench creates a taxonomy-based benchmark, intent-contrast datasets, and evaluation framework for jailbreak attacks and defenses in embodied robotic AI systems.

Probing Collision Grounding in Vision-Language Models for Safe Human-Robot Collaboration

cs.CV · 2026-05-29 · unverdicted · novelty 6.0

TouchSafeBench evaluates VLMs on collision grounding, finding best Macro-F1 below 50% and that explicit depth does not yield reliable robot-body contact inference.

SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models

cs.AI · 2026-04-21 · unverdicted · novelty 6.0

SafetyALFRED shows multimodal LLMs recognize kitchen hazards accurately in QA tests but achieve low success rates when required to mitigate those hazards through embodied planning.

citing papers explorer

Showing 3 of 3 citing papers.

RoboJailBench: Benchmarking Adversarial Attacks and Defenses in Embodied Robotic Agents
cs.CR · 2026-05-19 · unverdicted · none · ref 14
Probing Collision Grounding in Vision-Language Models for Safe Human-Robot Collaboration
cs.CV · 2026-05-29 · unverdicted · none · ref 7
SafetyALFRED: Evaluating Safety-Conscious Planning of Multimodal Large Language Models
cs.AI · 2026-04-21 · unverdicted · none · ref 36
