RobotEQ is a new benchmark dataset and evaluation suite showing that current embodied AI models fall short on active social-norm compliance, especially spatial grounding, though RAG with external knowledge helps.
Title resolution pending
2 Pith papers cite this work. Polarity classification is still indexing.
2
Pith papers citing it
years
2026 2representative citing papers
FORGE benchmark shows domain-specific knowledge, not visual grounding, is the main bottleneck for MLLMs in manufacturing, with SFT on a 3B model delivering up to 90.8% relative accuracy improvement on held-out scenarios.
citing papers explorer
-
RobotEQ: Transitioning from Passive Intelligence to Active Intelligence in Embodied AI
RobotEQ is a new benchmark dataset and evaluation suite showing that current embodied AI models fall short on active social-norm compliance, especially spatial grounding, though RAG with external knowledge helps.
-
FORGE: Fine-grained Multimodal Evaluation for Manufacturing Scenarios
FORGE benchmark shows domain-specific knowledge, not visual grounding, is the main bottleneck for MLLMs in manufacturing, with SFT on a 3B model delivering up to 90.8% relative accuracy improvement on held-out scenarios.