In: 2018 IEEE/CVF 18 R
2 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.CV (2)
years: 2026 (2)
verdicts: UNVERDICTED (2)
citing papers explorer
- Learning to Deny: Action Denial in Multimodal Large Language Models
  MLLMs drop from over 85% accuracy on action presence to under 50% on matched action-denial videos, exposing a causal verification gap that causal graph prompts partially close.
- Temporally Consistent Label Interpolation for Robust Surgical Multi-Task Learning under Challenging Conditions
  FAROS uses flow-guided propagation from zero-shot masks and optical flow to create dense temporally consistent labels from sparse keyframes, improving joint multi-task learning across temporal and spatial surgical tasks on GraSP, MISAW, and AutoLaparo.