Refining action segmentation with hierarchical video representations
2 Pith papers cite this work. Polarity classification is still indexing.
Verdicts: UNVERDICTED 2
citing papers explorer
- M2R2: MultiModal Robotic Representation for Temporal Action Segmentation
  M2R2 proposes a multimodal robotic representation for temporal action segmentation that combines proprioceptive and exteroceptive sensors with a novel training strategy enabling feature reuse across models, achieving new state-of-the-art results on three robotic datasets.
- Boundary-Centric Active Learning for Temporal Action Segmentation
  B-ACT improves label efficiency in temporal action segmentation by selecting only boundary frames for annotation via a two-stage uncertainty-driven process that fuses neighborhood uncertainty, class ambiguity, and temporal dynamics.