M3-JEPA: Multimodal alignment via multi-gate MoE based on the joint-embedding predictive architecture
4 Pith papers cite this work. Polarity classification is still indexing.
citing papers by year: 2026 (4)
representative citing papers
A literature survey that categorizes how Mixture-of-Experts architectures address multimodal learning challenges and identifies open research gaps.
citing papers explorer
- SARM2: Multi-Task Stage Aware Reward Modeling for Self Improving Robotic Manipulation
SARM2 presents RM, a multi-task stage-aware reward model achieving 80% lower value-estimation MSE, which when used in SPIRAL boosts manipulation task success from ~50% to near-perfect on several benchmarks.
- DART: A Vision-Language Foundation Model for Comprehensive Rope Condition Monitoring
DART is a cross-modal foundation model that delivers rope damage classification, severity regression, and few-shot recognition from a single frozen representation trained on 4270 images across 14 damage classes.
- BrainFIBRE: A Foundation Model via Information Decomposition for Brain Microstructure
BrainFIBRE presents a foundation model for brain microstructure that applies self-supervised partial information decomposition on NODDI maps to disentangle unique, synergistic, and redundant information and reports state-of-the-art results on multiple prediction tasks.