Pretrained vision-language-action models are surprisingly resistant to forgetting in continual learning
4 Pith papers cite this work. Polarity classification is still indexing.
fields: cs.RO (4)
years: 2026 (4)
verdicts: UNVERDICTED (4)
roles: background (1)
polarities: background (1)
citing papers explorer
- Can VLA Models Learn from Real-World Data Continually without Forgetting?
  VLA models exhibit catastrophic forgetting on a new real-world dataset of four sequential manipulation tasks; the paper evaluates experience replay implementation factors as mitigation.
- PHASER: Phase-Aware and Semantic Experience Replay for Vision-Language-Action Models
  PHASER improves average success rate by up to 31% over uniform experience replay on LIBERO continual learning benchmarks for VLA models through phase-centric capacity allocation and semantic interference routing.
- Multisensory Continual Learning: Adapting Pretrained Visuomotor Policies to Force
  MuSe adapts vision-only pretrained visuomotor policies to force-torque sensing via multi-stage fusion, multisensory future prediction, and experience replay, achieving strong contact-rich performance while preserving original task results.
- Preserving Foundational Capabilities in Flow-Matching VLAs through Conservative SFT
  ConSFT is a gradient-scaling fine-tuning objective for flow-matching VLAs that bounds parameter disruption via model-confidence weighting, yielding over 20% better capability retention than vanilla SFT on LIBERO and RoboTwin.