arxiv: 2503.17821 · v1 · pith:RVFL337Rnew · submitted 2025-03-22 · 💻 cs.AI

OvercookedV2: Rethinking Overcooked for Zero-Shot Coordination

classification 💻 cs.AI
keywords coordination, agents, overcooked, overcookedv2, algorithms, benchmark, challenges, coordinate

AI agents hold the potential to transform everyday life by helping humans achieve their goals. To do this successfully, agents need to be able to coordinate with novel partners without prior interaction, a setting known as zero-shot coordination (ZSC). Overcooked has become one of the most popular benchmarks for evaluating coordination capabilities of AI agents and learning algorithms. In this work, we investigate the origins of ZSC challenges in Overcooked. We introduce a state augmentation mechanism which mixes states that might be encountered when paired with unknown partners into the training distribution, reducing the out-of-distribution challenge associated with ZSC. We show that independently trained agents under this algorithm coordinate successfully in Overcooked. Our results suggest that ZSC failure can largely be attributed to poor state coverage under self-play rather than more sophisticated coordination challenges. The Overcooked environment is therefore not suitable as a ZSC benchmark. To address these shortcomings, we introduce OvercookedV2, a new version of the benchmark, which includes asymmetric information and stochasticity, facilitating the creation of interesting ZSC scenarios. To validate OvercookedV2, we conduct experiments demonstrating that mere exhaustive state coverage is insufficient to coordinate well. Finally, we use OvercookedV2 to build a new range of coordination challenges, including ones that require test time protocol formation, and we demonstrate the need for new coordination algorithms that can adapt online. We hope that OvercookedV2 will help benchmark the next generation of ZSC algorithms and advance collaboration between AI agents and humans.
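
As a rough illustration of the state augmentation idea described in the abstract, the sketch below mixes externally collected states into the distribution of episode start states. It is not the authors' implementation: the abstract does not say how augmented states are gathered or injected, so the function name `sample_start_state`, the `state_buffer` of previously seen states, and the `mix_prob` parameter are all hypothetical.

```python
import random


def sample_start_state(default_reset, state_buffer, mix_prob, rng):
    """Pick the state an episode starts from.

    Hypothetical sketch: with probability `mix_prob`, start from a state
    drawn from `state_buffer` (states assumed to come from play with other
    partners); otherwise fall back to the environment's normal reset.
    """
    if state_buffer and rng.random() < mix_prob:
        return rng.choice(state_buffer)
    return default_reset()


# Toy usage with strings standing in for environment states.
rng = random.Random(0)
buffer_states = ["partner_state_a", "partner_state_b"]
starts = [
    sample_start_state(lambda: "self_play_start", buffer_states, 0.5, rng)
    for _ in range(1000)
]
mixed_fraction = sum(s != "self_play_start" for s in starts) / len(starts)
```

With `mix_prob` set to 0 the sampler reduces to ordinary self-play resets, so the parameter controls how much of the training distribution comes from the augmented states.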

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. PACT: Proactive Asking for Continual Task Assistance in Human-Robot Collaboration

    cs.RO 2026-05 unverdicted novelty 6.0

    PACT is an ask-or-act framework using reinforcement learning on interaction history to decide when to seek clarification, improving assistance accuracy and a new clarification utility metric over passive baselines in ...