Agent Island is a new multiagent game environment that functions as a dynamic benchmark resistant to saturation and contamination, with Bayesian ranking showing OpenAI GPT-5.5 as the strongest performer among 49 models across 999 games.
Survivor: Three principles of economics lessons as taught by a reality television show
1 Pith paper cite this work. Polarity classification is still indexing.
1
Pith paper citing it
fields
cs.AI 1years
2026 1verdicts
UNVERDICTED 1representative citing papers
citing papers explorer
-
Agent Island: A Saturation- and Contamination-Resistant Benchmark from Multiagent Games
Agent Island is a new multiagent game environment that functions as a dynamic benchmark resistant to saturation and contamination, with Bayesian ranking showing OpenAI GPT-5.5 as the strongest performer among 49 models across 999 games.