{"paper":{"title":"Zero-Shot MARL Benchmark in the Cyber-Physical Mobility Lab","license":"http://arxiv.org/licenses/nonexclusive-distrib/1.0/","headline":"A benchmark platform tests MARL policies for vehicles across simulation, digital twin, and physical hardware in zero-shot fashion.","cross_cats":["cs.SY","eess.SY"],"primary_cat":"cs.RO","authors_text":"Bassam Alrifaee, Fynn Belderink, Jianye Xu, Julius Beerwerth, Simon Schäfer","submitted_at":"2026-01-23T09:26:36Z","abstract_excerpt":"We present a reproducible benchmark for evaluating sim-to-real transfer of Multi-Agent Reinforcement Learning (MARL) policies for Connected and Automated Vehicles (CAVs). The platform, based on the Cyber-Physical Mobility Lab (CPM Lab) [1], integrates simulation, a high-fidelity digital twin, and a physical testbed, enabling structured zero-shot evaluation of MARL motion-planning policies. We demonstrate its use by deploying a SigmaRL-trained policy [2] across all three domains, revealing two complementary sources of performance degradation: architectural differences between simulation and har"},"claims":{"count":4,"items":[{"kind":"strongest_claim","text":"We demonstrate its use by deploying a SigmaRL-trained policy across all three domains, revealing two complementary sources of performance degradation: architectural differences between simulation and hardware control stacks, and the sim-to-real gap induced by increasing environmental realism.","source":"verdict.strongest_claim","status":"machine_extracted","claim_id":"C1","attestation":"unclaimed"},{"kind":"weakest_assumption","text":"The assumption that the CPM Lab setup and the chosen SigmaRL policy provide a representative test for general MARL sim-to-real challenges in CAVs.","source":"verdict.weakest_assumption","status":"machine_extracted","claim_id":"C2","attestation":"unclaimed"},{"kind":"one_line_summary","text":"Introduces an open-source benchmark integrating simulation, digital twin, and physical testbed for 
zero-shot evaluation of MARL policies in CAVs, identifying architectural and realism-induced performance degradations.","source":"verdict.one_line_summary","status":"machine_extracted","claim_id":"C3","attestation":"unclaimed"},{"kind":"headline","text":"A benchmark platform tests MARL policies for vehicles across simulation, digital twin, and physical hardware in zero-shot fashion.","source":"verdict.pith_extraction.headline","status":"machine_extracted","claim_id":"C4","attestation":"unclaimed"}],"snapshot_sha256":"840bf4a9dc6315cc086f3b2b83b8722724247b9186e5dec4db63046c1e4dbbb5"},"source":{"id":"2601.16578","kind":"arxiv","version":2},"verdict":{"id":"86f0ec4d-df17-4acd-ab81-5c0b23aaf2a5","model_set":{"reader":"grok-4.3"},"created_at":"2026-05-16T12:20:33.824821Z","strongest_claim":"We demonstrate its use by deploying a SigmaRL-trained policy across all three domains, revealing two complementary sources of performance degradation: architectural differences between simulation and hardware control stacks, and the sim-to-real gap induced by increasing environmental realism.","one_line_summary":"Introduces an open-source benchmark integrating simulation, digital twin, and physical testbed for zero-shot evaluation of MARL policies in CAVs, identifying architectural and realism-induced performance degradations.","pipeline_version":"pith-pipeline@v0.9.0","weakest_assumption":"The assumption that the CPM Lab setup and the chosen SigmaRL policy provide a representative test for general MARL sim-to-real challenges in CAVs.","pith_extraction_headline":"A benchmark platform tests MARL policies for vehicles across simulation, digital twin, and physical hardware in zero-shot 
fashion."},"integrity":{"clean":true,"summary":{"advisory":0,"critical":0,"by_detector":{},"informational":0},"endpoint":"/pith/2601.16578/integrity.json","findings":[],"available":true,"detectors_run":[],"snapshot_sha256":"c28c3603d3b5d939e8dc4c7e95fa8dfce3d595e45f758748cecf8e644a296938"},"references":{"count":0,"sample":[],"resolved_work":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57","internal_anchors":0},"formal_canon":{"evidence_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"author_claims":{"count":0,"strong_count":0,"snapshot_sha256":"258153158e38e3291e3d48162225fcdb2d5a3ed65a07baac614ab91432fd4f57"},"builder_version":"pith-number-builder-2026-05-17-v1"}