arxiv: 2010.09776 · v2 · pith:EUJIQQOSnew · submitted 2020-10-19 · 💻 cs.MA · cs.AI · cs.GT · cs.LG · cs.SY · eess.SY

SMARTS: Scalable Multi-Agent Reinforcement Learning Training School for Autonomous Driving

classification 💻 cs.MA · cs.AI · cs.GT · cs.LG · cs.SY · eess.SY
keywords multi-agent · smarts · diverse · driving · autonomous · learning · research · training

Multi-agent interaction is a fundamental aspect of autonomous driving in the real world. Despite more than a decade of research and development, the problem of how to competently interact with diverse road users in diverse scenarios remains largely unsolved. Learning methods have much to offer towards solving this problem. But they require a realistic multi-agent simulator that generates diverse and competent driving interactions. To meet this need, we develop a dedicated simulation platform called SMARTS (Scalable Multi-Agent RL Training School). SMARTS supports the training, accumulation, and use of diverse behavior models of road users. These are in turn used to create increasingly more realistic and diverse interactions that enable deeper and broader research on multi-agent interaction. In this paper, we describe the design goals of SMARTS, explain its basic architecture and its key features, and illustrate its use through concrete multi-agent experiments on interactive scenarios. We open-source the SMARTS platform and the associated benchmark tasks and evaluation metrics to encourage and empower research on multi-agent learning for autonomous driving. Our code is available at https://github.com/huawei-noah/SMARTS.

Forward citations

Cited by 7 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Taming the Curses of Multiagency in Robust Markov Games with Large State Space through Linear Function Approximation

    cs.LG 2026-05 unverdicted novelty 8.0

    The work gives the first algorithms for general robust Markov games with linear function approximation whose sample complexity breaks the curse of multiagency for large state spaces in both generative and online settings.

  2. ScenarioControl: Vision-Language Controllable Vectorized Latent Scenario Generation

    cs.CV 2026-04 unverdicted novelty 7.0

    ScenarioControl introduces the first vision-language controllable generator for realistic vectorized 3D driving scenarios with temporal consistency across actor views.

  3. Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

    cs.CL 2025-11 unverdicted novelty 7.0

    Evo-Memory is a new benchmark for self-evolving memory in LLM agents across task streams, with baseline ExpRAG and proposed ReMem method that integrates reasoning, actions, and memory updates for continual improvement.

  4. Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory

    cs.CL 2025-11 unverdicted novelty 6.0

    Evo-Memory is a new streaming benchmark and evaluation framework for self-evolving memory in LLM agents, unifying over ten memory modules and introducing the ReMem pipeline for continual improvement on multi-turn and ...

  5. FAST: A Framework for Aligned Sampling and Training in Parallel Reinforcement Learning for Autonomous Driving

    cs.LG 2026-06 unverdicted novelty 5.0

    FAST uses Dynamic Parallel Sampling Alignment via virtual continuation and Scaled Mask-Padding Optimization to remove straggler bottlenecks in parallel RL, delivering 1.78x wall-clock speedup while preserving unbiasedness.

  6. RAD-2: Scaling Reinforcement Learning in a Generator-Discriminator Framework

    cs.CV 2026-04 unverdicted novelty 5.0

    RAD-2 uses a diffusion generator and RL discriminator to cut collision rates by 56% in closed-loop autonomous driving planning.

  7. CHARMS: A Cognitive Hierarchical Agent for Reasoning and Motion Stylization in Autonomous Driving

    cs.RO 2025-04 unverdicted novelty 5.0

    CHARMS applies Level-k game theory and Poisson cognitive hierarchy theory to autonomous driving agents via a two-stage RL-then-SFT pipeline for human-like decisions and realistic scenario generation.