Advancing multi-agent traffic simulation via R1-style reinforcement fine-tuning

2025 · arXiv 2509.23993

4 Pith papers cite this work. Polarity classification is still indexing.


representative citing papers

Long-term Traffic Simulation via Structured Autoregressive Modeling

cs.AI · 2026-06-30 · unverdicted · novelty 6.0

RosettaSim adapts frozen LLMs via structured autoregressive modeling of scene topology and agent states, reaching state-of-the-art short- and long-term traffic simulation results on WOSAC. It is paired with an RTE evaluation that correlates better with human-like fidelity.

Beyond Self-Play: Hierarchical Reasoning for Continuous Motion in Closed-Loop Traffic Simulation

cs.RO · 2026-05-09 · unverdicted · novelty 6.0

A hierarchical Stackelberg MARL plus continuous-motion architecture with hybrid co-training produces smoother and safer closed-loop traffic behavior than standard self-play methods.

Bridging Local Observation and Global Simulation in Closed-Loop Traffic Modeling

cs.RO · 2026-06-30 · unverdicted · novelty 5.0

CRAFT reduces collisions by 31.2% and traffic violations by 33.2% in closed-loop traffic simulation by discovering context-induced failures in what-if rollouts and using a contextual preference evaluator to reweight autoregressive decoding toward globally coherent behaviors.

Decoupled Intelligence: A Multi-Agent LLM Framework for Controllable Traffic Scenario Generation in SUMO

cs.MA · 2026-05-26 · unverdicted · novelty 5.0

A multi-agent LLM system for SUMO decouples simulation tasks across Planner, Builder, Demand, Runner, and Analyst agents with MCP-based orchestration, yielding higher success rates than single-agent baselines in ablation studies.

citing papers explorer

Showing 2 of 2 citing papers after filters.

Beyond Self-Play: Hierarchical Reasoning for Continuous Motion in Closed-Loop Traffic Simulation
cs.RO · 2026-05-09 · unverdicted · none · ref 7
Bridging Local Observation and Global Simulation in Closed-Loop Traffic Modeling
cs.RO · 2026-06-30 · unverdicted · none · ref 21
