AI agents in supply chain simulations outperform humans but exhibit decision instability, which GRPO post-training reduces.
Leonard Boussioux, Andrew Chen, Ming Fan, and Apurva Jain
2 Pith papers cite this work. Polarity classification is still indexing.