Optimistic critics can empower small actors

Dhruv Sreenivas; Olya Mastikhina; Pablo Samuel Castro

arxiv: 2506.01016 · v3 · pith:UQOYGOA2new · submitted 2025-06-01 · 💻 cs.LG · stat.ML



keywords: actors, actor-critic, analyses, asymmetric, critic, critics, further, methods


Actor-critic methods have been central to many of the recent advances in deep reinforcement learning. The most common approach is to use symmetric architectures, whereby both actor and critic have the same network topology and number of parameters. However, recent works have argued for the advantages of asymmetric setups, specifically with the use of smaller actors. We perform broad empirical investigations and analyses to better understand the implications of this and find that, in general, smaller actors result in performance degradation and overfit critics. Our analyses suggest poor data collection, due to value underestimation, as one of the main causes for this behavior, and further highlight the crucial role the critic can play in alleviating this pathology. We explore techniques to mitigate the observed value underestimation, which enables further research in asymmetric actor-critic methods.
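To make the symmetric versus asymmetric distinction concrete, the sketch below counts parameters for fully connected actor and critic networks. It is only an illustration: the observation and action dimensions, the layer widths, and the helper name `mlp_param_count` are assumptions chosen for the example and are not taken from the paper.

```python
def mlp_param_count(in_dim, hidden_widths, out_dim):
    """Count weights and biases of a fully connected network."""
    sizes = [in_dim, *hidden_widths, out_dim]
    # Each layer contributes a (fan_in x fan_out) weight matrix plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

# Hypothetical problem sizes (assumed for illustration, not from the paper).
OBS_DIM, ACT_DIM = 17, 6

# The critic scores (observation, action) pairs and outputs a single value.
critic = mlp_param_count(OBS_DIM + ACT_DIM, [256, 256], 1)

# Symmetric setup: the actor uses the same hidden widths as the critic.
symmetric_actor = mlp_param_count(OBS_DIM, [256, 256], ACT_DIM)

# Asymmetric setup: the actor uses narrower hidden layers than the critic.
small_actor = mlp_param_count(OBS_DIM, [64, 64], ACT_DIM)

print(f"critic:          {critic} parameters")
print(f"symmetric actor: {symmetric_actor} parameters")
print(f"small actor:     {small_actor} parameters")
```

With these assumed widths the small actor has roughly a twelfth of the parameters of the symmetric one, while the critic is unchanged. That is the kind of setup the abstract calls asymmetric.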


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Augmenting Game AI with Deep Reinforcement Learning
cs.AI · 2026-06 · unverdicted · novelty 4.0

Proposes a requirements-based framework for RL-augmented game AI, discusses deployment practicalities, and identifies research bottlenecks for industry adoption.