pith. machine review for the scientific record.

arxiv: 1710.11248 · v2 · submitted 2017-10-30 · 💻 cs.LG

Recognition: unknown

Learning Robust Rewards with Adversarial Inverse Reinforcement Learning

Authors on Pith: no claims yet
classification 💻 cs.LG
keywords: learning, reinforcement, reward, inverse, airl, adversarial, dynamics, engineering
original abstract

Reinforcement learning provides a powerful and general framework for decision making and control, but its application in practice is often hindered by the need for extensive feature and reward engineering. Deep reinforcement learning methods can remove the need for explicit engineering of policy or value features, but still require a manually specified reward function. Inverse reinforcement learning holds the promise of automatic reward acquisition, but has proven exceptionally difficult to apply to large, high-dimensional problems with unknown dynamics. In this work, we propose adversarial inverse reinforcement learning (AIRL), a practical and scalable inverse reinforcement learning algorithm based on an adversarial reward learning formulation. We demonstrate that AIRL is able to recover reward functions that are robust to changes in dynamics, enabling us to learn policies even under significant variation in the environment seen during training. Our experiments show that AIRL greatly outperforms prior methods in these transfer settings.
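The abstract names the adversarial formulation without spelling it out. As a reading aid, the discriminator structure from the AIRL paper is sketched below in LaTeX (g_\theta, h_\phi, and f_{\theta,\phi} follow the paper's notation; consult the paper for the full derivation):

    D_{\theta,\phi}(s, a, s') = \frac{\exp f_{\theta,\phi}(s, a, s')}{\exp f_{\theta,\phi}(s, a, s') + \pi(a \mid s)},
    \qquad f_{\theta,\phi}(s, a, s') = g_\theta(s) + \gamma\, h_\phi(s') - h_\phi(s)

The discriminator is trained to separate expert transitions from policy samples, while the policy is trained against \log D_{\theta,\phi} - \log(1 - D_{\theta,\phi}) as a surrogate reward. At optimality, the state-only term g_\theta recovers a reward disentangled from the dynamics, which is the source of the transfer robustness the abstract claims, and h_\phi absorbs a potential-based shaping term.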

This paper has not been read by Pith yet.

discussion (0)


Forward citations

Cited by 5 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Learning When to Stop: Selective Imitation Learning Under Arbitrary Dynamics Shift

    cs.LG · 2026-05 · unverdicted · novelty 7.0

    SeqRejectron builds a stopping rule from a small set of validator policies to achieve horizon-free sample-complexity guarantees for selective imitation learning under arbitrary train-test dynamics shifts.

  2. Recovering Hidden Reward in Diffusion-Based Policies

    cs.RO · 2026-05 · unverdicted · novelty 7.0

    EnergyFlow recovers the gradient of the expert's soft Q-function from the score of a conservative energy field in diffusion policies, enabling reward extraction without adversarial training (the max-ent identity behind this claim is sketched after this list).

  3. VoxPoser: Composable 3D Value Maps for Robotic Manipulation with Language Models

    cs.RO · 2023-07 · unverdicted · novelty 7.0

    VoxPoser uses LLMs to compose 3D value maps via VLM interaction for model-based synthesis of robust robot trajectories on open-set language-specified manipulation tasks.

  4. Recovering Hidden Reward in Diffusion-Based Policies

    cs.RO · 2026-05 · unverdicted · novelty 6.0

    EnergyFlow shows that denoising score matching on diffusion policies recovers the gradient of the expert's soft Q-function under maximum-entropy optimality, enabling non-adversarial reward extraction and improved poli...

  5. Built Environment Reasoning from Remote Sensing Imagery Using Large Vision--Language Models

    cs.CL · 2026-05 · unverdicted · novelty 3.0

    Large vision-language models applied to multi-scale remote sensing imagery can generate recommendations on built environment design, constructability, land use, and risks for smart city decision-making.
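The EnergyFlow entries above (items 2 and 4) hinge on a standard maximum-entropy identity; a minimal sketch in LaTeX, assuming an optimal max-ent policy with entropy temperature \alpha (the symbols are illustrative, not taken from that paper):

    \pi^*(a \mid s) = \exp\!\big( (Q_{\mathrm{soft}}(s, a) - V_{\mathrm{soft}}(s)) / \alpha \big)
    \quad\Longrightarrow\quad \nabla_a \log \pi^*(a \mid s) = \tfrac{1}{\alpha}\, \nabla_a Q_{\mathrm{soft}}(s, a)

Since V_{\mathrm{soft}}(s) does not depend on the action, the action-score of the optimal policy equals the soft Q-function's action-gradient up to the temperature scale. Denoising score matching on expert actions estimates exactly this score in the small-noise limit, which is why a diffusion policy's learned score can stand in for adversarial reward recovery.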