pith. machine review for the scientific record.

arxiv: 2411.05174 · v2 · submitted 2024-11-07 · 💻 cs.LG · cs.AI · stat.ML

Recognition: unknown

Bayesian Inverse Transition Learning: Learning Dynamics From Near-Optimal Trajectories

Authors on Pith: no claims yet
classification 💻 cs.LG · cs.AI · stat.ML
keywords learning · expert · near-optimal · trajectories · transition · bayesian · dynamics · inform
Original abstract

We consider the problem of estimating the transition dynamics $T^*$ from near-optimal expert trajectories in the context of offline model-based reinforcement learning. We develop a novel constraint-based method, Inverse Transition Learning, that treats the limited coverage of the expert trajectories as a \emph{feature}: we use the fact that the expert is near-optimal to inform our estimate of $T^*$. We integrate our constraints into a Bayesian approach. Across both synthetic environments and real healthcare scenarios like Intensive Care Unit (ICU) patient management in hypotension, we demonstrate not only significant improvements in decision-making, but that our posterior can inform when transfer will be successful.
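The core idea in the abstract — that expert near-optimality constrains which transition models are plausible — can be illustrated with a toy sketch. This is not the paper's algorithm: it uses a hypothetical 3-state, 2-action tabular MDP, a Dirichlet posterior over transitions fit from expert-visited state-action pairs, and simple rejection filtering (keep only posterior samples under which the expert's actions are ε-optimal) as a stand-in for the paper's constraint-based Bayesian method. All names, sizes, and constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy tabular MDP (illustrative only): S states, A actions.
S, A = 3, 2
GAMMA = 0.9
R = np.array([[0.0, 0.0],
              [0.0, 0.0],
              [1.0, 1.0]])  # reward depends only on the state here

# "True" dynamics T*[s, a] is a distribution over next states.
T_true = rng.dirichlet(np.ones(S), size=(S, A))

# Expert policy, assumed near-optimal (here: action 0 in every state).
expert = np.zeros(S, dtype=int)

def q_values(T):
    """Q-values under dynamics T via value iteration."""
    Q = np.zeros((S, A))
    for _ in range(500):
        V = Q.max(axis=1)          # greedy state values
        Q = R + GAMMA * T @ V      # (S, A, S) @ (S,) -> (S, A)
    return Q

# Transition counts from limited-coverage expert data: only the
# expert's own action is ever observed in each state.
counts = np.ones((S, A, S))  # Dirichlet(1) prior pseudo-counts
for _ in range(200):
    s = rng.integers(S)
    a = expert[s]
    s_next = rng.choice(S, p=T_true[s, a])
    counts[s, a, s_next] += 1

# Constrained posterior: sample dynamics from the Dirichlet posterior,
# but keep only samples under which the expert is eps-optimal. For the
# unobserved actions the data say nothing, so the near-optimality
# constraint is what rules out implausible dynamics there.
EPS = 0.05
kept = []
for _ in range(300):
    T = np.array([[rng.dirichlet(counts[s, a]) for a in range(A)]
                  for s in range(S)])
    Q = q_values(T)
    if all(Q[s].max() - Q[s, expert[s]] <= EPS for s in range(S)):
        kept.append(T)

print(f"{len(kept)} of 300 posterior samples satisfy the constraint")
```

The rejection step is a crude proxy: the paper integrates the constraints into the Bayesian posterior itself rather than filtering samples after the fact, but the filtered set shows the same qualitative effect — expert near-optimality narrows the posterior over dynamics precisely where trajectory coverage is missing.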

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Quantifying Potential Observation Missingness in Inverse Reinforcement Learning

    cs.LG 2026-05 unverdicted novelty 7.0

    A practical algorithm quantifies potential missing observations in IRL by computing minimal perturbations to recorded data that render expert actions optimal.