
arxiv: 1205.6659 · v1 · pith:Q5YTG43Onew · submitted 2012-05-30 · 🧮 math.ST · stat.TH

Q-learning with censored data

classification 🧮 math.ST stat.TH
keywords algorithm · flexible · number · stages · censored · data · methodology · multistage

We develop methodology for a multistage decision problem with a flexible number of stages in which the rewards are survival times that are subject to censoring. We present a novel Q-learning algorithm that is adjusted for censored data and allows a flexible number of stages. We provide finite-sample bounds on the generalization error of the policy learned by the algorithm, and show that when the optimal Q-function belongs to the approximation space, the expected survival time for policies obtained by the algorithm converges to that of the optimal policy. We simulate a multistage clinical trial with a flexible number of stages and apply the proposed censored-Q-learning algorithm to find individualized treatment regimens. The methodology presented in this paper has implications for the design of personalized medicine trials in cancer and in other life-threatening diseases.
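
As a rough illustration of the kind of procedure the abstract describes, the following is a minimal sketch of backward-induction Q-learning in which censored trajectories are handled by inverse-probability-of-censoring weights. It is not the paper's algorithm: the weighting scheme, the linear feature map, the binary action set, the `active` mask used to represent a varying number of stages, and all function names are assumptions made for this example. The probability of remaining uncensored (`cens_surv`) is taken as given, for instance from a Kaplan-Meier estimate of the censoring distribution.

```python
import numpy as np

def features(s, a):
    # Hypothetical linear feature map: intercept, state, action, interaction.
    s = np.asarray(s, dtype=float)
    a = np.asarray(a, dtype=float)
    return np.column_stack([np.ones_like(s), s, a, s * a])

def fit_weighted_q(s, a, y, w):
    # Weighted least squares for Q(s, a) = features(s, a) @ beta.
    phi = features(s, a)
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(phi * sw[:, None], y * sw, rcond=None)
    return beta

def q_value(beta, s, a):
    return features(s, a) @ beta

def censored_q_learning(states, actions, rewards, active, uncensored, cens_surv):
    """Backward induction over stages with censoring weights (illustrative).

    states, actions, rewards: arrays of shape (n, T); rewards[:, t] is the
        survival time accrued in stage t. Entries for unreached stages
        should be filled with 0.
    active: bool array (n, T), True if the patient reached stage t.
    uncensored: bool array (n,), True if the trajectory was fully observed.
    cens_surv: array (n,), assumed estimate of P(not censored) per patient.
    """
    n, T = states.shape
    # Censored trajectories get weight 0; observed ones are up-weighted.
    w = uncensored.astype(float) / cens_surv
    betas = [None] * T
    future = np.zeros(n)  # estimated optimal value from the next stage on
    for t in reversed(range(T)):
        mask = active[:, t] & (w > 0)
        target = rewards[:, t] + future
        betas[t] = fit_weighted_q(states[mask, t], actions[mask, t],
                                  target[mask], w[mask])
        q0 = q_value(betas[t], states[:, t], np.zeros(n))
        q1 = q_value(betas[t], states[:, t], np.ones(n))
        # Patients who never reach stage t contribute no further reward.
        future = np.where(active[:, t], np.maximum(q0, q1), 0.0)
    return betas

def greedy_action(beta, s):
    # Choose action 1 when its estimated Q-value exceeds that of action 0.
    s = np.atleast_1d(np.asarray(s, dtype=float))
    better = q_value(beta, s, np.ones_like(s)) > q_value(beta, s, np.zeros_like(s))
    return better.astype(int)
```

The returned list holds one coefficient vector per stage, and `greedy_action(betas[t], s)` gives the estimated regimen at stage `t`. Discarding censored trajectories and reweighting the rest is a simple choice made here for brevity; it wastes partial information and may differ from how the paper treats censoring.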

