Q-learning for risk-sensitive control

Article ID: iaor2004367
Country: United States
Volume: 27
Issue: 2
Start Page Number: 294
End Page Number: 311
Publication Date: May 2002
Journal: Mathematics of Operations Research
Authors:
Keywords: risk, control processes
Abstract:

We propose for risk-sensitive control of finite Markov chains a counterpart of the popular Q-learning algorithm for classical Markov decision processes. The algorithm is shown to converge with probability one to the desired solution. The proof technique is an adaptation of the ordinary differential equation (o.d.e.) approach for the analysis of stochastic approximation algorithms, with most of the effort devoted to the analysis of the specific o.d.e.s that arise.
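For context, the sketch below shows one common tabular formulation of risk-sensitive (exponential-cost) Q-learning, in which the iterates are normalised by a fixed reference state-action pair so that the log of that entry tracks the optimal risk-sensitive average cost. It is a hedged illustration only: the specific update rule, the reference pair (i0, a0), the step-size schedule, and the toy MDP are assumptions made for this example and need not match the exact algorithm analysed in the paper.

```python
import numpy as np

def risk_sensitive_q_learning(P, c, n_iters=200_000, ref=(0, 0), seed=0):
    """
    Hedged sketch of a tabular Q-learning update for the risk-sensitive
    (exponential / multiplicative cost) criterion on a finite MDP.

    P : array (S, A, S)   transition probabilities p(j | i, a)
    c : array (S, A)      one-stage costs
    ref : reference state-action pair used to normalise the iterates
          (assumed here, in the spirit of relative-value-iteration-style
          Q-learning).

    The multiplicative Bellman-type relation assumed for this sketch is
        Q(i, a) * Q(i0, a0) = exp(c(i, a)) * E_j[ min_b Q(j, b) ],
    whose fixed point can be scaled so that Q(i0, a0) equals
    exp(optimal risk-sensitive average cost).
    """
    rng = np.random.default_rng(seed)
    S, A, _ = P.shape
    Q = np.ones((S, A))                 # keep iterates positive
    i0, a0 = ref

    for n in range(1, n_iters + 1):
        i = rng.integers(S)             # sample a state-action pair uniformly
        a = rng.integers(A)
        j = rng.choice(S, p=P[i, a])    # simulate the next state
        step = 1.0 / (1.0 + n / 1000.0) # slowly decreasing step size (assumed schedule)
        target = np.exp(c[i, a]) * Q[j].min() / Q[i0, a0]
        Q[i, a] += step * (target - Q[i, a])

    # log of the reference entry estimates the optimal risk-sensitive
    # average cost under the normalisation assumed above
    return Q, np.log(Q[i0, a0])

# toy two-state, two-action example (invented for illustration)
if __name__ == "__main__":
    P = np.array([[[0.9, 0.1], [0.2, 0.8]],
                  [[0.3, 0.7], [0.6, 0.4]]])
    c = np.array([[1.0, 0.5],
                  [2.0, 1.5]])
    Q, rho = risk_sensitive_q_learning(P, c)
    print("estimated optimal risk-sensitive average cost:", rho)
```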
