| Article ID: | iaor20072029 |
| Country: | United States |
| Volume: | 28 |
| Issue: | 4 |
| Start Page Number: | 752 |
| End Page Number: | 776 |
| Publication Date: | Nov 2003 |
| Journal: | Mathematics of Operations Research |
| Authors: | Cavazos-Cadena Rolando, Montes-De-Oca Raúl |
| Keywords: | programming: dynamic |
This work concerns discrete-time Markov decision chains with finite state space and bounded costs. The controller has constant risk sensitivity λ, and the performance of a control policy is measured by the corresponding risk-sensitive average cost criterion. Assuming that the optimality equation has a solution, it is shown that the value iteration scheme can be implemented to obtain, in a finite number of steps, (1) an approximation to the optimal λ-sensitive average cost with an error less than a given tolerance, and (2) a stationary policy whose performance index is arbitrarily close to the optimal value. The argument used to establish these results is based on a modification of the original model, which is an extension of a transformation introduced by Schweitzer to analyze the risk-neutral case.
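To make the scheme described above concrete, here is a minimal Python sketch of a risk-sensitive value iteration for a finite MDP. This is an illustration under standard textbook conventions, not the paper's exact algorithm: the multiplicative Bellman operator is written in log form, the normalization step and the span-seminorm stopping rule are assumed choices, and all names (`P`, `c`, `lam`) are hypothetical.

```python
import numpy as np

def risk_sensitive_value_iteration(P, c, lam, tol=1e-6, max_iter=10_000):
    """Illustrative value iteration for the risk-sensitive average cost criterion.

    P   : array of shape (A, S, S); P[a, x, y] = transition probability
    c   : array of shape (S, A); c[x, a] = bounded one-stage cost
    lam : constant risk-sensitivity parameter (lam != 0)

    Sketch only: operator, normalization, and stopping rule are
    assumptions for illustration, not taken from the paper.
    """
    A, S, _ = P.shape
    h = np.zeros(S)  # relative value function
    for _ in range(max_iter):
        # Risk-sensitive Bellman operator in log form:
        # Q[x, a] = c[x, a] + (1/lam) * log sum_y P[a, x, y] * exp(lam * h[y])
        Q = c + (1.0 / lam) * np.log(np.einsum('axy,y->xa', P, np.exp(lam * h)))
        h_new = Q.min(axis=1)
        diff = h_new - h
        # Span-seminorm stopping rule: when the increments flatten out,
        # their midpoint estimates the optimal lam-sensitive average cost.
        if diff.max() - diff.min() < tol:
            g = 0.5 * (diff.max() + diff.min())  # average-cost estimate
            policy = Q.argmin(axis=1)            # greedy stationary policy
            return g, h_new - h_new.min(), policy
        h = h_new - h_new.min()  # normalize to keep iterates bounded
    raise RuntimeError("value iteration did not converge within max_iter")
```

The normalization `h - h.min()` keeps the iterates bounded (important here, since the exponential utility can overflow numerically), and the span-based stopping rule mirrors the standard risk-neutral termination criterion; the abstract's claim is precisely that, when the optimality equation has a solution, such a finite-step scheme yields both the cost approximation and a near-optimal stationary policy.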