The value iteration algorithm in risk-sensitive average Markov decision chains with finite state space

Article ID: iaor20072029
Country: United States
Volume: 28
Issue: 4
Start Page Number: 752
End Page Number: 776
Publication Date: Nov 2003
Journal: Mathematics of Operations Research
Authors: ,
Keywords: programming: dynamic
Abstract:

This work concerns discrete-time Markov decision chains with finite state space and bounded costs. The controller has constant risk sensitivity λ, and the performance of a control policy is measured by the corresponding risk-sensitive average cost criterion. Assuming that the optimality equation has a solution, it is shown that the value iteration scheme can be implemented to obtain, in a finite number of steps, (1) an approximation to the optimal λ-sensitive average cost with an error less than a given tolerance, and (2) a stationary policy whose performance index is arbitrarily close to the optimal value. The argument used to establish these results is based on a modification of the original model, which is an extension of a transformation introduced by Schweitzer to analyze the risk-neutral case.
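As a rough illustration of the kind of scheme the abstract describes, the sketch below runs a plain relative value iteration on the risk-sensitive recursion V_{n+1}(x) = min_a [ C(x,a) + (1/λ) log Σ_y p(y|x,a) exp(λ V_n(y)) ] and stops once the span of the successive differences falls below a tolerance. It is not the paper's method: the Schweitzer-type modification of the model is not implemented. The function name, the argument layout, the stopping rule and the two-state example data are all assumptions made for illustration. The lower and upper values it returns bracket the optimal average cost whenever the optimality equation has a solution, because the update operator is monotone and commutes with adding constants.

```python
import math


def risk_sensitive_value_iteration(costs, probs, lam, tol=1e-6, max_iter=10_000):
    """Hypothetical sketch of relative value iteration for a risk-sensitive average cost.

    costs[x][a]    -- bounded one-step cost in state x under action a
    probs[x][a][y] -- transition probability from x to y under action a
    lam            -- constant risk sensitivity, assumed nonzero
    Returns (lower, upper, policy), where lower and upper are the smallest and
    largest one-step differences and policy[x] is a minimizing action index.
    """
    n = len(costs)
    v = [0.0] * n
    for _ in range(max_iter):
        new_v, policy = [], []
        for x in range(n):
            best_val, best_a = math.inf, None
            for a in range(len(costs[x])):
                # Certainty equivalent of the next-state value under exponential utility.
                expo = sum(p * math.exp(lam * v[y]) for y, p in enumerate(probs[x][a]))
                val = costs[x][a] + math.log(expo) / lam
                if val < best_val:
                    best_val, best_a = val, a
            new_v.append(best_val)
            policy.append(best_a)
        diffs = [nv - ov for nv, ov in zip(new_v, v)]
        lower, upper = min(diffs), max(diffs)
        # Subtract a reference value so the iterates stay bounded.
        v = [nv - new_v[0] for nv in new_v]
        if upper - lower < tol:
            return lower, upper, policy
    raise RuntimeError("span of differences did not fall below tol within max_iter")


# Made-up two-state, two-action example with strictly positive transitions.
example_costs = [[1.0, 2.0], [3.0, 0.5]]
example_probs = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.3, 0.7]],
]
lower, upper, policy = risk_sensitive_value_iteration(example_costs, example_probs, lam=0.5)
```

The example uses strictly positive transition probabilities so that the plain recursion settles down; for general chains the differences need not converge, which is the difficulty the model modification mentioned in the abstract is meant to address.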
