Article ID: iaor20001037
Country: Germany
Volume: 49
Issue: 1
Start Page Number: 87
End Page Number: 96
Publication Date: Jan 1999
Journal: Mathematical Methods of Operations Research (Heidelberg)
Authors: Guo X.
In this paper, we consider nonstationary Markov decision processes with an average variance criterion, a countable state space, finite action spaces, and bounded one-step rewards. Using the optimality equations established in the paper, we translate the average variance criterion into a new average expected cost criterion. We then prove that there exists a Markov policy that is optimal for the original average expected reward criterion and, among all policies optimal for that criterion, minimizes the average variance.
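For orientation, here is a minimal sketch of the two criteria involved, written in standard stationary-reward notation; the symbols J, V, and \Pi^* are notational assumptions for this sketch, not taken from the paper, and the paper's nonstationary setting would use time-dependent rewards r_t in place of r:

\[
J(\pi, x) = \liminf_{n \to \infty} \frac{1}{n}\, E_x^{\pi}\!\left[ \sum_{t=0}^{n-1} r(x_t, a_t) \right],
\qquad
V(\pi, x) = \limsup_{n \to \infty} \frac{1}{n}\, E_x^{\pi}\!\left[ \sum_{t=0}^{n-1} \bigl( r(x_t, a_t) - J(\pi, x) \bigr)^2 \right].
\]

The problem is then \(\min_{\pi \in \Pi^*} V(\pi, x)\), where \(\Pi^*\) denotes the set of policies attaining the optimal average expected reward \(\sup_{\pi} J(\pi, x)\). Roughly, once the average reward is pinned at its optimal value, the inner squared deviation behaves as a one-step cost, which suggests how the variance criterion can be recast as an average expected cost criterion of the kind the abstract describes.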