Article ID: | iaor20083360 |
Country: | United Kingdom |
Volume: | 14 |
Issue: | 6 |
Start Page Number: | 509 |
End Page Number: | 520 |
Publication Date: | Nov 2007 |
Journal: | International Transactions in Operational Research |
Authors: | Ohtsubo Yoshio |
Keywords: | programming: dynamic |
We consider Markov decision processes with a target set, where the criterion function is the expectation of a minimum function. We formulate the problem as an infinite-horizon case with a recurrent class. We show that, under some conditions, the optimal value function is the unique solution to an optimality equation and that a stationary optimal policy exists. We also give a policy improvement method.