| Article ID: | iaor20083360 |
| Country: | United Kingdom |
| Volume: | 14 |
| Issue: | 6 |
| Start Page Number: | 509 |
| End Page Number: | 520 |
| Publication Date: | Nov 2007 |
| Journal: | International Transactions in Operational Research |
| Authors: | Ohtsubo Yoshio |
| Keywords: | programming: dynamic |
We consider Markov decision processes with a target set, where the criterion function is the expectation of a minimum function. We formulate the problem as an infinite-horizon problem with a recurrent class. We show that, under some conditions, the optimal value function is the unique solution to an optimality equation and that a stationary optimal policy exists. We also give a policy improvement method.
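
To make the idea of policy improvement with a target set concrete, the following is a minimal sketch of generic policy iteration on a small finite Markov decision process. It is not the method of the paper: the abstract does not state the optimality equation for the minimum criterion, so the sketch swaps in the standard expected total cost accumulated until the target set is reached. The toy model (states 0 to 3, actions `slow` and `fast`, the tables `P` and `C`) and all function names are hypothetical and chosen only for illustration.

```python
# Hypothetical toy model: state 3 is the target set (absorbing, value 0).
# P[s][a] lists (next_state, probability); C[s][a] is the one-step cost.
# Assumption: expected total cost until the target is hit, NOT the paper's
# minimum criterion.
TARGET = {3}
P = {
    0: {"slow": [(1, 1.0)], "fast": [(2, 0.5), (0, 0.5)]},
    1: {"slow": [(2, 1.0)], "fast": [(3, 0.5), (1, 0.5)]},
    2: {"slow": [(3, 1.0)], "fast": [(3, 0.5), (2, 0.5)]},
}
C = {
    0: {"slow": 1.0, "fast": 0.5},
    1: {"slow": 1.0, "fast": 2.0},
    2: {"slow": 2.0, "fast": 0.5},
}


def q_value(s, a, v):
    """One-step cost of action a in state s plus expected value of the successor."""
    return C[s][a] + sum(p * v[t] for t, p in P[s][a])


def evaluate(policy, tol=1e-12):
    """Iterative policy evaluation; target states keep value 0."""
    v = {s: 0.0 for s in list(P) + list(TARGET)}
    while True:
        delta = 0.0
        for s in P:
            new = q_value(s, policy[s], v)
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            return v


def policy_iteration(policy):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    while True:
        v = evaluate(policy)
        improved = dict(policy)
        for s in P:
            best = min(P[s], key=lambda a: q_value(s, a, v))
            # Switch only on a strict improvement, so ties cannot cause cycling.
            if q_value(s, best, v) < q_value(s, policy[s], v) - 1e-9:
                improved[s] = best
        if improved == policy:
            return policy, v
        policy = improved


policy, values = policy_iteration({s: "slow" for s in P})
print(policy, values)
```

Starting from the all-`slow` policy, the loop stops once no state can strictly lower its expected cost by switching actions, and the resulting policy is stationary: it depends only on the current state. Every policy in this toy model reaches the target with probability one, which is what keeps the evaluation step finite.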