| Article ID: | iaor19962233 |
| Country: | United States |
| Volume: | 86 |
| Issue: | 1 |
| Start Page Number: | 1 |
| End Page Number: | 15 |
| Publication Date: | Jan 1995 |
| Journal: | Journal of Optimization Theory and Applications |
| Authors: | Kebir Y., Bouakiz M. |
This paper studies Markov decision processes under the criterion of maximizing the probability that the total discounted reward exceeds a target level. The authors focus on the dynamic programming equations of the model. They establish various properties of the optimal return operator and, for the infinite planning-horizon model, characterize the optimal value function as a maximal fixed point of that operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
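
As a rough illustration of what a dynamic programming equation for a target-level criterion can look like, the sketch below treats the finite-horizon case on a toy problem. It is not taken from the paper. The three-state MDP, the names `REWARD`, `TRANS`, `BETA`, and `target_prob`, and the choice of "at least the target" (rather than strictly exceeding it) are all assumptions made for the example. The idea is to carry the remaining target as part of the state: after collecting reward r, the residual target for the remaining steps becomes (target - r) / beta.

```python
from functools import lru_cache

# Hypothetical toy MDP (not from the paper): state 0 is the start,
# state 1 is an absorbing "good" state, state 2 an absorbing "bad" state.
BETA = 0.5          # discount factor (assumed value)
ACTIONS = (0, 1)    # 0 = "safe", 1 = "risky" (only differ in state 0)

# REWARD[(state, action)] is a deterministic one-step reward.
REWARD = {
    (0, 0): 1.0, (0, 1): 0.0,
    (1, 0): 4.0, (1, 1): 4.0,
    (2, 0): 0.0, (2, 1): 0.0,
}

# TRANS[(state, action)] maps next state -> transition probability.
TRANS = {
    (0, 0): {0: 1.0},
    (0, 1): {1: 0.5, 2: 0.5},
    (1, 0): {1: 1.0}, (1, 1): {1: 1.0},
    (2, 0): {2: 1.0}, (2, 1): {2: 1.0},
}


@lru_cache(maxsize=None)
def target_prob(state, target, horizon):
    """Maximal probability that the discounted reward collected over
    `horizon` steps, starting in `state`, is at least `target`."""
    if horizon == 0:
        # Nothing more can be earned: success iff the target is already met.
        return 1.0 if target <= 0.0 else 0.0
    best = 0.0
    for action in ACTIONS:
        # Residual target seen from the next step, rescaled by the discount.
        residual = (target - REWARD[(state, action)]) / BETA
        value = sum(
            prob * target_prob(nxt, residual, horizon - 1)
            for nxt, prob in TRANS[(state, action)].items()
        )
        best = max(best, value)
    return best


if __name__ == "__main__":
    print(target_prob(0, 1.5, 2))  # 1.0: playing safe twice yields 1.5 for sure
    print(target_prob(0, 2.0, 2))  # 0.5: only the risky action can reach 2.0
    print(target_prob(0, 2.5, 2))  # 0.0: no policy reaches 2.5 in two steps
```

In this toy problem the best action depends on the target: the safe action is optimal for a target of 1.5, while the risky one is optimal for a target of 2.0. This is why the target is tracked alongside the state in the recursion.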