Article ID: | iaor19962233 |
Country: | United States |
Volume: | 86 |
Issue: | 1 |
Start Page Number: | 1 |
End Page Number: | 15 |
Publication Date: | Jan 1995 |
Journal: | Journal of Optimization Theory and Applications |
Authors: | Kebir Y., Bouakiz M. |
The Markov decision process is studied under the criterion of maximizing the probability that the total discounted reward exceeds a target level. The authors focus on the dynamic programming equations of the model. They give various properties of the optimal return operator and, for the infinite planning-horizon model, characterize the optimal value function as a maximal fixed point of this operator. Various turnpike results relating the finite- and infinite-horizon models are also given.
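
The abstract does not state the dynamic programming equations. As a rough illustration only, the sketch below uses one common formulation of a target-level criterion, in which the state is augmented with the remaining target x and the target is rescaled by the discount factor after each stage. The two-state MDP, its numbers, and the names `P`, `R`, `BETA`, and `value` are all hypothetical and are not taken from the paper. The strict inequality at the terminal stage is also an assumption.

```python
# Hypothetical two-state, two-action MDP (illustrative numbers only).
# P[s][a] lists (next_state, probability); R[s][a] is the one-stage reward.
P = {
    0: {0: [(0, 1.0)], 1: [(0, 0.5), (1, 0.5)]},
    1: {0: [(1, 1.0)], 1: [(0, 1.0)]},
}
R = {
    0: {0: 1.0, 1: 0.0},
    1: {0: 4.0, 1: 0.0},
}
BETA = 0.5  # discount factor (assumed)


def value(n, s, x):
    """Maximal probability that the n-stage discounted reward from s exceeds x."""
    if n == 0:
        # No stages left: the accumulated reward is 0, which exceeds x only if x < 0.
        return 1.0 if x < 0 else 0.0
    # After earning R[s][a], the remaining target is (x - R[s][a]) / BETA,
    # measured in the units of the next stage.
    return max(
        sum(p * value(n - 1, s2, (x - R[s][a]) / BETA) for s2, p in P[s][a])
        for a in P[s]
    )


print(value(2, 0, 1.2))  # low target: the safe action suffices
print(value(2, 0, 1.8))  # high target: only the risky action can reach it
```

In this toy instance the best first action depends on the target: the safe action in state 0 yields at most 1.5 over two stages, so for a target of 1.8 only the risky transition to state 1 gives a positive success probability. This target dependence is why the value function in such models is defined on state-target pairs rather than on states alone.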