Article ID: | iaor2005711 |
Country: | United Kingdom |
Volume: | 31 |
Issue: | 14 |
Start Page Number: | 2333 |
End Page Number: | 2347 |
Publication Date: | Dec 2004 |
Journal: | Computers and Operations Research |
Authors: | Patek Stephen D. |
We introduce and analyze several new policy iteration type algorithms for average cost Markov decision processes (MDPs). We restrict attention to “recurrent state” processes, in which some state is recurrent under every stationary policy, and our analysis applies to finite-state problems with compact constraint sets, continuous transition probability functions, and lower-semicontinuous cost functions. The analysis makes use of an underlying relationship between recurrent state MDPs and the so-called stochastic shortest path problems of Bertsekas and Tsitsiklis. After extending this relationship, we establish that the new policy iteration type algorithms converge either to optimality or to within ϵ > 0 of the optimal average cost.
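For context, the sketch below illustrates classical policy iteration for average-cost MDPs under a recurrent-state assumption of the kind described in the abstract: each policy is evaluated by solving for its gain g and bias h with the bias pinned to zero at a reference recurrent state, followed by a greedy improvement step. This is background only, not the new algorithms introduced in the paper; the array shapes, function names, and the reference state index ref are illustrative assumptions.

    import numpy as np

    def evaluate(P, c, policy, ref=0):
        """Gain/bias evaluation of a stationary policy.
        Solves h(i) + g - sum_j P[mu(i)][i, j] h(j) = c(i, mu(i)),
        with h(ref) = 0 at the reference recurrent state.
        P has shape (num_actions, n, n); c has shape (n, num_actions)."""
        n = P.shape[1]
        P_mu = np.array([P[policy[i], i, :] for i in range(n)])
        c_mu = np.array([c[i, policy[i]] for i in range(n)])
        # Unknowns: (h(0), ..., h(n-1), g); the recurrent-state
        # assumption makes this (n+1)x(n+1) system nonsingular.
        A = np.zeros((n + 1, n + 1))
        b = np.zeros(n + 1)
        A[:n, :n] = np.eye(n) - P_mu
        A[:n, n] = 1.0          # coefficient of the gain g
        b[:n] = c_mu
        A[n, ref] = 1.0         # pins h(ref) = 0
        x = np.linalg.solve(A, b)
        return x[n], x[:n]      # gain, bias

    def policy_iteration(P, c, ref=0, max_iter=100):
        """Classical average-cost policy iteration (illustrative sketch)."""
        n, m = c.shape
        policy = np.zeros(n, dtype=int)
        for _ in range(max_iter):
            g, h = evaluate(P, c, policy, ref)
            # Policy improvement: greedy with respect to the bias h.
            Q = c + np.einsum('aij,j->ia', P, h)   # Q[i, a]
            new_policy = Q.argmin(axis=1)
            if np.array_equal(new_policy, policy):
                return policy, g, h
            policy = new_policy
        return policy, g, h

The paper's contribution concerns variants of this scheme (including ϵ-optimal ones) whose convergence is analyzed via a reduction to stochastic shortest path problems; the sketch is only meant to fix the standard gain/bias formulation they build on.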