| Article ID: | iaor2005711 |
| Country: | United Kingdom |
| Volume: | 31 |
| Issue: | 14 |
| Start Page Number: | 2333 |
| End Page Number: | 2347 |
| Publication Date: | Dec 2004 |
| Journal: | Computers and Operations Research |
| Authors: | Patek, Stephen D. |
We introduce and analyze several new policy iteration type algorithms for average cost Markov decision processes (MDPs). We limit attention to “recurrent state” processes, in which there exists a state that is recurrent under all stationary policies, and our analysis applies to finite-state problems with compact constraint sets, continuous transition probability functions, and lower-semicontinuous cost functions. The analysis makes use of an underlying relationship between recurrent state MDPs and the so-called stochastic shortest path problems of Bertsekas and Tsitsiklis. After extending this relationship, we establish the convergence of the new policy iteration type algorithms either to optimality or to within ϵ > 0 of the optimal average cost.
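For orientation, the sketch below shows classical (Howard-style) policy iteration for an average cost finite MDP with a recurrent reference state, not the paper's new algorithms. The arrays `P`, `g` and the parameter `ref_state` are illustrative assumptions: `ref_state` plays the role of the state assumed recurrent under every stationary policy, and its bias is pinned to zero during policy evaluation.

```python
import numpy as np

def average_cost_policy_iteration(P, g, ref_state=0, max_iters=100):
    """Illustrative average-cost policy iteration for a finite unichain MDP.

    P         : (n_actions, n_states, n_states) transition probabilities
    g         : (n_states, n_actions) one-stage costs
    ref_state : state assumed recurrent under all stationary policies;
                its bias h(ref_state) is fixed to 0 in policy evaluation.
    Returns (policy, gain, bias).
    """
    n_states, n_actions = g.shape
    policy = np.zeros(n_states, dtype=int)

    for _ in range(max_iters):
        # Policy evaluation: solve h(s) + lam = g(s, mu(s)) + sum_s' P(s'|s, mu(s)) h(s'),
        # together with h(ref_state) = 0, for the unknowns (h, lam).
        P_mu = np.array([P[policy[s], s] for s in range(n_states)])  # (n, n)
        g_mu = np.array([g[s, policy[s]] for s in range(n_states)])  # (n,)
        A = np.zeros((n_states + 1, n_states + 1))
        b = np.zeros(n_states + 1)
        A[:n_states, :n_states] = np.eye(n_states) - P_mu  # coefficients of h
        A[:n_states, n_states] = 1.0                        # coefficient of lam
        b[:n_states] = g_mu
        A[n_states, ref_state] = 1.0                        # pin h(ref_state) = 0
        sol = np.linalg.solve(A, b)
        h, lam = sol[:n_states], sol[n_states]

        # Policy improvement: act greedily with respect to the bias h.
        Q = g + np.einsum('asj,j->sa', P, h)
        new_policy = np.argmin(Q, axis=1)
        if np.array_equal(new_policy, policy):
            return policy, lam, h
        policy = new_policy

    return policy, lam, h
```

Fixing the bias at the recurrent reference state is what makes the evaluation system well posed; it is also the point of contact with the stochastic shortest path viewpoint, where each visit to the reference state terminates an "episode" of the associated shortest path problem.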