Policy iteration type algorithms for recurrent state Markov decision processes

Policy iteration type algorithms for recurrent state Markov decision processes

0.00 Avg rating0 Votes
Article ID: iaor2005711
Country: United Kingdom
Volume: 31
Issue: 14
Start Page Number: 2333
End Page Number: 2347
Publication Date: Dec 2004
Journal: Computers and Operations Research
Authors:
Abstract:

We introduce and analyze several new policy iteration type algorithms for average cost Markov decision processes (MDPs). We limit attention to “recurrent state” processes where there exists a state which is recurrent under all stationary policies, and our analysis applies to finite-state problems with compact constraint sets, continuous transition probability functions, and lower-semicontinuous functions. The analysis makes use of an underlying relationship between recurrent state MDPs and the so-called stochastic shortest path problems of Bertsekas and Tsitsiklis. After extending this relationship, we establish the convergence of the new policy iteration type algorithms either to optimality or to within ϵ > 0 of the optimal average cost.

Reviews

Required fields are marked *. Your email address will not be published.