A note on the convergence of policy iteration in Markov decision processes with compact action spaces

Article ID: iaor2004695
Country: United States
Volume: 28
Issue: 1
Start Page Number: 194
End Page Number: 200
Publication Date: Feb 2003
Journal: Mathematics of Operations Research
Authors:
Keywords: programming: dynamic
Abstract:

The undiscounted, unichain, finite-state Markov decision process with a compact action space is studied. We provide a counterexample to a result of Hordijk and Puterman and give an alternate proof of the convergence of policy iteration under the condition that there exists a state that is recurrent under every stationary policy. The analysis essentially uses a two-term matrix representation for the relative value vectors generated by the policy iteration procedure.
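For readers unfamiliar with the average-reward setting, the procedure analyzed above can be illustrated on a toy problem. The sketch below is not the paper's compact-action construction: it is a minimal, finite-action, two-state unichain example (all transition probabilities and rewards are invented for illustration) in which each evaluation step solves the relative value equations g + h(s) = r(s) + Σ_{s'} P(s'|s) h(s') with the normalization h(0) = 0, and each improvement step is greedy against the relative values h. In this example state 0 is recurrent under every stationary policy, matching the recurrence condition used in the convergence proof.

```python
# Average-reward policy iteration on a tiny 2-state unichain MDP.
# Illustrative sketch only: the MDP data below is made up for the example,
# and the action sets are finite rather than compact.

# P[s][a] = [P(0|s,a), P(1|s,a)],  R[s][a] = expected one-step reward.
P = {
    0: {"stay": [0.9, 0.1], "go": [0.2, 0.8]},
    1: {"stay": [0.1, 0.9], "go": [0.7, 0.3]},
}
R = {
    0: {"stay": 1.0, "go": 0.0},
    1: {"stay": 2.0, "go": 0.5},
}

def evaluate(policy):
    """Solve g + h(s) = R(s, pi(s)) + sum_s' P(s'|s, pi(s)) h(s'),
    with h(0) = 0 (state 0 taken as the reference state)."""
    p0, p1 = P[0][policy[0]], P[1][policy[1]]
    r0, r1 = R[0][policy[0]], R[1][policy[1]]
    # With h(0) = 0 the two equations reduce to:
    #   g = r0 + p0[1] * h1   and   g + h1 = r1 + p1[1] * h1.
    h1 = (r1 - r0) / (p0[1] + 1.0 - p1[1])
    g = r0 + p0[1] * h1
    return g, [0.0, h1]

def improve(h):
    """Greedy improvement against the relative value vector h
    (sufficient in the unichain case; multichain models need a
    gain-based tie-break first)."""
    return {
        s: max(P[s], key=lambda a: R[s][a] + sum(P[s][a][t] * h[t] for t in (0, 1)))
        for s in (0, 1)
    }

policy = {0: "stay", 1: "stay"}
while True:
    gain, h = evaluate(policy)
    new_policy = improve(h)
    if new_policy == policy:
        break
    policy = new_policy

print(policy, round(gain, 4))  # -> {0: 'go', 1: 'stay'} 1.7778
```

The normalization h(0) = 0 is what makes the relative value vector well defined at each iteration; the paper's two-term matrix representation concerns the structure of exactly these vectors as the iteration proceeds.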
