| Field | Value |
| --- | --- |
| Article ID | iaor2004695 |
| Country | United States |
| Volume | 28 |
| Issue | 1 |
| Start Page Number | 194 |
| End Page Number | 200 |
| Publication Date | Feb 2003 |
| Journal | Mathematics of Operations Research |
| Authors | Golubin A.Y. |
| Keywords | programming: dynamic |
The undiscounted, unichain, finite state Markov decision process with compact action space is studied. We provide a counterexample to a result of Hordijk and Puterman and give an alternate proof of the convergence of policy iteration under the condition that there exists a state that is recurrent under every stationary policy. The analysis essentially uses a two-term matrix representation for the relative value vectors generated by the policy iteration procedure.
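As a loose illustration of the undiscounted (average-reward) policy iteration scheme the abstract refers to, the sketch below evaluates each stationary policy through its gain and relative value vector and then improves it greedily. This is only a minimal sketch for a finite-action unichain MDP, not the paper's construction (the paper treats compact action spaces); the function names, array layout, and tie-breaking rule are assumptions introduced here for illustration.

```python
import numpy as np

def evaluate_policy(P, r, policy, ref_state=0):
    """Average-reward evaluation for a unichain stationary policy:
    solve  h[s] = r[s, policy[s]] - g + sum_s' P[policy[s], s, s'] h[s']
    with the normalization h[ref_state] = 0.  Returns (gain g, relative values h)."""
    n = P.shape[1]
    P_d = P[policy, np.arange(n), :]      # transition matrix under the policy
    r_d = r[np.arange(n), policy]         # one-step rewards under the policy
    # Unknowns: (g, h[0], ..., h[n-1]); last row enforces h[ref_state] = 0.
    A = np.zeros((n + 1, n + 1))
    b = np.zeros(n + 1)
    A[:n, 0] = 1.0                        # coefficient of the gain g
    A[:n, 1:] = np.eye(n) - P_d           # (I - P_d) h
    b[:n] = r_d
    A[n, 1 + ref_state] = 1.0             # normalization row
    x = np.linalg.solve(A, b)
    return x[0], x[1:]

def policy_iteration(P, r, max_iter=100, tol=1e-10):
    """P[a, s, s'] transition probabilities, r[s, a] rewards (finite actions)."""
    n_actions, n_states, _ = P.shape
    policy = np.zeros(n_states, dtype=int)
    for _ in range(max_iter):
        g, h = evaluate_policy(P, r, policy)
        # Improvement step: maximize r[s, a] + sum_s' P[a, s, s'] h[s'] over a.
        q = r.T + P @ h                   # shape (n_actions, n_states)
        # Keep the current action when it already attains the maximum,
        # a standard tie-breaking rule to avoid cycling among equivalent policies.
        greedy = np.argmax(q, axis=0)
        keep = q[policy, np.arange(n_states)] >= q.max(axis=0) - tol
        new_policy = np.where(keep, policy, greedy)
        if np.array_equal(new_policy, policy):
            return policy, g, h
        policy = new_policy
    return policy, g, h
```

For example, with a two-state, two-action MDP given as `P` of shape `(2, 2, 2)` and `r` of shape `(2, 2)`, `policy_iteration(P, r)` returns a stationary policy together with its gain and a relative value vector normalized at state 0.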