| Article ID: | iaor19981802 |
| Volume: | 35 |
| Issue: | 6 |
| Start Page Number: | 2128 |
| End Page Number: | 2136 |
| Publication Date: | Nov 1997 |
| Journal: | SIAM Journal on Control and Optimization |
| Authors: | Deng Sien |
| Keywords: | Markov processes, programming: dynamic |
We prove the existence of stationary Blackwell optimal policies in Markov decision processes with a Borel state space, compact action sets, and transition densities and rewards that are bounded and continuous in the action, satisfying a simultaneous Doeblin-type condition. The proof is based on a compactification of the space of randomized stationary policies in a weak–strong topology, on the continuity of the Laurent coefficients of the discounted rewards in this topology, and on a lexicographic policy improvement. Until now, similar results had been obtained only for models with a denumerable state space or with a Borel state space and finite action sets.
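For readers unfamiliar with the terminology, the following sketch records the standard definitions the abstract relies on; the notation (discount factor \(\beta\), interest rate \(\rho\), values \(v_\beta^\pi\), coefficients \(y_n^\pi\)) is assumed here for illustration and is not taken from the paper itself.

```latex
% Standard background (notation assumed, not quoted from the paper).
% Laurent expansion of the expected discounted reward of a stationary
% policy \pi, with interest rate \rho = (1-\beta)/\beta, valid for
% discount factors \beta close to 1:
\[
  v_\beta^\pi(x) \;=\; \sum_{n=-1}^{\infty} \rho^{\,n}\, y_n^\pi(x),
  \qquad \rho = \frac{1-\beta}{\beta},
\]
% where y_{-1}^\pi is the long-run average reward (gain) and
% y_0^\pi is the bias of the policy \pi.
%
% A stationary policy \pi^* is Blackwell optimal if there exists
% \beta_0 < 1 such that
\[
  v_\beta^{\pi^*}(x) \;\ge\; v_\beta^{\pi}(x)
  \quad \text{for all policies } \pi,\ \text{all states } x,
  \ \text{and all } \beta \in (\beta_0, 1).
\]
% The lexicographic policy improvement mentioned in the abstract
% compares the coefficient sequences (y_{-1}^\pi, y_0^\pi, y_1^\pi, \dots)
% in lexicographic order.
```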