| Article ID: | iaor20072034 |
| Country: | United States |
| Volume: | 29 |
| Issue: | 2 |
| Start Page Number: | 339 |
| End Page Number: | 352 |
| Publication Date: | May 2004 |
| Journal: | Mathematics of Operations Research |
| Authors: | Xiaobo Z., Jianyong L. |
| Keywords: | programming: dynamic |
In this paper we investigate average reward semi-Markov decision processes with a general multichain structure using a data-transformation method. Solving the transformed discrete-time average reward Markov decision processes yields useful information about the original average reward semi-Markov decision processes. If the original semi-Markov decision processes satisfy appropriate conditions, then stationary policies that are optimal in the transformed discrete-time models are also optimal in the original semi-Markov decision processes.
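
The abstract does not spell out the transformation it uses. As a rough illustration only, the sketch below applies the classical data transformation usually attributed to Schweitzer, which may differ from the paper's construction. It assumes a finite model given by embedded transition probabilities `P[i][a][j]`, expected one-step rewards `r[i][a]`, and expected sojourn times `tau[i][a]`; the function name `transform_smdp` and these array layouts are hypothetical choices made for this example.

```python
def transform_smdp(P, r, tau, eta=None):
    """Map a finite SMDP to a discrete-time MDP (classical data transformation).

    Assumed inputs (illustrative layout, not taken from the paper):
      P[i][a][j] : embedded-chain transition probability from i to j under a
      r[i][a]    : expected reward earned until the next decision epoch
      tau[i][a]  : expected sojourn time (must be > 0)

    Transformed model:
      r_t[i][a]    = r[i][a] / tau[i][a]
      P_t[i][a][j] = delta_ij + eta * (P[i][a][j] - delta_ij) / tau[i][a]
    with 0 < eta < tau[i][a] / (1 - P[i][a][i]) whenever P[i][a][i] < 1.
    """
    n = len(P)
    # Upper bounds on eta that keep every transformed probability non-negative.
    bounds = [tau[i][a] / (1.0 - P[i][a][i])
              for i in range(n)
              for a in range(len(P[i]))
              if P[i][a][i] < 1.0]
    if eta is None:
        # Arbitrary safe default: half of the tightest bound.
        eta = 0.5 * min(bounds) if bounds else 1.0
    elif eta <= 0 or (bounds and eta >= min(bounds)):
        raise ValueError("eta is outside the admissible range")

    P_t, r_t = [], []
    for i in range(n):
        P_t.append([])
        r_t.append([])
        for a in range(len(P[i])):
            scale = eta / tau[i][a]
            row = [scale * P[i][a][j] for j in range(n)]
            row[i] += 1.0 - scale  # extra self-loop mass from the delta term
            P_t[i].append(row)
            r_t[i].append(r[i][a] / tau[i][a])  # reward per unit time
    return P_t, r_t, eta


# Two-state, one-action toy example (hypothetical numbers).
P = [[[0.0, 1.0]], [[1.0, 0.0]]]
r = [[2.0], [8.0]]
tau = [[2.0], [4.0]]
P_t, r_t, eta = transform_smdp(P, r, tau)
```

In this toy unichain example the long-run reward per unit time of the original process and the average reward per step of the transformed chain coincide (both equal 5/3), which is the kind of correspondence the abstract refers to. Whether optimal policies carry over in the multichain case depends on the conditions established in the paper.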