| Article ID: | iaor20102986 |
| Volume: | 37 |
| Issue: | 5 |
| Start Page Number: | 317 |
| End Page Number: | 321 |
| Publication Date: | Sep 2009 |
| Journal: | Operations Research Letters |
| Authors: | Guo Xianping, Song XinYuan, Zhang Junyu |
This paper deals with the bias optimality of multichain models for finite continuous-time Markov decision processes. Based on new performance difference formulas developed here, we prove the convergence of a so-called bias-optimal policy iteration algorithm, which can be used to obtain bias-optimal policies in a finite number of iterations.
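As background for the terms in the abstract, the sketch below evaluates the gain (long-run average reward) and bias of one fixed stationary policy in a finite continuous-time model. It is not the paper's multichain algorithm: it assumes the simpler unichain case, where the policy's generator matrix `Q` has a single stationary distribution `pi`, and it uses the standard identity h = (P* - Q)^{-1} r - P* r with P* having every row equal to `pi`. The function names `solve` and `gain_and_bias` and the two-state example are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination using exact fractions.

    Assumes A is square and nonsingular (raises StopIteration otherwise).
    """
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next(k for k in range(col, n) if M[k][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so the pivot equals 1.
        M[col] = [v / M[col][col] for v in M[col]]
        # Eliminate this column from every other row.
        for k in range(n):
            if k != col and M[k][col] != 0:
                factor = M[k][col]
                M[k] = [a - factor * p for a, p in zip(M[k], M[col])]
    return [M[i][n] for i in range(n)]


def gain_and_bias(Q, r):
    """Return (gain, bias vector) of a fixed policy, unichain case only.

    Q: generator matrix of the policy (rows sum to zero).
    r: reward rate in each state under the policy.
    """
    n = len(Q)
    # Stationary distribution: pi Q = 0 with sum(pi) = 1.
    # Use the first n-1 balance equations plus the normalization row.
    A = [[Q[i][j] for i in range(n)] for j in range(n - 1)] + [[1] * n]
    pi = solve(A, [0] * (n - 1) + [1])
    # Gain: stationary average of the reward rates (same for every state).
    g = sum(p * ri for p, ri in zip(pi, r))
    # Bias: h = (P* - Q)^{-1} r - P* r, where P* has all rows equal to pi,
    # so P* r is the constant vector with every entry equal to g.
    B = [[pi[j] - Q[i][j] for j in range(n)] for i in range(n)]
    x = solve(B, r)
    h = [xi - g for xi in x]
    return g, h


# Hypothetical two-state example: leave state 0 at rate 1, state 1 at rate 2.
g, h = gain_and_bias([[-1, 1], [2, -2]], [3, 0])
print(g, h)
```

Among policies that share the optimal gain, a bias-optimal policy is one whose bias vector is largest in every state. In the multichain setting treated by the paper the gain can differ across states and P* is no longer built from a single distribution, so this sketch would not apply as written.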