| Article ID: | iaor20082753 |
| Country: | United States |
| Volume: | 53 |
| Issue: | 2 |
| Start Page Number: | 308 |
| End Page Number: | 322 |
| Publication Date: | Feb 2007 |
| Journal: | Management Science |
| Authors: | Tsitsiklis John N., Simester Duncan, Mannor Shie, Sun Peng |
We consider a finite-state, finite-action, infinite-horizon, discounted reward Markov decision process and study the bias and variance in the value function estimates that result from empirical estimates of the model parameters. We provide closed-form approximations for the bias and variance, which can then be used to derive confidence intervals around the value function estimates. We illustrate and validate our findings using a large database describing the transaction and mailing histories for customers of a mail-order catalog firm.
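The bias and variance described in the abstract can be seen in a small simulation. The sketch below is not the paper's closed-form approximation, which the abstract does not state. It is a Monte Carlo illustration under assumptions chosen here: a fixed policy (so only a transition matrix and a reward vector remain), known rewards, and a hypothetical two-state chain. All function names and parameter values are illustrative.

```python
import random

def policy_value(P, r, gamma, tol=1e-10, max_iter=10_000):
    """Evaluate a fixed policy by iterating V <- r + gamma * P V."""
    n = len(r)
    V = [0.0] * n
    for _ in range(max_iter):
        V_new = [r[i] + gamma * sum(P[i][j] * V[j] for j in range(n))
                 for i in range(n)]
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new
        V = V_new
    return V

def estimate_transitions(P_true, n_samples, rng):
    """Empirical transition matrix from n_samples observed transitions per state."""
    n = len(P_true)
    P_hat = []
    for i in range(n):
        counts = [0] * n
        for j in rng.choices(range(n), weights=P_true[i], k=n_samples):
            counts[j] += 1
        P_hat.append([c / n_samples for c in counts])
    return P_hat

def simulate_bias_variance(P_true, r, gamma, n_samples, n_reps, seed=0):
    """Simulated bias and sample variance of the plug-in value estimate."""
    rng = random.Random(seed)
    n = len(r)
    V_true = policy_value(P_true, r, gamma)
    estimates = [
        policy_value(estimate_transitions(P_true, n_samples, rng), r, gamma)
        for _ in range(n_reps)
    ]
    mean = [sum(e[s] for e in estimates) / n_reps for s in range(n)]
    bias = [mean[s] - V_true[s] for s in range(n)]
    var = [sum((e[s] - mean[s]) ** 2 for e in estimates) / (n_reps - 1)
           for s in range(n)]
    return V_true, bias, var

# Hypothetical two-state example (values are assumptions, not from the paper).
P_TRUE = [[0.9, 0.1], [0.2, 0.8]]
REWARD = [1.0, 0.0]
GAMMA = 0.9

V_true, bias, var = simulate_bias_variance(P_TRUE, REWARD, GAMMA,
                                           n_samples=50, n_reps=200)
print("true values:", V_true)
print("simulated bias:", bias)
print("simulated variance:", var)
```

Because the value function is a nonlinear function of the transition probabilities, an unbiased estimate of those probabilities does not in general give an unbiased value estimate. Repeating the simulation with a larger `n_samples` should shrink both quantities.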