Article ID: | iaor20082753 |
Country: | United States |
Volume: | 53 |
Issue: | 2 |
Start Page Number: | 308 |
End Page Number: | 322 |
Publication Date: | Feb 2007 |
Journal: | Management Science |
Authors: | Tsitsiklis John N., Simester Duncan, Mannor Shie, Sun Peng |
We consider a finite-state, finite-action, infinite-horizon, discounted reward Markov decision process and study the bias and variance in the value function estimates that result from empirical estimates of the model parameters. We provide closed-form approximations for the bias and variance, which can then be used to derive confidence intervals around the value function estimates. We illustrate and validate our findings using a large database describing the transaction and mailing histories for customers of a mail-order catalog firm.
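The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's closed-form approximation; it is a Monte Carlo stand-in that assumes a fixed policy (so the decision process reduces to a Markov reward process), known rewards, and the same number of observed transitions from every state. All function names and the toy parameters are hypothetical, chosen only for illustration.

```python
import random
from statistics import mean, pvariance


def evaluate(P, r, gamma, tol=1e-10):
    """Solve V = r + gamma * P V for a fixed policy by successive approximation."""
    n = len(r)
    V = [0.0] * n
    while True:
        V_new = [r[s] + gamma * sum(P[s][t] * V[t] for t in range(n))
                 for s in range(n)]
        if max(abs(a - b) for a, b in zip(V, V_new)) < tol:
            return V_new
        V = V_new


def empirical_P(P, n_samples, rng):
    """Estimate each row of P from n_samples simulated transitions (assumed sampling scheme)."""
    n = len(P)
    P_hat = []
    for s in range(n):
        draws = rng.choices(range(n), weights=P[s], k=n_samples)
        P_hat.append([draws.count(t) / n_samples for t in range(n)])
    return P_hat


def bias_and_variance(P, r, gamma, n_samples, n_reps, seed=0):
    """Monte Carlo estimate of the bias and variance of the plug-in value function."""
    rng = random.Random(seed)
    V_true = evaluate(P, r, gamma)
    estimates = [evaluate(empirical_P(P, n_samples, rng), r, gamma)
                 for _ in range(n_reps)]
    n = len(r)
    # Bias: average plug-in estimate minus the value under the true model.
    bias = [mean(V[s] for V in estimates) - V_true[s] for s in range(n)]
    # Variance: spread of the plug-in estimates across replications.
    var = [pvariance([V[s] for V in estimates]) for s in range(n)]
    return V_true, bias, var


if __name__ == "__main__":
    # Toy two-state chain; these numbers are illustrative, not from the paper.
    P = [[0.9, 0.1], [0.2, 0.8]]
    r = [1.0, 0.0]
    V_true, bias, var = bias_and_variance(P, r, gamma=0.9, n_samples=50, n_reps=200)
    print("true value:", V_true)
    print("bias:", bias)
    print("variance:", var)
```

Because the value function is a nonlinear function of the transition probabilities, the plug-in estimate is generally biased even though the estimated transition probabilities themselves are unbiased. Rerunning the sketch with different values of `n_samples` is one way to see how both quantities depend on the amount of data.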