Article ID: | iaor19931232 |
Country: | Netherlands |
Volume: | 11 |
Issue: | 5 |
Start Page Number: | 267 |
End Page Number: | 272 |
Publication Date: | Jun 1992 |
Journal: | Operations Research Letters |
Authors: | Haviv Moshe, Puterman Martin I. |
This paper provides a differential equation that relates the expected total discounted reward of a reward process to the expected total undiscounted reward of a process that terminates at a negative binomial stopping time. The solution of this equation provides the basis for unbiased estimators of the expected total discounted reward and its derivative with respect to the discount rate. The authors compare this estimator with other estimators and discuss when it might be more efficient. When rewards are positive, they show that the estimator is monotone in the sampled variate.
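
The link between discounting and random stopping can be illustrated with a minimal sketch. It covers only the classical geometric special case (a negative binomial stopping time with a single "success"), in discrete time, and is not the estimator derived in the paper. The reward stream `reward`, the discount factor `lam`, and all function names are assumptions made for illustration. If T is independent of the rewards with P(T > t) = lam^t, then the undiscounted sum of rewards over periods 0, ..., T-1 has expectation equal to the discounted sum.

```python
import random


def reward(t):
    # Hypothetical positive reward stream, chosen only for illustration.
    return 1.0 + 0.5 * (t % 2)


def discounted_value(lam, horizon=2000):
    # Truncated discounted sum: sum over t of lam**t * reward(t).
    return sum(lam ** t * reward(t) for t in range(horizon))


def sample_stopping_time(lam, rng):
    # Geometric stopping time on {1, 2, ...} with P(T > t) = lam**t.
    t = 1
    while rng.random() < lam:
        t += 1
    return t


def undiscounted_until(stop):
    # Undiscounted total reward over periods 0, ..., stop - 1.
    return sum(reward(t) for t in range(stop))


def estimate(lam, n_samples, seed=0):
    # Monte Carlo average of undiscounted totals at sampled stopping times.
    rng = random.Random(seed)
    total = sum(
        undiscounted_until(sample_stopping_time(lam, rng))
        for _ in range(n_samples)
    )
    return total / n_samples


if __name__ == "__main__":
    lam = 0.9
    print("discounted sum:   ", discounted_value(lam))
    print("stopping estimate:", estimate(lam, 20000))
```

Because the illustrative rewards are positive, `undiscounted_until` is nondecreasing in the sampled stopping time, which mirrors the monotonicity property mentioned in the abstract for this simple case.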