Article ID: | iaor1995336 |
Country: | United States |
Volume: | 42 |
Issue: | 4 |
Start Page Number: | 739 |
End Page Number: | 749 |
Publication Date: | Jul 1994 |
Journal: | Operations Research |
Authors: | White Chelsea C., Eldeib Hany K. |
Keywords: | programming: dynamic |
The authors present new numerical algorithms and bounds for the infinite horizon, discrete stage, finite state and action Markov decision process with imprecise transition probabilities. They assume that the transition probability mass vector for each state and action is described by a finite number of linear inequalities. This model of imprecision appears to be well suited for describing statistically determined confidence limits and/or natural language statements of likelihood. The numerical procedures for calculating an optimal max-min strategy are based on successive approximations, reward revision, and modified policy iteration. The bounds that are determined are at least as tight as currently available bounds for the case where the transition probabilities are precise.
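As an illustration of the successive-approximations idea only, the following Python sketch runs a max-min value iteration on a small finite model. It is not the authors' algorithm: it assumes a discounted criterion with a hypothetical discount factor `beta`, and it restricts the linear inequalities to simple interval bounds `lo <= p <= hi` (with the probabilities summing to one) chosen independently for each state and action, so that the inner minimization can be solved greedily instead of with a general linear program. All function names, parameters, and numbers are illustrative assumptions.

```python
def worst_case_expectation(v, lo, hi):
    """Minimize sum(p[j] * v[j]) subject to lo <= p <= hi and sum(p) == 1.

    Assumes the interval set is non-empty: sum(lo) <= 1 <= sum(hi).
    """
    p = list(lo)                      # start every entry at its lower bound
    slack = 1.0 - sum(lo)             # probability mass still to be placed
    # Put the remaining mass on the lowest-valued successor states first.
    for j in sorted(range(len(v)), key=lambda k: v[k]):
        add = min(hi[j] - lo[j], slack)
        p[j] += add
        slack -= add
    return sum(pj * vj for pj, vj in zip(p, v))


def robust_value_iteration(rewards, lo, hi, beta=0.9, tol=1e-8, max_iter=10_000):
    """Successive approximations for the max-min value function.

    rewards[s][a]  : immediate reward for action a in state s
    lo[s][a][j]    : lower bound on P(next state j | s, a)
    hi[s][a][j]    : upper bound on P(next state j | s, a)
    beta           : discount factor (an assumption of this sketch)
    """
    n = len(rewards)
    v = [0.0] * n
    policy = [0] * n
    for _ in range(max_iter):
        new_v = []
        for s in range(n):
            # The decision maker maximizes; the transition law is chosen adversarially.
            q = [
                rewards[s][a] + beta * worst_case_expectation(v, lo[s][a], hi[s][a])
                for a in range(len(rewards[s]))
            ]
            best = max(range(len(q)), key=q.__getitem__)
            policy[s] = best
            new_v.append(q[best])
        diff = max(abs(x - y) for x, y in zip(new_v, v))
        v = new_v
        if diff < tol:
            break
    return v, policy


# Hypothetical two-state, two-action example with interval-valued transitions.
rewards = [[1.0, 0.5], [0.0, 2.0]]
lo = [[[0.6, 0.2], [0.1, 0.7]], [[0.3, 0.3], [0.5, 0.5]]]
hi = [[[0.8, 0.4], [0.3, 0.9]], [[0.7, 0.7], [0.5, 0.5]]]
values, policy = robust_value_iteration(rewards, lo, hi)
print(values, policy)
```

When `lo` and `hi` coincide, the sketch reduces to ordinary value iteration for precise transition probabilities. Uncertainty sets given by general linear inequalities would instead require solving a linear program in place of the greedy step.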