How does the value function of a Markov decision process depend on the transition probabilities?

Article ID: iaor2004690
Country: United States
Volume: 22
Issue: 4
Start Page Number: 872
End Page Number: 885
Publication Date: Nov 1997
Journal: Mathematics of Operations Research
Authors:
Abstract:

This work compares discrete-time Markov decision processes (MDPs) that differ only in their transition probabilities. We show that the optimal value function of an MDP is monotone with respect to appropriately defined stochastic order relations, and we give conditions under which it is continuous with respect to suitable probability metrics. The results are applied to well-known examples, including inventory control and optimal stopping.
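The monotonicity claim can be illustrated with a small numerical sketch. The example below is not taken from the paper; it uses a made-up one-action Markov reward process in which rewards increase with the state index, the kernel Q is stochastically monotone, and the kernel P dominates Q row by row in first-order stochastic dominance. Under these assumptions the value function under P is pointwise at least the value function under Q:

```python
# Illustrative sketch (not from the paper): shifting transition mass toward
# higher states (rowwise first-order stochastic dominance) can only raise
# the value function when rewards are increasing in the state and the
# dominated kernel is stochastically monotone. All numbers are made up.

BETA = 0.9                    # discount factor
r = [0.0, 1.0, 2.0]           # reward, increasing in the state index

# Q is stochastically monotone (each row dominates the one above it);
# P dominates Q row by row in first-order stochastic dominance.
Q = [[0.70, 0.20, 0.10],
     [0.30, 0.40, 0.30],
     [0.10, 0.20, 0.70]]
P = [[0.50, 0.30, 0.20],
     [0.20, 0.40, 0.40],
     [0.05, 0.15, 0.80]]

def value(kernel, iters=2000):
    """Fixed point of V = r + BETA * kernel @ V, by value iteration."""
    n = len(r)
    v = [0.0] * n
    for _ in range(iters):
        v = [r[i] + BETA * sum(kernel[i][j] * v[j] for j in range(n))
             for i in range(n)]
    return v

v_p, v_q = value(P), value(Q)
# V_P dominates V_Q in every state:
print(all(a >= b for a, b in zip(v_p, v_q)))
```

The same comparison fails in general without the monotonicity assumptions, which is why the paper works with carefully defined stochastic order relations rather than raw kernel perturbations.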
