Competing Markov decision processes

Article ID:	iaor19911698
Country:	Switzerland
Volume:	29
Start Page Number:	537
End Page Number:	564
Publication Date:	Apr 1991
Journal:	Annals of Operations Research
Authors:	Glazebrook K.D.

Abstract:

A class of discounted Markov decision processes (MDPs) is formed by bringing together individual MDPs sharing the same discount rate. These are in competition in the sense that at each decision epoch a single action is chosen from the union of the action sets of the individual MDPs. Such families of competing MDPs have been used to model a variety of problems in stochastic resource allocation and in the sequential design of experiments. Suppose that S is a stationary strategy for such a family, that S* is an optimal strategy, and that R(S) and R(S*) denote the respective rewards earned. The paper extends (and explains) existing theory based on the Gittins index to give bounds on R(S*)-R(S) for this important class of processes. The procedures are illustrated by examples taken from the fields of stochastic scheduling and research planning.
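
To make the quantity R(S*)-R(S) concrete, the following is a minimal sketch that computes it by brute force on the joint state space of a tiny family. It does not reproduce the paper's Gittins-index bounds. Everything in it is an assumption chosen for illustration: the two projects, their transition matrices `P`, their rewards `r`, the discount rate `BETA`, and the myopic strategy standing in for S. Each project is reduced to a single action ("work on it"), so the family is a simple multi-armed bandit.

```python
import itertools

BETA = 0.9  # shared discount rate (illustrative value, not from the paper)

# Two hypothetical projects, each a two-state Markov reward chain.
# P[a][i][j]: probability that project a moves from state i to j when chosen.
# r[a][i]:    reward earned when project a is chosen in state i.
P = [[[0.5, 0.5], [0.0, 1.0]],
     [[0.2, 0.8], [0.0, 1.0]]]
r = [[1.0, 0.2],
     [0.6, 2.0]]

# Joint states: one component per project; unchosen projects stay frozen.
STATES = list(itertools.product(range(2), range(2)))
ACTIONS = range(len(P))


def q_value(V, s, a):
    """Reward for choosing project a in joint state s, then earning V onward."""
    total = r[a][s[a]]
    for nxt, prob in enumerate(P[a][s[a]]):
        s2 = list(s)
        s2[a] = nxt  # only the chosen project changes state
        total += BETA * prob * V[tuple(s2)]
    return total


def evaluate(strategy, iters=2000):
    """R(S): iterative evaluation of a stationary strategy S."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: q_value(V, s, strategy(s)) for s in STATES}
    return V


def optimal_value(iters=2000):
    """R(S*): value iteration over the union of the projects' actions."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: max(q_value(V, s, a) for a in ACTIONS) for s in STATES}
    return V


def myopic(s):
    """Hypothetical strategy S: pick the project with the largest immediate reward."""
    return max(ACTIONS, key=lambda a: r[a][s[a]])


R_S = evaluate(myopic)
R_opt = optimal_value()
gap = {s: R_opt[s] - R_S[s] for s in STATES}  # R(S*) - R(S) per starting state
```

Inspecting `gap` shows where the myopic rule loses reward relative to the optimum. The joint state space grows exponentially with the number of projects, so this direct computation is practical only for very small families, which is one reason index-based bounds are of interest.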
