Constrained Markov decision processes with first passage criteria

Article ID:	iaor20133931
Volume:	206
Issue:	1
Start Page Number:	197
End Page Number:	219
Publication Date:	Jul 2013
Journal:	Annals of Operations Research
Authors:	Guo Xianping, Huang Yonghui, Wei Qingda
Keywords:	queues: applications

Abstract:

This paper deals with constrained Markov decision processes (MDPs) with first passage criteria. The objective is to maximize the expected reward obtained during a first passage time to some target set, subject to a constraint on the associated expected cost over this first passage time. The state space is denumerable, and the rewards/costs are possibly unbounded. In addition, the discount factor is state‐action dependent and is allowed to be equal to one. We develop suitable conditions for the existence of a constrained optimal policy, which generalize those for constrained MDPs with the standard discount criteria. Moreover, we show that the constrained optimal policy randomizes between two stationary policies differing in at most one state. Finally, we illustrate our results with a controlled queueing system, which demonstrates some advantages of our optimality conditions.
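
The optimization problem described in the abstract can be sketched in symbols as follows. The notation is not taken from the paper and is assumed here for illustration only: S is the denumerable state space, B ⊂ S the target set, τ_B the first passage time to B, r and c the reward and cost functions, α(x, a) ∈ (0, 1] the state-action dependent discount factor, γ the initial distribution, and d the constraint bound. The paper's own definitions and conventions may differ.

```latex
% Illustrative formulation only; all symbols are assumptions, not the paper's notation.
\[
  \tau_B := \inf\{\, t \ge 0 : x_t \in B \,\}
\]
\[
  V_r(\pi) := \mathbb{E}_\gamma^{\pi}\!\left[ \sum_{t=0}^{\tau_B - 1}
      \left( \prod_{k=0}^{t-1} \alpha(x_k, a_k) \right) r(x_t, a_t) \right],
  \qquad
  V_c(\pi) := \mathbb{E}_\gamma^{\pi}\!\left[ \sum_{t=0}^{\tau_B - 1}
      \left( \prod_{k=0}^{t-1} \alpha(x_k, a_k) \right) c(x_t, a_t) \right]
\]
\[
  \text{maximize } V_r(\pi) \text{ over policies } \pi
  \quad \text{subject to} \quad V_c(\pi) \le d .
\]
% The empty product (t = 0) is taken to be 1.
```

Under this reading, taking α(x, a) equal to a constant less than one with an empty target set would correspond to the standard discounted criterion, and α ≡ 1 to an undiscounted total reward accumulated up to the first passage time.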
