Markov decision processes with a stopping time constraint

Article ID: iaor20021940
Country: Germany
Volume: 53
Issue: 2
Start Page Number: 279
End Page Number: 295
Publication Date: Jan 2001
Journal: Mathematical Methods of Operations Research (Heidelberg)
Authors:
Abstract:

This paper considers the optimization problem for a stopped Markov decision process with finite state and action spaces, taken over stopping times τ constrained so that ℰτ ≦ α for some fixed α > 0. The problem is solved by randomizing the stopping times and formulating a mathematical program in terms of occupation measures. An alternative representation of randomized stopping times, called the F-representation, is given, through which the concepts of Markov and stationary randomized stopping times are introduced. Two types of occupation measure are treated, running and stopped, and the stopped occupation measure is shown to be expressible in terms of the running one. We study the properties of the sets of running occupation measures achieved by different classes of pairs of policies and randomized stopping times. By analyzing the equivalent mathematical programming problem, formulated with the running occupation measures corresponding to stationary policies and stationary randomized stopping times, we prove the existence of an optimal constrained pair, consisting of a stationary policy and a stopping time, that requires randomization in at most one state.
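
To make the role of randomization in the constraint ℰτ ≦ α concrete, the following is a minimal sketch and not the paper's construction. It assumes that a stationary randomized stopping time can be read as "in state s, stop with probability q[s], otherwise take one more step", and it fixes the policy so that the process is a plain Markov chain with transition matrix P. The names `expected_stopping_time`, `P`, and `q` are hypothetical and chosen for illustration only.

```python
def expected_stopping_time(P, q, tol=1e-12, max_iter=100_000):
    """Return h with h[s] = E[tau | X_0 = s] under the assumed convention.

    Assumption (not from the paper): at each time t = 0, 1, ... the process
    in state s stops with probability q[s]; otherwise it moves according to
    row P[s] and the clock advances by one. This gives the fixed point
        h[s] = (1 - q[s]) * (1 + sum_j P[s][j] * h[j]),
    which is solved here by simple fixed-point iteration.
    """
    n = len(P)
    h = [0.0] * n
    for _ in range(max_iter):
        new_h = [
            (1.0 - q[s]) * (1.0 + sum(P[s][j] * h[j] for j in range(n)))
            for s in range(n)
        ]
        if max(abs(a - b) for a, b in zip(new_h, h)) < tol:
            return new_h
        h = new_h
    raise RuntimeError("no convergence; stopping may be unreachable")


# Toy two-state chain (illustrative data, not from the paper).
P = [[0.5, 0.5],
     [0.5, 0.5]]

# Deterministic rule: never stop in state 0, always stop in state 1.
h_det = expected_stopping_time(P, [0.0, 1.0])

# Randomize in state 0 only: stop there with probability 1/3.
h_rand = expected_stopping_time(P, [1.0 / 3.0, 1.0])

print("deterministic rule, start in state 0:", h_det[0])   # close to 2
print("randomized in one state, start in 0:", h_rand[0])   # close to 1
```

In this toy chain, started in state 0, the deterministic rule has expected stopping time 2, so it violates a bound of α = 1. Randomizing the stopping decision in the single state 0 with probability 1/3 brings the expected stopping time down to exactly 1, which illustrates, under the assumptions above, why randomization in one state can be enough to meet the constraint with equality.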
