Notes on equivalent stationary policies in Markov decision processes with total rewards

Article ID: iaor19972491
Country: Germany
Volume: 44
Issue: 2
Start Page Number: 205
End Page Number: 221
Publication Date: Sep 1996
Journal: Mathematical Methods of Operations Research (Heidelberg)
Authors: ,
Keywords: programming: dynamic
Abstract:

The authors construct examples of Markov decision processes in which, for a given initial state and a given nonstationary transient policy, there is no equivalent (randomized) stationary policy, i.e., no stationary policy whose occupation measure equals that of the given policy. They also investigate the relation between the existence of equivalent stationary policies in special models and the existence of equivalent strategies in various classes of nonstationary policies in general models.
