Average Optimality in Nonhomogeneous Infinite Horizon Markov Decision Processes

Article ID: iaor20113644
Volume: 36
Issue: 1
Start Page Number: 147
End Page Number: 164
Publication Date: Feb 2011
Journal: Mathematics of Operations Research
Authors: , ,
Keywords: programming: probabilistic, stochastic processes
Abstract:

We consider a nonhomogeneous stochastic infinite horizon optimization problem whose objective is to minimize the overall average cost per period of an infinite sequence of actions (average optimality). Optimal solutions to such problems will in general be nonstationary. Moreover, a solution that initially makes poor decisions, and then selects wisely thereafter, can be average optimal. However, we seek average optimal solutions with optimal short‐term, as well as long‐term, behavior. Our approach is to first transform our stochastic problem into one that is deterministic, using the standard device of formulating the problem as one of choosing a sequence of policies, as opposed to actions. Within this deterministic framework, states become probability distributions over the original stochastic states. Then, by weakening the notion of state reachability, and strengthening the notion of efficiency traditionally used in the deterministic framework, we prove that such efficient solutions exist and are average optimal, thus simultaneously exhibiting both optimal long‐ and short‐run behavior. This deterministic view of the property of stochastic ergodicity offers the potential to relax the traditional conditions for average optimality that use coefficients of ergodicity, as well as the opportunity to strengthen the criterion of average optimality through the property of efficiency.
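The transformation described in the abstract can be illustrated with a small sketch. It is not taken from the paper. It assumes finite state and action sets, deterministic Markov policies, and a finite truncation of the horizon, and all function and variable names are hypothetical. Once a policy is fixed for each period, the "state" is a probability distribution over the original states and evolves deterministically, so the expected average cost can be accumulated period by period.

```python
from typing import List, Sequence, Tuple

# Assumed layout (hypothetical, for illustration only):
#   transitions[a][s][t] = probability of moving s -> t under action a in a given period
#   costs[a][s]          = cost of taking action a in state s in a given period
#   policy[s]            = action chosen in state s in a given period


def step_distribution(dist: Sequence[float],
                      policy: Sequence[int],
                      transitions: Sequence[Sequence[Sequence[float]]]) -> List[float]:
    """Deterministic update of the distribution over the original states."""
    n = len(dist)
    nxt = [0.0] * n
    for s in range(n):
        row = transitions[policy[s]][s]
        for t in range(n):
            nxt[t] += dist[s] * row[t]
    return nxt


def expected_cost(dist: Sequence[float],
                  policy: Sequence[int],
                  costs: Sequence[Sequence[float]]) -> float:
    """Expected one-period cost when the state is distributed as dist."""
    return sum(dist[s] * costs[policy[s]][s] for s in range(len(dist)))


def average_cost(initial: Sequence[float],
                 policies: Sequence[Sequence[int]],
                 transitions_by_period: Sequence[Sequence[Sequence[Sequence[float]]]],
                 costs_by_period: Sequence[Sequence[Sequence[float]]]
                 ) -> Tuple[float, List[float]]:
    """Average expected cost per period over a finite truncation of the horizon.

    Returns the average and the distribution reached after the last period.
    """
    dist = list(initial)
    total = 0.0
    for policy, trans, costs in zip(policies, transitions_by_period, costs_by_period):
        total += expected_cost(dist, policy, costs)
        dist = step_distribution(dist, policy, trans)
    return total / len(policies), dist


# Two states, two actions: action 0 stays put, action 1 switches state.
STAY_SWITCH = [
    [[1.0, 0.0], [0.0, 1.0]],  # action 0
    [[0.0, 1.0], [1.0, 0.0]],  # action 1
]
# Period-dependent costs make the example nonhomogeneous.
COSTS_PERIOD_1 = [[2.0, 0.0], [1.0, 1.0]]
COSTS_PERIOD_2 = [[2.0, 3.0], [1.0, 1.0]]

avg, final_dist = average_cost(
    initial=[1.0, 0.0],
    policies=[[1, 0], [1, 0]],  # switch out of state 0, stay in state 1
    transitions_by_period=[STAY_SWITCH, STAY_SWITCH],
    costs_by_period=[COSTS_PERIOD_1, COSTS_PERIOD_2],
)
print(avg, final_dist)
```

The finite-horizon average computed here only approximates the infinite-horizon criterion in the abstract. The sketch is meant to show the deterministic state update, not the efficiency or reachability conditions the paper studies.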
