On the average cost optimality equation and the structure of optimal policies for partially observable Markov decision processes

Article ID: iaor19911695
Country: Switzerland
Volume: 29
Start Page Number: 439
End Page Number: 470
Publication Date: Apr 1991
Journal: Annals of Operations Research
Authors: , ,
Abstract:

The authors consider partially observable Markov decision processes with finite or countably infinite (core) state and observation spaces and finite action set. Following a standard approach, an equivalent completely observed problem is formulated, with the same finite action set but with an uncountable state space, namely the space of probability distributions on the original core state space. By developing a suitable theoretical framework, it is shown that some characteristics induced in the original problem due to the countability of the spaces involved are reflected into the equivalent problem. Sufficient conditions are then derived for solutions to the average cost optimality equation to exist. The authors illustrate these results in the context of machine replacement problems. Structural properties for average cost optimal policies are obtained for a two state replacement problem; these are similar to results available for discount optimal policies. The set of assumptions used compares favorably to others currently available.
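The reformulation described in the abstract replaces the hidden core state with a probability distribution over core states, updated by Bayes' rule after each action and observation. The following is a minimal sketch of that update for a finite core state space, not the authors' construction. The function name `belief_update`, the two-state machine (state 0 = good, state 1 = failed), and every probability in `P_OPERATE` and `Q_OBS` are illustrative assumptions that do not come from the paper.

```python
def belief_update(belief, transition, observation, obs):
    """Return the posterior distribution over core states.

    belief[i]         -- prior probability of core state i
    transition[i][j]  -- P(next state j | state i) under the chosen action
    observation[j][y] -- P(observation y | next state j)
    obs               -- index of the observation actually received
    """
    n = len(belief)
    # Prediction step: push the prior through the transition kernel.
    predicted = [
        sum(belief[i] * transition[i][j] for i in range(n)) for j in range(n)
    ]
    # Correction step: weight each state by the observation likelihood.
    unnormalized = [predicted[j] * observation[j][obs] for j in range(n)]
    total = sum(unnormalized)
    if total <= 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [u / total for u in unnormalized]


# Hypothetical two-state machine under the "operate" action:
# a good machine fails with probability 0.1, a failed machine stays failed.
P_OPERATE = [[0.9, 0.1],
             [0.0, 1.0]]

# Hypothetical observation model: column 0 = "output ok", column 1 = "output defective".
Q_OBS = [[0.8, 0.2],
         [0.3, 0.7]]

if __name__ == "__main__":
    prior = [1.0, 0.0]  # machine known to be good
    posterior = belief_update(prior, P_OPERATE, Q_OBS, obs=1)
    print(posterior)  # roughly [0.72, 0.28] after one defective output
```

In the two-state case the belief is determined by a single number, the probability that the machine has failed, so the equivalent completely observed problem has the unit interval as its state space.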
