Computationally feasible bounds for partially observed Markov decision processes

Article ID: iaor1993752
Country: United States
Volume: 39
Issue: 1
Start Page Number: 162
End Page Number: 175
Publication Date: Jan 1991
Journal: Operations Research
Authors:
Keywords: programming: dynamic
Abstract:

A partially observed Markov decision process is a sequential decision problem in which information about the parameters of interest is incomplete, and the available actions include sampling, surveying, or otherwise collecting additional information. Such problems can in principle be solved as dynamic programs, but the relevant state space is infinite, which inhibits algorithmic solution. This paper explains how to approximate the state space by a finite grid of points, and how to use that grid to construct upper and lower bounds on the value function, generate approximate nonstationary and stationary policies, and bound the value loss, relative to the optimal policy, incurred by using these policies in the decision problem. A numerical example illustrates the methodology.
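To make the grid-based idea concrete, the sketch below (not the paper's exact algorithm) runs fixed-grid value iteration for a small, invented two-state, two-action, two-observation POMDP. The continuous belief simplex is replaced by a finite grid of belief points, Bellman backups are performed only at grid points, and values at off-grid beliefs reached by the Bayes update are linearly interpolated from neighbouring grid points. All model numbers and helper names are illustrative assumptions.

import numpy as np

# Hypothetical two-state, two-action, two-observation model (made-up numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],      # P[a][s, s']: transition probabilities
              [[0.6, 0.4], [0.3, 0.7]]])
O = np.array([[[0.8, 0.2], [0.3, 0.7]],      # O[a][s', z]: observation probabilities
              [[0.7, 0.3], [0.4, 0.6]]])
R = np.array([[1.0, 0.0],                    # R[a][s]: immediate rewards
              [0.5, 0.8]])
gamma = 0.95

# Finite grid over the belief simplex: belief = (p, 1 - p) for p on a uniform grid.
grid = np.linspace(0.0, 1.0, 21)
V = np.zeros_like(grid)                      # value estimate at each grid point

def interpolate(p, values):
    """Linearly interpolate the grid values at belief (p, 1 - p)."""
    return np.interp(p, grid, values)

def bayes_update(b, a, z):
    """Posterior belief after action a and observation z, plus P(z | b, a)."""
    joint = (b @ P[a]) * O[a][:, z]          # unnormalised posterior over s'
    pz = joint.sum()
    return (joint / pz if pz > 0 else b), pz

# Fixed-grid value iteration: back up only at the grid points.
for _ in range(200):
    V_new = np.empty_like(V)
    for i, p in enumerate(grid):
        b = np.array([p, 1.0 - p])
        q = []
        for a in range(2):
            expected_future = 0.0
            for z in range(2):
                b_next, pz = bayes_update(b, a, z)
                expected_future += pz * interpolate(b_next[0], V)
            q.append(b @ R[a] + gamma * expected_future)
        V_new[i] = max(q)                    # greedy backup at this grid point
    if np.max(np.abs(V_new - V)) < 1e-6:
        V = V_new
        break
    V = V_new

print("approximate value at belief (0.5, 0.5):", interpolate(0.5, V))

Because the optimal POMDP value function is convex in the belief, linear interpolation of exact values at grid points lies on or above the true function, which is the intuition behind using grid interpolation for an upper bound; lower bounds can instead come from evaluating a restricted family of policies. The sketch illustrates only the grid-approximation mechanics, not the paper's specific bounding constructions.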
