Finite-memory suboptimal design for partially observed Markov decision processes

Article ID: iaor1995335
Country: United States
Volume: 42
Issue: 3
Start Page Number: 439
End Page Number: 455
Publication Date: May 1994
Journal: Operations Research
Authors: ,
Abstract:

The authors develop bounds on the value function and a suboptimal design for the partially observed Markov decision process. Both the bounds and the suboptimal design are based on the M most recent observations and actions. An a priori measure of the quality of these bounds is given, and the authors show that larger M implies tighter bounds. An operations count analysis indicates that $(|A|\,|Z|)^{M+1}\,|S|$ multiplications and additions are required per successive approximations iteration of the suboptimal design algorithm, where A, Z, and S are the action, observation, and state spaces, respectively, and $|\cdot|$ denotes the number of elements. Because this count grows only linearly in $|S|$, the algorithm is of potential use for problems with large state spaces. A preliminary numerical study indicates that the quality of the suboptimal design can be excellent.
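
To give a feel for the operations count quoted in the abstract, the following minimal sketch evaluates the expression, read here as (|A||Z|)^(M+1) |S|, for a few memory lengths M. The function name and the problem sizes (2 actions, 2 observations, 1000 states) are illustrative assumptions and are not taken from the article.

```python
def ops_per_iteration(num_actions: int, num_observations: int,
                      num_states: int, memory: int) -> int:
    """Evaluate (|A| * |Z|) ** (M + 1) * |S| for the given sizes.

    This is only the count expression from the abstract, not the
    suboptimal design algorithm itself.
    """
    if min(num_actions, num_observations, num_states) < 1 or memory < 0:
        raise ValueError("sizes must be positive and memory non-negative")
    return (num_actions * num_observations) ** (memory + 1) * num_states


# Hypothetical sizes chosen only for illustration.
for m in range(4):
    print(m, ops_per_iteration(2, 2, 1000, m))
```

With these assumed sizes the count is linear in the number of states but grows by a factor of |A||Z| with each additional step of memory, which is the trade-off between tighter bounds and computation that the abstract describes.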
