A two-state partially observable Markov decision process with uniformly distributed observations

Article ID: iaor20002967
Country: United States
Volume: 44
Issue: 3
Start Page Number: 458
End Page Number: 463
Publication Date: May 1996
Journal: Operations Research
Authors:
Abstract:

A controller observes a production system periodically over time. If the system is in the GOOD state during one period, there is a constant probability that it will deteriorate and be in the BAD state during the next period (where it remains). The true state of the system is unobservable and can only be inferred from observations (the quality of the output). Two actions are available: CONTINUE, or REPLACE (for a fixed cost). The objective is to maximize the expected discounted value of the total future income. For both the finite- and infinite-horizon problems, the optimal policy is of a CONTROL LIMIT (CLT) type: continue if the good-state probability exceeds the CLT, and replace otherwise. The computation of the CLT involves a functional equation for which an analytical solution is as yet unknown. For uniformly distributed observations, we obtain the infinite-horizon CLT analytically. We also show that the finite-horizon CLTs, as a function of the time remaining, are not necessarily monotone, which is counterintuitive.
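The dynamics described in the abstract can be sketched in a few lines: the controller tracks the probability that the system is GOOD, updates it by Bayes' rule after each observation, and applies the control-limit rule. The sketch below uses hypothetical parameters not taken from the paper: GOOD output distributed Uniform(0, 1), BAD output distributed Uniform(0.5, 1.5), deterioration probability `q`, and an assumed control limit `clt`.

```python
# Minimal sketch of the two-state POMDP belief dynamics and the
# control-limit (CLT) policy from the abstract. All numeric parameters
# (the uniform supports, q, clt) are illustrative assumptions.

def uniform_pdf(y, lo, hi):
    """Density of Uniform(lo, hi) evaluated at y."""
    return 1.0 / (hi - lo) if lo <= y <= hi else 0.0

def belief_update(p, q, y, good=(0.0, 1.0), bad=(0.5, 1.5)):
    """One-step Bayes update of p = P(system is GOOD).

    The system first deteriorates with probability q, then emits
    output y from the state-dependent uniform distribution.
    """
    p_pred = p * (1.0 - q)                     # prior after possible deterioration
    fg = uniform_pdf(y, *good)                 # likelihood of y under GOOD
    fb = uniform_pdf(y, *bad)                  # likelihood of y under BAD
    denom = p_pred * fg + (1.0 - p_pred) * fb
    return p_pred * fg / denom if denom > 0 else 0.0

def act(p, clt):
    """Control-limit policy: CONTINUE above the limit, REPLACE otherwise."""
    return "CONTINUE" if p > clt else "REPLACE"
```

Under these assumptions, an output below 0.5 is possible only in the GOOD state (belief jumps to 1), an output above 1 is possible only in the BAD state (belief drops to 0), and outputs in the overlap [0.5, 1] leave the belief at its predicted value. Choosing REPLACE would reset the belief to 1, since a new machine starts in the GOOD state.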
