Article ID: iaor20162891
Volume: 63
Issue: 4
Start Page Number: 320
End Page Number: 334
Publication Date: Jun 2016
Journal: Naval Research Logistics (NRL)
Authors: Wang Jue
Keywords: stochastic processes, simulation, Markov processes, programming: Markov decision, combinatorial optimization
We consider a stochastic partially observable system that can switch between a normal state and a transient abnormal state before entering a persistent abnormal state. Only the persistent abnormal state requires an alarm. The transient and persistent abnormal states may be similar in appearance, which can result in excess false alarms. We propose a partially observable Markov decision process model to minimize the false alarm rate, subject to a given upper bound on the expected alarm delay time. The cost parameter is treated as the Lagrange multiplier, which can be estimated from the bound on the alarm delay. We show that the optimal policy has a control-limit structure on the probability of persistent abnormality, derive closed-form bounds for the control limit, and present an algorithm to specify the Lagrange multiplier. We also study a specialized model in which the transient and persistent abnormal states have the same observation distribution, in which case an intuitive 'watchful-waiting' policy is optimal. © 2016 Wiley Periodicals, Inc. Naval Research Logistics 63: 320–334, 2016
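To illustrate the control-limit idea described in the abstract, the sketch below implements a simple Bayes filter over a three-state hidden Markov model (normal, transient abnormal, persistent abnormal) and raises an alarm the first time the belief in persistent abnormality crosses a threshold. This is not the paper's algorithm or its parameter values; the transition matrix, observation model, and threshold are hypothetical placeholders chosen only to show the structure of such a policy.

```python
import numpy as np

# Hypothetical illustration of a control-limit alarm policy on the belief of
# persistent abnormality. States: 0 = normal, 1 = transient abnormal,
# 2 = persistent abnormal (absorbing). All numbers are made up for
# demonstration and are not taken from the paper.

# Transition matrix P[i, j] = Pr(next state = j | current state = i)
P = np.array([
    [0.95, 0.04, 0.01],
    [0.10, 0.80, 0.10],
    [0.00, 0.00, 1.00],   # persistent abnormality is absorbing
])

# Observation model: each state emits a Gaussian-distributed signal. The two
# abnormal states are given nearly identical means to mimic the
# "similar in appearance" feature described in the abstract.
MEANS = np.array([0.0, 1.0, 1.1])
STD = 0.5

def likelihood(y):
    """Gaussian observation likelihood of reading y under each state."""
    return np.exp(-0.5 * ((y - MEANS) / STD) ** 2)

def belief_update(belief, y):
    """One-step Bayes filter: predict with P, then correct with the observation."""
    predicted = belief @ P
    posterior = predicted * likelihood(y)
    return posterior / posterior.sum()

def control_limit_policy(observations, threshold=0.6):
    """Raise the alarm the first time the belief in state 2 exceeds the threshold."""
    belief = np.array([1.0, 0.0, 0.0])    # start in the normal state
    for t, y in enumerate(observations):
        belief = belief_update(belief, y)
        if belief[2] >= threshold:
            return t, belief              # alarm time and belief at alarm
    return None, belief                   # no alarm raised

# Example run on a synthetic observation stream.
rng = np.random.default_rng(0)
obs = np.concatenate([rng.normal(0.0, STD, 30),    # normal period
                      rng.normal(1.1, STD, 20)])   # abnormal period
alarm_time, final_belief = control_limit_policy(obs)
print("alarm at step:", alarm_time, "belief:", np.round(final_belief, 3))
```

In the paper's constrained formulation, the threshold would not be hand-picked as here: the control limit follows from the Lagrange multiplier chosen to meet the expected-delay bound, with closed-form bounds and a specification algorithm provided by the authors.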