| Article ID: | iaor19982955 | 
| Country: | Germany | 
| Volume: | 45 | 
| Issue: | 2 | 
| Start Page Number: | 265 | 
| End Page Number: | 280 | 
| Publication Date: | Jan 1997 | 
| Journal: | Mathematical Methods of Operations Research (Heidelberg) | 
| Authors: | Yushkevich, Alexander A.; Donchev, D.S. | 
| Keywords: | control processes | 
In terms of a posteriori probabilities, a symmetric Poissonian two-armed bandit becomes a piecewise deterministic Markov decision process. For the case of switching arms, only one of which generates rewards, we solve the average optimality equation explicitly and prove that a myopic policy is average optimal.
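As an illustrative reading of this setup, the sketch below simulates a crude time-discretisation of a symmetric Poissonian bandit with switching arms, tracks the a posteriori probability that arm 0 is the rewarding one, and pulls arms myopically. All names and values (LAMBDA, NU, DT, HORIZON) and the Euler-style posterior update are assumptions for illustration only; the paper itself works with the continuous-time piecewise deterministic Markov decision process and its average optimality equation, not with this discretised simulation.

```python
import random

# Assumed illustrative parameters, not taken from the paper.
LAMBDA = 2.0      # Poisson reward rate of the single rewarding arm
NU = 0.5          # symmetric switching rate between the two hidden states
DT = 0.01         # time step of the discretisation
HORIZON = 10_000  # number of simulation steps


def simulate(seed=0):
    rng = random.Random(seed)
    good_arm = rng.choice([0, 1])  # hidden state: which arm currently rewards
    p = 0.5                        # posterior probability that arm 0 is the good one
    total_reward = 0.0

    for _ in range(HORIZON):
        # Myopic policy: pull the arm that is currently more likely to be good.
        arm = 0 if p >= 0.5 else 1

        # The hidden state may switch (symmetric switching at rate NU).
        if rng.random() < NU * DT:
            good_arm = 1 - good_arm

        # A Poisson reward can arrive only from the currently good arm.
        reward = 1 if (arm == good_arm and rng.random() < LAMBDA * DT) else 0
        total_reward += reward

        # Bayes update of p = P(arm 0 is good | observation on the pulled arm).
        rate_if_0_good = LAMBDA * DT if arm == 0 else 0.0
        rate_if_1_good = LAMBDA * DT if arm == 1 else 0.0
        like0 = rate_if_0_good if reward else 1.0 - rate_if_0_good
        like1 = rate_if_1_good if reward else 1.0 - rate_if_1_good
        num = like0 * p
        den = num + like1 * (1.0 - p)
        p = num / den if den > 0 else p

        # Symmetric switching pulls the posterior back toward 1/2:
        # dp/dt = NU * (1 - 2p), here applied as one Euler step.
        p += NU * DT * (1.0 - 2.0 * p)

    # Empirical average reward per unit time under the myopic policy.
    return total_reward / (HORIZON * DT)


if __name__ == "__main__":
    print("average reward under the myopic policy:", simulate())
```

Between reward arrivals the posterior drifts deterministically and jumps at arrival times, which is the piecewise deterministic structure the abstract refers to; the simulation only approximates this with small time steps.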