Article ID: | iaor20002390 |
Country: | United States |
Volume: | 12 |
Issue: | 2 |
Start Page Number: | 151 |
End Page Number: | 160 |
Publication Date: | Apr 1999 |
Journal: | Journal of Applied Mathematics and Stochastic Analysis |
Authors: | Donchev, Doncho S. |
Keywords: | Markov processes |
We consider the symmetric Poissonian two-armed bandit problem. For the case of switching between two arms, only one of which yields reward, we solve the Bellman equation explicitly for the β-discounted reward and prove that a myopic policy is optimal.
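As an illustration of the setting the abstract describes, the sketch below simulates a myopic policy on a heavily simplified, discrete-time variant of the problem: one of two arms (unknown to the controller) pays Poisson-distributed rewards, the other pays nothing, rewards are discounted by β per step, and the myopic policy always pulls the arm with the larger immediate expected reward under the current posterior. This is an assumption-laden toy model for intuition only, not the continuous-time model analyzed in the paper; all function names (`poisson_sample`, `myopic_arm`, `simulate`) and parameter defaults are hypothetical.

```python
import math
import random


def poisson_sample(rng, lam):
    """Draw one Poisson(lam) sample via Knuth's multiplication method."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= rng.random()
    return k - 1


def myopic_arm(p):
    """Myopic policy: pull the arm with the larger immediate expected
    reward, i.e. arm 0 iff the posterior P(arm 0 is rewarding) >= 1/2."""
    return 0 if p >= 0.5 else 1


def simulate(lam=2.0, beta=0.9, horizon=100, seed=1):
    """Toy discrete-time analogue of the Poissonian two-armed bandit:
    one hidden arm pays Poisson(lam) rewards, the other pays nothing."""
    rng = random.Random(seed)
    good = rng.choice([0, 1])   # hidden rewarding arm
    p = 0.5                     # symmetric prior: P(arm 0 is rewarding)
    total, disc = 0.0, 1.0
    for _ in range(horizon):
        arm = myopic_arm(p)
        k = poisson_sample(rng, lam) if arm == good else 0
        total += disc * k
        disc *= beta
        # Bayes update of P(arm 0 is the rewarding arm).
        if k > 0:
            # Any positive reward identifies the rewarding arm exactly,
            # since the other arm pays nothing in this toy model.
            p = 1.0 if arm == 0 else 0.0
        else:
            # No reward observed: likelihood e^{-lam} if the pulled arm
            # is the rewarding one, 1 if it is the silent one.
            like_good = math.exp(-lam)
            if arm == 0:
                p = p * like_good / (p * like_good + (1 - p))
            else:
                p = p / (p + (1 - p) * like_good)
    return total, p
```

In this simplified model the myopic rule is trivially reasonable: a single positive reward reveals the good arm, and repeated zeros push the posterior toward the other arm, so greedy play and information gathering largely coincide, which echoes (but does not prove) the optimality result stated in the abstract.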