Exact solution of the Bellman equation for a β-discounted reward in a two-armed bandit with switching arms

Article ID: iaor20002390
Country: United States
Volume: 12
Issue: 2
Start Page Number: 151
End Page Number: 160
Publication Date: Apr 1999
Journal: Journal of Applied Mathematics and Stochastic Analysis
Authors:
Keywords: Markov processes
Abstract:

We consider the symmetric Poissonian two-armed bandit problem. For the case of switching arms, only one of which creates reward, we solve explicitly the Bellman equation for a β-discounted reward and prove that a myopic policy is optimal.
