Parallel design of robust control in the stochastic environment (the two-armed bandit problem)


Article ID:	iaor20123713
Volume:	73
Issue:	4
Start Page Number:	689
End Page Number:	701
Publication Date:	Apr 2012
Journal:	Automation and Remote Control
Authors:	Kolnogorov A
Keywords:	stochastic processes, game theory

Abstract:

The problem of rational behavior in a stochastic environment, also known as the two-armed bandit problem, is considered in the robust (minimax) setting. A parallel strategy is proposed that yields control arbitrarily close to the optimal one for environments whose gains have Gaussian distributions with unit variance. An invariant recursive equation is obtained for computing the minimax strategy and risk, which are found as the Bayesian ones associated with the worst-case a priori distribution. As a result, the well-known Vogel’s estimates of the minimax risk can be improved. Numerical experiments show that the strategy is also efficient in environments with non-Gaussian distributions, e.g., binary ones.
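
To make the setting concrete, the following is a minimal sketch of a two-armed bandit with unit-variance Gaussian gains, played in batches. The batched explore-then-commit rule, the function name batch_explore_then_commit, and all of its parameters are assumptions chosen for illustration; this is not the strategy proposed in the article and it makes no claim to minimax optimality. It only shows how the loss (regret) of a strategy can be measured for a given pair of arm means.

```python
import random


def batch_explore_then_commit(means, horizon, batch_size, explore_batches, seed=0):
    """Pseudo-regret of a simple batched explore-then-commit rule (illustrative only).

    means           -- expected gains of the two arms (unknown to the strategy)
    horizon         -- total number of pulls
    batch_size      -- pulls made with the same arm before the next decision
    explore_batches -- exploration batches given to each arm (must be >= 1)
    """
    if explore_batches < 1:
        raise ValueError("explore_batches must be at least 1")
    rng = random.Random(seed)
    totals = [0.0, 0.0]   # cumulative observed gain per arm
    counts = [0, 0]       # number of pulls per arm
    best = max(means)
    regret = 0.0
    t = 0
    batch_index = 0
    while t < horizon:
        n = min(batch_size, horizon - t)
        if batch_index < 2 * explore_batches:
            # Exploration phase: alternate the arms batch by batch.
            arm = batch_index % 2
        else:
            # Commit phase: keep playing the arm with the larger sample mean.
            arm = 0 if totals[0] / counts[0] >= totals[1] / counts[1] else 1
        for _ in range(n):
            totals[arm] += rng.gauss(means[arm], 1.0)  # unit-variance Gaussian gain
        counts[arm] += n
        # Pseudo-regret: expected loss relative to always pulling the best arm.
        regret += n * (best - means[arm])
        t += n
        batch_index += 1
    return regret


if __name__ == "__main__":
    print(batch_explore_then_commit([0.5, -0.5], horizon=1000, batch_size=50, explore_batches=2))
```

In the minimax setting described in the abstract, one would look at the largest such loss over all admissible pairs of means and seek the strategy that minimizes it. The sketch evaluates only a single pair.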
