Finding minimax strategy and minimax risk in a random environment (the two‐armed bandit problem)

Article ID: iaor20116055
Volume: 72
Issue: 5
Start Page Number: 1017
End Page Number: 1027
Publication Date: May 2011
Journal: Automation and Remote Control
Authors:
Keywords: statistics: distributions
Abstract:

The minimax strategy and minimax risk in a stationary random environment are found as the Bayesian strategy and risk corresponding to the worst-case prior distribution. For environments with normally distributed incomes with unit variance and expectations that depend only on the selected alternative, this prior can be chosen to be symmetric and asymptotically uniform, which makes numerical methods applicable. The results can be applied to systems with parallel data processing and, in particular, to the control of environments whose distributions are not normal.
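
To make the notion of risk over a parameter set concrete, the following is a minimal sketch and not the method of the article. It assumes a simple explore-then-commit strategy (a stand-in chosen only for illustration), unit-variance normal incomes, and a symmetric family of environments with expectations (d, -d). The function names, the horizon, the exploration length and the grid of d values are all hypothetical. The sketch estimates the expected loss for each d by Monte Carlo and reports the largest one, which is the worst-case loss of this particular strategy over the grid; the minimax risk would be the smallest such value over all strategies.

```python
import numpy as np


def etc_expected_loss(m1, m2, horizon, explore, n_runs, rng):
    """Monte Carlo estimate of the expected loss of explore-then-commit.

    Incomes of arm i are assumed N(m_i, 1). Each arm is pulled `explore`
    times, then the arm with the larger sample mean is used until `horizon`.
    The loss is horizon * max(m1, m2) minus the expected total income.
    """
    if horizon < 2 * explore:
        raise ValueError("horizon must be at least 2 * explore")
    means = np.array([m1, m2], dtype=float)
    gap = means.max() - means  # per-step loss of each arm
    # Sample mean of `explore` unit-variance incomes is N(m_i, 1 / explore).
    xbar = rng.normal(means, 1.0 / np.sqrt(explore), size=(n_runs, 2))
    chosen = xbar.argmax(axis=1)
    explore_loss = explore * gap.sum()
    commit_loss = (horizon - 2 * explore) * gap[chosen]
    return float(np.mean(explore_loss + commit_loss))


def worst_case_regret(horizon, explore, gaps, n_runs=2000, seed=0):
    """Largest estimated loss over symmetric environments (d, -d), d in gaps."""
    rng = np.random.default_rng(seed)
    losses = [
        etc_expected_loss(d, -d, horizon, explore, n_runs, rng) for d in gaps
    ]
    worst = int(np.argmax(losses))
    return gaps[worst], losses[worst], losses


if __name__ == "__main__":
    # Hypothetical settings chosen only for illustration.
    grid = [0.0, 0.05, 0.1, 0.2, 0.5, 1.0]
    d_star, loss_star, _ = worst_case_regret(horizon=200, explore=20, gaps=grid)
    print(f"worst d on grid: {d_star}, estimated loss: {loss_star:.2f}")
```

Restricting the search to pairs (d, -d) mirrors the symmetry of the worst prior mentioned in the abstract, but the grid search here is over fixed parameters of one fixed strategy, not over priors or strategies.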
