The optimal solution of the lob-pass problem with known reaction curves

Article ID: iaor2000354
Country: Japan
Volume: 41
Issue: 4
Start Page Number: 509
End Page Number: 530
Publication Date: Dec 1998
Journal: Journal of the Operations Research Society of Japan
Authors: ,
Keywords: control, calculus of variations
Abstract:

The ‘lob-pass problem’ is a model used in psychology. It describes the phenomenon in which repeating the same choice diminishes its effect, as happens with experience or weariness. Abe and Takeuchi formulated it as an on-line learning problem and pointed out that it extends the multi-armed bandit problem. Unlike in multi-armed bandit problems, the player's choices in the lob-pass problem change the environment itself. The strategies proposed for the lob-pass problem repeat the following procedure: (i) observe the reaction of the unknown environment, (ii) estimate the environment, (iii) find the optimal ‘stationary’ strategy for the estimated environment, (iv) make the choice prescribed by that strategy. Moreover, the criterion used to evaluate strategies in these studies is the loss due to uncertainty about the environment, measured against the optimal ‘stationary’ strategy for the known-environment case. To judge whether such policies are appropriate, we need to know the optimal strategy for the known-environment case, which may not be ‘stationary’. The present paper calculates it. It also shows that the ‘matching condition’ assumed in these past studies is a necessary and sufficient condition for the optimal strategy to be independent of the stopping time of the game. The meaning and appropriateness of the matching condition are discussed. Finally, asymptotic optimality is defined. We prove that the stationary strategy can be asymptotically optimal against an opponent with a forgetting factor, but that no strategy is asymptotically optimal against an opponent without a forgetting factor.
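The four-step procedure (i) to (iv) described in the abstract can be illustrated with a toy simulation. The sketch below is not the model or the algorithm of the paper or of Abe and Takeuchi: it assumes, purely for illustration, that each action's success probability is a linear function of an exponentially discounted lob rate, that the player can track that rate from its own past choices, and that a small amount of random exploration is acceptable. All names (`fit_line`, `best_stationary_rate`, `run_loop`) and all parameter values are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x."""
    n = len(xs)
    if n == 0:
        return 0.5, 0.0              # no data yet: uninformative guess
    mx, my = sum(xs) / n, sum(ys) / n
    var = sum((x - mx) ** 2 for x in xs)
    if var < 1e-12:
        return my, 0.0               # no spread in x: slope not identifiable
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / var
    return my - slope * mx, slope

def clip01(p):
    return min(1.0, max(0.0, p))

def best_stationary_rate(lob_curve, pass_curve, grid=1001):
    """Grid-search the lob rate r maximizing r*p_lob(r) + (1-r)*p_pass(r).

    Each curve is an (intercept, slope) pair; linearity is an assumption.
    """
    def value(r):
        p_lob = clip01(lob_curve[0] + lob_curve[1] * r)
        p_pass = clip01(pass_curve[0] + pass_curve[1] * r)
        return r * p_lob + (1 - r) * p_pass
    return max((i / (grid - 1) for i in range(grid)), key=value)

def run_loop(true_lob, true_pass, steps=500, forgetting=0.95,
             explore=0.1, seed=0):
    """Observe / estimate / optimize / choose loop on the toy environment."""
    rng = random.Random(seed)
    rate = 0.5                        # discounted lob rate seen by the opponent
    data = {True: ([], []), False: ([], [])}   # per action: (rates, outcomes)
    target = 0.5                      # current estimate of the best lob rate
    for _ in range(steps):
        # (iv) choose according to the current stationary strategy,
        #      mixed with uniform exploration (an added assumption)
        lob = rng.random() < explore * 0.5 + (1 - explore) * target
        # (i) observe the environment's reaction
        curve = true_lob if lob else true_pass
        success = rng.random() < clip01(curve[0] + curve[1] * rate)
        data[lob][0].append(rate)
        data[lob][1].append(1.0 if success else 0.0)
        rate = forgetting * rate + (1 - forgetting) * (1.0 if lob else 0.0)
        # (ii) estimate the reaction curves from the data so far
        est_lob = fit_line(*data[True])
        est_pass = fit_line(*data[False])
        # (iii) optimal stationary strategy for the estimated environment
        target = best_stationary_rate(est_lob, est_pass, grid=101)
    return target

if __name__ == "__main__":
    # Hypothetical curves: lobbing more makes lobs worse and passes better.
    print(best_stationary_rate((0.9, -0.6), (0.3, 0.2)))
    print(run_loop((0.9, -0.6), (0.3, 0.2)))
```

Note that this loop only targets the best stationary lob rate. As the abstract points out, the truly optimal strategy for a known environment may not be stationary, so a loop of this kind is measured against a benchmark that may itself be suboptimal.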
