An acquisition of evaluation function for shogi by learning self-play

Article ID: iaor20021379
Country: United Kingdom
Volume: 8
Issue: 3
Start Page Number: 305
End Page Number: 315
Publication Date: May 2001
Journal: International Transactions in Operational Research
Authors: , ,
Keywords: recreation & tourism
Abstract:

Since Deep Blue, a chess program, beat the human world chess champion, recent interest in computer games has turned to shogi. However, the search space of shogi is larger than that of chess, and captured pieces can be returned to play in shogi. To overcome these difficulties, we propose a reinforcement learning method based on self-play for obtaining a static evaluation function, i.e. a map from shogi positions to real values. The proposed method is based on temporal difference learning, developed by R. Sutton and applied to backgammon by G. Tesauro. In our method, a neural network, which takes a board description of a shogi position as input and outputs the estimated winning percentage from that position, is trained by self-play alone, without any shogi-specific knowledge. To show the effectiveness of the obtained evaluation function, computational experiments are presented.
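The abstract does not give implementation details, so the following is only a minimal sketch of the technique it names: temporal-difference (TD(0)) learning of a neural-network evaluation function from self-play outcomes. The feature size, network shape, learning rate, and the placeholder "game" generator are illustrative assumptions, not taken from the paper; a real implementation would encode legal shogi positions and use the learned value to select moves.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FEATURES = 64   # assumed length of the board feature vector
N_HIDDEN = 32     # assumed hidden-layer width
ALPHA = 0.01      # learning rate

# One hidden layer, sigmoid output: V(s) ~ estimated winning percentage.
W1 = rng.normal(0.0, 0.1, (N_HIDDEN, N_FEATURES))
W2 = rng.normal(0.0, 0.1, N_HIDDEN)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def value(features):
    """Forward pass: board features -> winning-probability estimate."""
    h = np.tanh(W1 @ features)
    return sigmoid(W2 @ h), h

def td_update(features, target):
    """One gradient step pulling V(features) toward the TD target."""
    global W1, W2
    v, h = value(features)
    delta = target - v                   # TD error
    grad_out = v * (1.0 - v)             # sigmoid derivative
    dW2 = delta * grad_out * h
    dW1 = np.outer(delta * grad_out * W2 * (1.0 - h ** 2), features)
    W2 += ALPHA * dW2
    W1 += ALPHA * dW1

def play_random_game(max_moves=100):
    """Placeholder for self-play: random position features and a random
    final outcome (1 = win, 0 = loss)."""
    positions = [rng.random(N_FEATURES) for _ in range(max_moves)]
    outcome = float(rng.integers(0, 2))
    return positions, outcome

for game in range(1000):
    positions, outcome = play_random_game()
    # Sweep backward: the terminal target is the game outcome; each earlier
    # position's target is the current value of its successor (TD(0)).
    next_target = outcome
    for feats in reversed(positions):
        td_update(feats, next_target)
        next_target, _ = value(feats)
```

Using the game outcome only at the terminal position and bootstrapping intermediate targets from successor values is what lets the network learn without hand-crafted shogi knowledge, mirroring Tesauro's TD-Gammon approach cited in the abstract.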
