First results from using temporal difference learning in shogi

Abstract

This paper describes first results from the application of Temporal Difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. The learning is obtained entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method is successful in learning values that perform well in matches against hand-crafted values.
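The abstract does not give implementation details, but a TD(λ)-style update of material weights, of the general kind described, might look like the following Python sketch. The piece list, the sigmoid mapping of material balance to a win expectancy, and the parameter values are illustrative assumptions, not the authors' exact formulation.

```python
import math

# Hypothetical labels for the basic (unpromoted) shogi piece types.
PIECE_TYPES = ["pawn", "lance", "knight", "silver", "gold", "bishop", "rook"]

def predicted_outcome(counts, values):
    """Squash the weighted material balance into a win expectancy in (0, 1).

    counts[p] is the learner's count of piece type p minus the opponent's;
    values[p] is the learned weight for that type.
    """
    balance = sum(counts[p] * values[p] for p in PIECE_TYPES)
    return 1.0 / (1.0 + math.exp(-balance))

def td_lambda_update(values, positions, result, alpha=0.01, lam=0.7):
    """One TD(lambda)-style pass over the positions of a finished game.

    positions: list of material-count dicts, one per learner move.
    result: final outcome from the learner's side (1 win, 0 loss, 0.5 draw).
    Each weight is nudged toward the temporal difference between successive
    predictions, with eligibility traces decayed by lam.
    """
    traces = {p: 0.0 for p in PIECE_TYPES}
    for t, counts in enumerate(positions):
        v_t = predicted_outcome(counts, values)
        # Target is the next prediction, or the game result at the end.
        v_next = result if t + 1 == len(positions) else predicted_outcome(positions[t + 1], values)
        delta = v_next - v_t
        grad_scale = v_t * (1.0 - v_t)  # derivative of the sigmoid
        for p in PIECE_TYPES:
            traces[p] = lam * traces[p] + grad_scale * counts[p]
            values[p] += alpha * delta * traces[p]
    return values
```

Repeating such an update over many randomised self-play games would drive the weights toward relative piece values consistent with game outcomes, which is the effect the abstract reports.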

Citation (APA)

Beal, D. F., & Smith, M. C. (1999). First results from using temporal difference learning in shogi. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1558, pp. 113–125). Springer Verlag. https://doi.org/10.1007/3-540-48957-6_7
