First results from using temporal difference learning in shogi

Abstract

This paper describes first results from the application of Temporal Difference learning [1] to shogi. We report on experiments to determine whether sensible values for shogi pieces can be obtained in the same manner as for western chess pieces [2]. The learning is obtained entirely from randomised self-play, without access to any form of expert knowledge. The piece values are used in a simple search program that chooses shogi moves from a shallow lookahead, using piece values to evaluate the leaves, with a random tie-break at the top level. Temporal difference learning is used to adjust the piece values over the course of a series of games. The method is successful in learning values that perform well in matches against hand-crafted values.
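The abstract does not give implementation details, but a TD(λ)-style update of material weights, of the general kind described, might look like the following Python sketch. The piece list, the sigmoid mapping of material balance to a win expectancy, and the parameter values are illustrative assumptions, not the authors' exact formulation.

```python
import math

# Hypothetical labels for the basic (unpromoted) shogi piece types.
PIECE_TYPES = ["pawn", "lance", "knight", "silver", "gold", "bishop", "rook"]

def predicted_outcome(counts, values):
    """Squash the weighted material balance into a win expectancy in (0, 1).

    counts[p] is the learner's count of piece type p minus the opponent's;
    values[p] is the learned weight for that type.
    """
    balance = sum(counts[p] * values[p] for p in PIECE_TYPES)
    return 1.0 / (1.0 + math.exp(-balance))

def td_lambda_update(values, positions, result, alpha=0.01, lam=0.7):
    """One TD(lambda)-style pass over the positions of a finished game.

    positions: list of material-count dicts, one per learner move.
    result: final outcome from the learner's side (1 win, 0 loss, 0.5 draw).
    Each weight is nudged toward the temporal difference between successive
    predictions, with eligibility traces decayed by lam.
    """
    traces = {p: 0.0 for p in PIECE_TYPES}
    for t, counts in enumerate(positions):
        v_t = predicted_outcome(counts, values)
        # Target is the next prediction, or the game result at the end.
        v_next = result if t + 1 == len(positions) else predicted_outcome(positions[t + 1], values)
        delta = v_next - v_t
        grad_scale = v_t * (1.0 - v_t)  # derivative of the sigmoid
        for p in PIECE_TYPES:
            traces[p] = lam * traces[p] + grad_scale * counts[p]
            values[p] += alpha * delta * traces[p]
    return values
```

Repeating such an update over many randomised self-play games would drive the weights toward relative piece values consistent with game outcomes, which is the effect the abstract reports.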

Citation (APA)

Beal, D. F., & Smith, M. C. (1999). First results from using temporal difference learning in shogi. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 1558, pp. 113–125). Springer Verlag. https://doi.org/10.1007/3-540-48957-6_7
