In this paper we examine the application of temporal difference methods to learning a linear approximation of the state value function in the game of give-away checkers. Empirical results show that the TD(λ) algorithm can be successfully used to improve the quality of the playing policy in this domain. Training games against both strong and random opponents were considered. The results show that learning only from negative game outcomes improved the performance of the learning player against strong opponents.
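The core technique the abstract refers to, TD(λ) with a linear state-value function, can be sketched as follows. This is a generic illustration with accumulating eligibility traces, not the paper's exact formulation; the feature mapping `phi`, the learning rate, and the trace decay are assumptions for the example.

```python
import numpy as np

def td_lambda_update(w, z, phi_s, phi_next, reward,
                     alpha=0.01, gamma=1.0, lam=0.7):
    """One TD(lambda) step with a linear value function V(s) = w . phi(s).

    w        -- weight vector of the linear approximation
    z        -- eligibility trace vector (same shape as w)
    phi_s    -- feature vector of the current state
    phi_next -- feature vector of the next state (all zeros if terminal)
    reward   -- reward observed on the transition (e.g. the game outcome
                at a terminal position, 0 otherwise)
    """
    delta = reward + gamma * (w @ phi_next) - (w @ phi_s)  # TD error
    z = gamma * lam * z + phi_s                            # accumulate trace
    w = w + alpha * delta * z                              # weight update
    return w, z
```

In a self-play or training-game setting, this update would be applied once per move along the game trajectory, with the traces `z` reset to zero at the start of each game.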
Mańdziuk, J., & Osman, D. (2004). Temporal difference approach to playing give-away checkers. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 3070, pp. 909–914). Springer Verlag. https://doi.org/10.1007/978-3-540-24844-6_141