This paper presents an experimental comparison of supervised and reinforcement learning algorithms for the game of Othello. Motivated by these results, a new learning algorithm, MOUSE(μ) (MOnte-Carlo learning Using heuriStic Error reduction), has been developed. MOUSE uses a heuristic model of past experience to improve generalization and reduce noisy value estimates. The algorithm was able to tune the parameter vector of a huge linear system consisting of about 1.5 million parameters and finished in fourth place in a recent GGS Othello tournament, a significant result for a self-teaching algorithm. Besides the theoretical aspects of the learning methods used, experimental results and comparisons are presented and discussed. These results demonstrate the advantages and drawbacks of existing learning approaches in strategy games and the potential of the new algorithm. © Springer-Verlag Berlin Heidelberg 2003.
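The abstract's core idea, Monte-Carlo learning of a large linear evaluation function from game outcomes, can be illustrated with a minimal sketch. Everything below (the toy feature mapping, feature count, learning rate, and update loop) is an illustrative assumption, not the paper's actual MOUSE(μ) implementation or its heuristic error-reduction model:

```python
import random

NUM_FEATURES = 16  # toy feature count; the paper reports ~1.5 million weights
ALPHA = 0.01       # learning rate (assumed for this sketch)

def features(position):
    """Map a game position to a binary feature vector (toy stand-in for
    real Othello pattern features). A constant bias feature comes first."""
    random.seed(position)  # deterministic pseudo-features for the demo
    return [1] + [random.randint(0, 1) for _ in range(NUM_FEATURES - 1)]

def evaluate(weights, position):
    """Linear evaluation: dot product of weights and position features."""
    return sum(w * f for w, f in zip(weights, features(position)))

def monte_carlo_update(weights, episode_positions, final_outcome):
    """After a finished game, nudge every visited position's value toward
    the observed outcome -- the core Monte-Carlo learning step."""
    for pos in episode_positions:
        error = final_outcome - evaluate(weights, pos)
        f = features(pos)
        for i in range(NUM_FEATURES):
            weights[i] += ALPHA * error * f[i]
    return weights

# One self-play game: positions 1..3 were visited, the game was won (+1).
weights = [0.0] * NUM_FEATURES
weights = monte_carlo_update(weights, episode_positions=[1, 2, 3], final_outcome=1.0)
```

After the update, positions from the winning game evaluate slightly above zero; repeated over many self-play games, the weights converge toward the average outcome per feature pattern.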
CITATION STYLE
Tournavitis, K. (2003). MOUSE(μ): A self-teaching algorithm that achieved master-strength at Othello. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 2883, 11–28. https://doi.org/10.1007/978-3-540-40031-8_2