The learning behavior of hierarchical structure stochastic automata operating in a general nonstationary multiteacher environment is considered. It is shown that a new learning algorithm, an extended form of the relative reward strength algorithm, ensures convergence with probability 1 to the optimal path.
CITATION STYLE
Baba, N., & Mogami, Y. (2003). A new learning algorithm for the hierarchical structure learning automata operating in the general nonstationary multiteacher environment. In Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science) (Vol. 2773 PART 1, pp. 1122–1128). Springer Verlag. https://doi.org/10.1007/978-3-540-45224-9_151