Reordering sparsification of kernel machines in approximate policy iteration

Chunming Liu; Jinze Song; Xin Xu; Pengcheng Zhang

Conference Proceedings


Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 5552 LNCS (Part 2), pp. 398-407 (2009)

DOI: 10.1007/978-3-642-01510-6_46



Abstract

Approximate policy iteration (API), which includes least-squares policy iteration (LSPI) and its kernelized version (KLSPI), has received increasing attention due to its good convergence and generalization abilities in solving difficult reinforcement learning problems. However, the sparsification of feature vectors, especially of kernel-based features, greatly influences the performance of API methods. In this paper, a novel reordering sparsification method is proposed for sparsifying kernel machines in API. The method adopts a greedy strategy that adds the sample with the maximal squared approximation error to the kernel dictionary, so that the samples are reordered to improve the performance of kernel sparsification. Experimental results on the learning control of an inverted pendulum verify that, with the proposed algorithm, the kernel dictionary is smaller than that produced by the previous sequential sparsification algorithm at the same level of sparsity, and that the performance of the control policies learned by KLSPI can also be improved. © 2009 Springer Berlin Heidelberg.
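The greedy selection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes that the "squared approximation error" is the approximate-linear-dependence (ALD) residual, that a Gaussian kernel is used, and that selection stops once the largest residual falls to a threshold `mu` > 0. The function names (`rbf`, `solve`, `ald_error`, `sequential_sparsify`, `greedy_sparsify`) and all parameter values are hypothetical and chosen only for illustration. A sequential variant is included for comparison.

```python
import math

def rbf(x, y, width=1.0):
    """Gaussian kernel between two equal-length tuples (assumed kernel)."""
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * width ** 2))

def solve(A, b):
    """Solve A c = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    c = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * c[j] for j in range(i + 1, n))
        c[i] = s / M[i][i]
    return c

def ald_error(x, dictionary, kernel):
    """Squared error of approximating the feature of x by the dictionary span."""
    if not dictionary:
        return kernel(x, x)
    K = [[kernel(a, b) for b in dictionary] for a in dictionary]
    k = [kernel(a, x) for a in dictionary]
    c = solve(K, k)
    return kernel(x, x) - sum(ci * ki for ci, ki in zip(c, k))

def sequential_sparsify(samples, kernel, mu):
    """Visit samples in their given order; keep those with error above mu."""
    dictionary = []
    for x in samples:
        if ald_error(x, dictionary, kernel) > mu:
            dictionary.append(x)
    return dictionary

def greedy_sparsify(samples, kernel, mu):
    """Repeatedly add the remaining sample with the largest error (assumed rule)."""
    dictionary = []
    remaining = list(samples)
    while remaining:
        errors = [ald_error(x, dictionary, kernel) for x in remaining]
        best = max(range(len(remaining)), key=errors.__getitem__)
        if errors[best] <= mu:  # every remaining sample is well approximated
            break
        dictionary.append(remaining.pop(best))
    return dictionary

# Illustrative data only: 1-D points on a grid, not the pendulum samples.
demo_samples = [(i * 0.25,) for i in range(21)]
print(len(sequential_sparsify(demo_samples, rbf, 0.1)),
      len(greedy_sparsify(demo_samples, rbf, 0.1)))
```

The sketch recomputes the kernel matrix on every call for clarity. A practical version would update the inverse kernel matrix incrementally. Whether the greedy order actually yields a smaller dictionary depends on the data and the threshold, and the sketch makes no claim about reproducing the paper's results.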


Citation (APA)

Liu, C., Song, J., Xu, X., & Zhang, P. (2009). Reordering sparsification of kernel machines in approximate policy iteration. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5552 LNCS, pp. 398–407). https://doi.org/10.1007/978-3-642-01510-6_46
