A KBRL inference metaheuristic with applications

Abstract

In this chapter we propose an inference metaheuristic for Kernel-Based Reinforcement Learning (KBRL) agents, that is, agents operating in a continuous-state MDP. The metaheuristic is developed for the simplified case of greedy-policy RL agents with no receding horizon that learn online in an environment whose feedback is generated by an ergodic, stationary source. We propose two inference strategies: isotropic discrete choice and anisotropic optimization, the former aimed at speed and the latter at generalization capability. We cast classification as an RL problem and test the proposed metaheuristic in two experiments: an image recognition experiment on the Yale Faces database and an experiment on a synthetic data set. We also propose a set of inference filters that increase the agent's vigilance and show that they can prevent the agent from taking erroneous actions in an unknown environment. Finally, two parallel inference algorithms are tested and illustrated in cluster and GPU implementations.
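
The chapter's algorithms are not reproduced in this abstract, but the KBRL setting it builds on can be sketched: the agent estimates the value of each action by kernel-smoothing the rewards observed at nearby states and then acts greedily on those estimates (no receding horizon). The Python sketch below is illustrative only; the names (gaussian_kernel, greedy_action), the Gaussian kernel, the bandwidth, and the transition store are assumptions for this sketch, not the authors' implementation.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=0.5):
    # Isotropic Gaussian kernel over continuous states (assumed kernel choice)
    return np.exp(-np.sum((x - y) ** 2) / (2.0 * bandwidth ** 2))

def greedy_action(state, transitions, actions, bandwidth=0.5):
    """Pick the action with the highest kernel-weighted value estimate.

    `transitions` maps each action to a list of (observed_state, reward)
    pairs; with a greedy policy and no receding horizon, the value of an
    action reduces to a kernel-smoothed average of observed rewards.
    """
    best_action, best_value = None, -np.inf
    for a in actions:
        samples = transitions.get(a, [])
        if not samples:
            continue
        weights = np.array([gaussian_kernel(state, s, bandwidth) for s, _ in samples])
        rewards = np.array([r for _, r in samples])
        if weights.sum() == 0.0:
            continue
        # Normalized kernel regression over the rewards observed for action a
        value = weights @ rewards / weights.sum()
        if value > best_value:
            best_action, best_value = a, value
    return best_action, best_value

# Hypothetical usage with two actions and a few stored transitions
transitions = {
    0: [(np.array([0.1, 0.2]), 1.0), (np.array([0.9, 0.8]), 0.0)],
    1: [(np.array([0.5, 0.5]), 0.5)],
}
action, value = greedy_action(np.array([0.15, 0.25]), transitions, actions=[0, 1])
```

An inference filter of the kind the chapter describes could, under the same assumptions, be layered on top of this: if the best value (or the total kernel mass supporting it) falls below a vigilance threshold, the agent withholds the action rather than act on weak evidence.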

Citation (APA)

Bucur, L., Florea, A., & Chera, C. (2013). A KBRL inference metaheuristic with applications. Studies in Computational Intelligence, 427, 721–749. https://doi.org/10.1007/978-3-642-29694-9_27
