Efficient and robust high-dimensional linear contextual bandits

Abstract

The linear contextual bandit is a sequential decision-making problem in which an agent repeatedly chooses among actions given their corresponding contexts. As large-scale data sets become increasingly common, we study linear contextual bandits in high-dimensional settings. Recent work has focused on employing matrix sketching methods to accelerate contextual bandits. However, the matrix approximation error introduces additional terms into the regret bound. In this paper we first propose a novel matrix sketching method called Spectral Compensation Frequent Directions (SCFD). We then propose an efficient approach for contextual bandits that uses SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method needs only O(md) space and O(md) update time in each round, where d is the dimensionality of the data and m is the sketch size. Theoretical analysis shows that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.
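
To make the idea of sketching a covariance matrix concrete, here is a minimal Python sketch of the standard Frequent Directions algorithm, the baseline technique that SCFD's name refers to. It is not SCFD itself: the spectral compensation step is not described in the abstract and is not reproduced here. The class name `FrequentDirections`, the 2m-row buffer, and the assumption d >= 2m are illustrative choices, not details taken from the paper.

```python
import numpy as np


class FrequentDirections:
    """Plain Frequent Directions sketch (illustrative; not the paper's SCFD).

    Keeps a (2m x d) buffer B such that B.T @ B approximates the
    covariance-like matrix A.T @ A of all rows seen so far.
    """

    def __init__(self, d, m):
        # Assumption for this sketch: d >= 2m, so the thin SVD has 2m rows.
        assert d >= 2 * m, "this sketch assumes d >= 2m"
        self.m = m
        self.B = np.zeros((2 * m, d))  # O(md) memory
        self.next_row = 0              # index of the first free row

    def update(self, x):
        """Insert one context vector x of length d."""
        if self.next_row == 2 * self.m:
            self._shrink()
        self.B[self.next_row] = x
        self.next_row += 1

    def _shrink(self):
        """Shrink the singular values so that the last m rows become zero."""
        _, s, Vt = np.linalg.svd(self.B, full_matrices=False)
        delta = s[self.m] ** 2  # squared (m+1)-th largest singular value
        s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        self.B = s_shrunk[:, None] * Vt
        self.next_row = self.m  # rows m .. 2m-1 are now zero


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((200, 50))
    fd = FrequentDirections(d=50, m=5)
    for row in A:
        fd.update(row)
    # Spectral-norm error of the sketched covariance.
    err = np.linalg.norm(A.T @ A - fd.B.T @ fd.B, 2)
    print(err, np.linalg.norm(A, "fro") ** 2 / fd.m)
```

In this variant the SVD costs O(m^2 d) but runs only once every m insertions, so the amortized cost per update is O(md), matching the order quoted in the abstract. The approximation error of plain Frequent Directions is at most the squared Frobenius norm of the data divided by m, and it is this kind of error that adds terms to the regret bound when the sketch replaces the exact covariance matrix.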

Cite

Chen, C., Luo, L., Zhang, W., Yu, Y., & Lian, Y. (2020). Efficient and robust high-dimensional linear contextual bandits. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 4259–4265). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/588
