The linear contextual bandit is a sequential decision-making problem in which an agent repeatedly chooses among actions given their corresponding contexts. As large-scale data sets become increasingly common, we study linear contextual bandits in high-dimensional settings. Recent work employs matrix sketching methods to accelerate contextual bandits. However, the matrix approximation error introduces additional terms into the regret bound. In this paper, we first propose a novel matrix sketching method called Spectral Compensation Frequent Directions (SCFD). We then propose an efficient approach for contextual bandits that adopts SCFD to approximate the covariance matrices. By maintaining and manipulating sketched matrices, our method needs only O(md) space and O(md) update time in each round, where d is the dimensionality of the data and m is the sketching size. Theoretical analysis shows that our method has better regret bounds than previous methods in high-dimensional cases. Experimental results demonstrate the effectiveness of our algorithm and verify our theoretical guarantees.
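To make the sketching idea concrete, the following is a minimal illustration of a Frequent Directions sketch that also accumulates the spectral mass removed at each shrink step in a scalar, so that the covariance matrix A^T A can be approximated by Z^T Z + alpha I. It is a sketch under stated assumptions, not the SCFD algorithm as specified in the paper: the class name `CompensatedFD`, the attributes `Z` and `alpha`, the 2m-row buffer, and the exact compensation rule are all hypothetical choices made for illustration.

```python
import numpy as np


class CompensatedFD:
    """Frequent Directions with a scalar compensation term (illustrative only).

    Assumption: the compensation rule (add the full shrinkage delta to alpha)
    is chosen for illustration and may differ from the paper's SCFD.
    """

    def __init__(self, d, m):
        if not 0 < m < d:
            raise ValueError("sketch size m must satisfy 0 < m < d")
        self.d = d
        self.m = m
        self.Z = np.zeros((2 * m, d))  # buffer: O(md) memory
        self.alpha = 0.0               # accumulated shrinkage
        self._next_row = 0             # index of the next free buffer row

    def update(self, x):
        """Insert one context vector x of shape (d,)."""
        if self._next_row == 2 * self.m:
            self._shrink()
        self.Z[self._next_row] = x
        self._next_row += 1

    def _shrink(self):
        # SVD of the full buffer; runs once every m insertions, so the
        # O(m^2 d) cost is O(md) per update when amortized.
        _, s, vt = np.linalg.svd(self.Z, full_matrices=False)
        delta = s[self.m] ** 2  # (m+1)-th largest squared singular value
        s_shrunk = np.sqrt(np.maximum(s ** 2 - delta, 0.0))
        z_new = np.zeros_like(self.Z)
        z_new[: len(s)] = s_shrunk[:, None] * vt
        self.Z = z_new            # rows m, m+1, ... are now zero
        self.alpha += delta       # remember what was subtracted
        self._next_row = self.m

    def approx_covariance(self):
        """Dense d x d approximation Z^T Z + alpha I (for inspection only)."""
        return self.Z.T @ self.Z + self.alpha * np.eye(self.d)


# Usage: sketch 200 random contexts in d = 20 dimensions with m = 5.
rng = np.random.default_rng(0)
contexts = rng.standard_normal((200, 20))
sketch = CompensatedFD(d=20, m=5)
for x in contexts:
    sketch.update(x)
```

With this particular rule, Z^T Z never exceeds the true A^T A and Z^T Z + alpha I never falls below it (in the positive semidefinite order), which is the standard Frequent Directions guarantee. Forming the dense d x d matrix in `approx_covariance` defeats the O(md) budget and is included only to make the approximation easy to check; a bandit algorithm would work with `Z` and `alpha` directly.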
Chen, C., Luo, L., Zhang, W., Yu, Y., & Lian, Y. (2020). Efficient and robust high-dimensional linear contextual bandits. In IJCAI International Joint Conference on Artificial Intelligence (Vol. 2021-January, pp. 4259–4265). International Joint Conferences on Artificial Intelligence. https://doi.org/10.24963/ijcai.2020/588