Balanced linear contextual bandits

Maria Dimakopoulou; Zhengyuan Zhou; Susan Athey; Guido Imbens

Conference Proceedings (Open Access)

33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (2019) 3445-3453

DOI: 10.1609/aaai.v33i01.33013445

46 Citations

70 Readers

Abstract

Contextual bandit algorithms are sensitive to the estimation method of the outcome model as well as the exploration method used, particularly in the presence of rich heterogeneity or complex outcome models, which can lead to difficult estimation problems along the path of learning. We develop algorithms for contextual bandits with linear payoffs that integrate balancing methods from the causal inference literature into their estimation, making the estimation less prone to bias. We provide the first regret bound analyses for linear contextual bandits with balancing and show that our algorithms match the state-of-the-art theoretical guarantees. We demonstrate the strong practical advantage of balanced contextual bandits on a large number of supervised learning datasets and on a synthetic example that simulates model misspecification and prejudice in the initial training data.
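The abstract does not spell out the balancing estimator. As a rough illustration only, the sketch below assumes that balancing takes the form of inverse propensity weighting inside a ridge regression fitted for a single arm: each observation is weighted by the reciprocal of the probability with which that arm was chosen. The function names `solve` and `weighted_ridge`, the regularizer `lam`, and the toy data are hypothetical and are not taken from the paper.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without mutating the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def weighted_ridge(contexts, rewards, propensities, lam=1.0):
    """Ridge estimate for one arm, weighting each sample by 1 / propensity.

    Assumption: this weighting stands in for "balancing"; the abstract
    itself does not state the exact form used.
    """
    d = len(contexts[0])
    A = [[lam if i == j else 0.0 for j in range(d)] for i in range(d)]
    b = [0.0] * d
    for x, r, p in zip(contexts, rewards, propensities):
        w = 1.0 / p  # rarely chosen contexts get larger weight
        for i in range(d):
            b[i] += w * x[i] * r
            for j in range(d):
                A[i][j] += w * x[i] * x[j]
    return solve(A, b)


# Toy data (hypothetical): noiseless rewards from a known linear model.
true_theta = [1.0, -2.0]
contexts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
rewards = [sum(t * xi for t, xi in zip(true_theta, x)) for x in contexts]
propensities = [0.9, 0.1, 0.5, 0.25]

estimate = weighted_ridge(contexts, rewards, propensities, lam=1e-9)
print(estimate)
```

With all propensities equal to 1, `weighted_ridge` reduces to ordinary ridge regression, so the weighting only changes the estimate when the arm was assigned with unequal probabilities across contexts.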

Cite

Dimakopoulou, M., Zhou, Z., Athey, S., & Imbens, G. (2019). Balanced linear contextual bandits. In 33rd AAAI Conference on Artificial Intelligence, AAAI 2019, 31st Innovative Applications of Artificial Intelligence Conference, IAAI 2019 and the 9th AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019 (pp. 3445–3453). AAAI Press. https://doi.org/10.1609/aaai.v33i01.33013445