A POMDP formulation of proactive learning

Kyle Hollins Wray; Shlomo Zilberstein

Conference Proceedings (Open Access)

30th AAAI Conference on Artificial Intelligence, AAAI 2016 (2016) 3202-3208

DOI: 10.1609/aaai.v30i1.10400

Abstract

We cast the Proactive Learning (PAL) problem, that is, Active Learning (AL) with multiple reluctant, fallible, cost-varying oracles, as a Partially Observable Markov Decision Process (POMDP). At each time step, the agent selects an oracle to label a data point while maintaining a belief over the true underlying correctness of the labels in its current dataset. The goal is to minimize labeling costs while accounting for the value of obtaining correct labels, thereby maximizing the accuracy of the final classifier. We prove three properties showing that our formulation leads to a structured, bounded-size set of belief points, which enables point-based methods to solve the POMDP effectively. We compare our method with the three original algorithms proposed by Donmez and Carbonell and with a simple baseline. We demonstrate that our approach matches or improves upon the original approach in five different oracle scenarios, each on two datasets. Finally, our algorithm provides a general, well-defined mathematical foundation to build upon.
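
To make the idea of "maintaining a belief" concrete, the following is a minimal sketch of the standard discrete POMDP belief update (a Bayes filter), not the authors' actual model. The two-state toy problem is an assumption made purely for illustration: the state is whether a stored binary label is wrong or correct, each action queries one hypothetical oracle, and the observation is whether that oracle disagrees or agrees with the stored label. The oracle accuracies (0.7 and 0.9) and all names are invented for the example.

```python
def belief_update(belief, transition, observation, action, obs):
    """Standard discrete POMDP belief update (Bayes filter).

    belief[s]                    : current probability of state s
    transition[action][s][s2]    : Pr(s2 | s, action)
    observation[action][s2][obs] : Pr(obs | s2, action)
    """
    n = len(belief)
    unnormalized = []
    for s2 in range(n):
        # Prediction step: probability of reaching s2 under the chosen action.
        predicted = sum(belief[s] * transition[action][s][s2] for s in range(n))
        # Correction step: weight by the likelihood of the observation.
        unnormalized.append(observation[action][s2][obs] * predicted)
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return [p / total for p in unnormalized]


# Hypothetical toy problem (an assumption, not taken from the paper).
# States: 0 = stored label is wrong, 1 = stored label is correct.
# Actions: 0 = query a cheap oracle (accuracy 0.7),
#          1 = query an expensive oracle (accuracy 0.9).
# Observations: 0 = oracle disagrees with the stored label, 1 = oracle agrees.
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]  # querying does not change the true state
transition = [IDENTITY, IDENTITY]
observation = [
    [[0.7, 0.3], [0.3, 0.7]],  # cheap oracle
    [[0.9, 0.1], [0.1, 0.9]],  # expensive oracle
]

prior = [0.5, 0.5]
after_cheap = belief_update(prior, transition, observation, action=0, obs=1)
after_expensive = belief_update(prior, transition, observation, action=1, obs=1)
```

In this toy setting, agreement from the more accurate oracle shifts the belief toward "correct" more strongly than agreement from the cheaper one, which is the kind of cost-versus-information trade-off a POMDP policy would weigh when choosing which oracle to query.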

Citation (APA)

Wray, K. H., & Zilberstein, S. (2016). A POMDP formulation of proactive learning. In 30th AAAI Conference on Artificial Intelligence, AAAI 2016 (pp. 3202–3208). AAAI Press. https://doi.org/10.1609/aaai.v30i1.10400
