The quality of a classifier hinges on the availability of training data. In scenarios where data collection is restricted or expensive, e.g., compute-intensive simulations, the training data may be small and/or biased. In principle, data synthesis then allows one to extend the data set. Yet it is difficult for a user to extend the data without any guidance when the data space is unbounded or of high dimensionality. In this article we target the domain expansion problem, i.e., expanding the knowledge of a classifier beyond an initial sample that falls entirely into one class. We first propose a general framework for query synthesis in the one-class setting. Then we present a new query synthesis strategy to quickly explore the data space beyond the initial sample. For the evaluation we derive three options to simulate an oracle in the one-class setting that can answer arbitrary queries. Experiments on both synthetic and real-world data demonstrate that our new query strategy indeed expands the knowledge of a one-class classifier beyond a small and biased initial sample. Our strategy outperforms realistic baselines on most domain expansion problems.
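
For intuition, the following sketch shows what a query-synthesis loop in the one-class setting can look like. It is an assumption-based illustration, not the strategy proposed in the paper: the candidate generator, the simulated oracle, and all parameter values are placeholders, and scikit-learn's OneClassSVM stands in for an arbitrary one-class classifier.

# Illustrative sketch of a query-synthesis loop for one-class active learning.
# NOT the strategy of Englhardt & Böhm (2020); the oracle, candidate generator,
# and parameters below are assumptions chosen only to show the general pattern:
# synthesize a query, ask the oracle, retrain the one-class classifier.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)

# Small, biased initial sample: every point belongs to the target class.
X_labeled = rng.normal(loc=0.0, scale=0.5, size=(20, 2))
y_labeled = np.ones(len(X_labeled))          # +1 = inlier / target class

def oracle(x):
    """Hypothetical simulated oracle: here the 'true' concept is a disc of
    radius 2 around the origin; any query inside it is an inlier."""
    return 1 if np.linalg.norm(x) <= 2.0 else -1

clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1).fit(X_labeled)

for _ in range(30):
    # Synthesize candidate queries by perturbing known points outward,
    # then query the candidate closest to the current decision boundary.
    candidates = X_labeled[rng.integers(len(X_labeled), size=100)] \
                 + rng.normal(scale=1.5, size=(100, 2))
    scores = np.abs(clf.decision_function(candidates))
    query = candidates[np.argmin(scores)]

    label = oracle(query)                     # ask the simulated oracle
    X_labeled = np.vstack([X_labeled, query])
    y_labeled = np.append(y_labeled, label)

    # Retrain on points the oracle confirmed as inliers; labeled outliers
    # implicitly bound the expanded domain.
    clf.fit(X_labeled[y_labeled == 1])

print(f"labeled points: {len(X_labeled)}, inliers: {int((y_labeled == 1).sum())}")

The loop expands the region the classifier accepts as the target class by repeatedly probing near and beyond its current boundary, which is the general behavior the paper's domain expansion setting is concerned with.
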
Englhardt, A., & Böhm, K. (2020). Exploring the unknown – Query synthesis in one-class active learning. In Proceedings of the 2020 SIAM International Conference on Data Mining, SDM 2020 (pp. 145–153). Society for Industrial and Applied Mathematics Publications. https://doi.org/10.1137/1.9781611976236.17