Active learning has been widely studied and applied because it can train accurate models with significantly fewer labeled instances. To reduce the time complexity of active learning, so that the oracle need not wait for the algorithm to supply instances for labeling, we propose a new active learning method that combines batch sampling and direct boundary annotation with a two-stage sampling strategy. In the first stage, the initial seeds, which determine the location of the boundary annotation, are selected by rejection sampling based on the clustering structure of the instances, ensuring that the seeds approximate the data distribution while remaining diverse. In the second stage, by treating instance sampling as the selection of a representative within a local region and maximizing the reward gained from selecting an instance as the new representative, we propose a novel mechanism that maintains the local representativeness and diversity of the query instances. Compared with conventional pool-based active learning, our method does not retrain the model in each iteration, which reduces both computation and time consumption. Experimental results on three public datasets show that the proposed method performs comparably to uncertainty-based active learning methods, demonstrating that its sampling mechanism is effective. It performs well without retraining the model in each iteration and does not depend on the precision of the model.
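The two-stage strategy described above can be illustrated with a minimal sketch: stage one rejection-samples seeds guided by a clustering of the pool (clusters are drawn in proportion to their size so seeds follow the data distribution, and candidates too close to accepted seeds are rejected for diversity); stage two picks, per local region, the candidate with the highest "reward," taken here to be representativeness (smallest mean distance to the region). The clustering method, distance threshold, and reward definition are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def _kmeans_labels(X, k, iters=20, rng=None):
    """Minimal k-means (Lloyd's algorithm), used only to expose the
    clustering structure that guides seed sampling."""
    rng = np.random.default_rng(rng)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = X[labels == c].mean(axis=0)
    return labels

def stage_one_seeds(X, n_seeds, n_clusters=4, min_dist=0.5, rng=None):
    """Stage 1 (sketch): rejection-sample initial seeds.
    Clusters are drawn with probability proportional to their size, so
    accepted seeds approximate the data distribution; a candidate closer
    than min_dist to any accepted seed is rejected to enforce diversity."""
    rng = np.random.default_rng(rng)
    labels = _kmeans_labels(X, n_clusters, rng=rng)
    sizes = np.bincount(labels, minlength=n_clusters).astype(float)
    probs = sizes / sizes.sum()
    seeds, tries = [], 0
    while len(seeds) < n_seeds:
        tries += 1
        if tries > 1000:          # relax the diversity constraint if stuck
            min_dist *= 0.5
            tries = 0
        c = rng.choice(n_clusters, p=probs)
        idx = int(rng.choice(np.flatnonzero(labels == c)))
        if all(np.linalg.norm(X[idx] - X[s]) >= min_dist for s in seeds):
            seeds.append(idx)     # accept: far enough from existing seeds
    return seeds

def stage_two_query(X, seeds, candidates):
    """Stage 2 (sketch): in each local region (points nearest a seed),
    select as the new representative the candidate maximizing a reward,
    here the negative mean distance to the region's members."""
    regions = {s: [] for s in seeds}
    for i in candidates:
        nearest = min(seeds, key=lambda s: np.linalg.norm(X[i] - X[s]))
        regions[nearest].append(i)
    batch = []
    for members in regions.values():
        if members:
            best = max(members, key=lambda i: -np.mean(
                [np.linalg.norm(X[i] - X[j]) for j in members]))
            batch.append(best)
    return batch
```

Because neither stage consults a trained model, the query batch can be produced up front, matching the abstract's point that the oracle never waits for model retraining between annotations.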
Luo, R., & Wang, X. (2020). Batch Active Learning with Two-Stage Sampling. IEEE Access, 8, 46519–46528. https://doi.org/10.1109/ACCESS.2020.2979315