For many tasks, the successful application of deep learning relies on large amounts of training data labeled to a high standard. Much of the data in real-world applications, however, suffers from label noise. Data annotation is far more expensive and resource-consuming than data collection, which restricts the successful deployment of deep learning to applications with very large, well-labeled datasets. To address this problem, we propose a recursive ensemble learning approach that maximizes the utilization of data. Its core ideas are a disagreement-based annotation method and a set of voting strategies. We also provide guidelines for choosing the most suitable among many candidate neural networks, together with a pruning strategy that simplifies this choice. The approach is especially effective when the original dataset contains significant label noise. We conducted experiments on the Cats versus Dogs dataset, in which significant amounts of label noise were present, and on the CIFAR-10 dataset, achieving promising results.
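As a rough illustration of the disagreement-and-voting idea described above, the following sketch (not the authors' implementation; all names and the `agreement_threshold` parameter are hypothetical) performs majority voting over an ensemble's predicted labels and flags the samples on which the models disagree, which a recursive scheme could then re-annotate or hold out in the next round:

```python
# Minimal sketch, assuming a simple majority-vote ensemble with a
# disagreement flag; this is an illustration, not the paper's method.
from collections import Counter

def ensemble_vote(predictions_per_model, agreement_threshold=1.0):
    """predictions_per_model: list of per-model label lists, equal length.

    Returns (voted_labels, disagreement_flags). A sample is flagged when
    the winning label's vote share falls below agreement_threshold.
    """
    n_models = len(predictions_per_model)
    voted, flags = [], []
    for sample_preds in zip(*predictions_per_model):
        label, count = Counter(sample_preds).most_common(1)[0]
        voted.append(label)
        flags.append(count / n_models < agreement_threshold)
    return voted, flags

# Three hypothetical models labeling five samples (0 = cat, 1 = dog):
preds = [
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 1, 1],
]
labels, disputed = ensemble_vote(preds)
# labels   -> [0, 1, 1, 0, 1]
# disputed -> [False, False, True, True, False]
```

Unanimous samples keep their voted label; disputed samples are the natural candidates for re-annotation in a recursive pass.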
Wang, Y., Yang, Y., Liu, Y. X., & Bharath, A. A. (2019). A Recursive Ensemble Learning Approach with Noisy Labels or Unlabeled Data. IEEE Access, 7, 36459–36470. https://doi.org/10.1109/ACCESS.2019.2904403