Labeled datasets can be difficult to gather for many tasks. Self-training, also known as pseudo-labeling, addresses the problem of insufficient labeled training data: the classifier is first trained on a small labeled dataset and then further trained on an additional, unlabeled dataset, using its own predictions as labels, provided those predictions are made with sufficiently high confidence. Using a credible interval based on MC-dropout as the confidence measure, the proposed method achieves substantially better results than several other pseudo-labeling methods and outperforms the previous state-of-the-art pseudo-labeling technique by 7% on the MNIST dataset. Beyond learning from large, static unlabeled datasets, the suggested approach may be better suited than others to online learning, where the classifier continually receives new unlabeled data. It may also be applicable to the recent method of pseudo-gradients for training long sequential neural networks.
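The selection step described above can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: it assumes a `stochastic_predict` function that returns softmax probabilities with dropout kept active, runs several stochastic forward passes, and accepts a pseudo-label only when the lower quantile of the top-class probability across passes (a credible-interval-style confidence test) clears a threshold. The function name, quantile level, and threshold are illustrative choices.

```python
import numpy as np

def mc_dropout_pseudo_labels(stochastic_predict, X, T=20,
                             lower_q=0.025, threshold=0.8):
    """Select confident pseudo-labels for unlabeled inputs X.

    stochastic_predict(X) -> (n, n_classes) softmax probabilities,
    sampled with dropout active (so repeated calls differ).
    A sample is accepted when the `lower_q` quantile of its top-class
    probability over T passes is at least `threshold`.
    """
    probs = np.stack([stochastic_predict(X) for _ in range(T)])  # (T, n, C)
    mean = probs.mean(axis=0)                   # average over MC passes
    labels = mean.argmax(axis=1)                # candidate pseudo-labels
    n = probs.shape[1]
    top = probs[:, np.arange(n), labels]        # (T, n) top-class prob per pass
    lower = np.quantile(top, lower_q, axis=0)   # credible-interval lower bound
    keep = lower >= threshold                   # confidence gate
    return labels[keep], keep

# Toy usage: a fake stochastic predictor with one confident and one
# ambiguous sample (stands in for a dropout-enabled network).
rng = np.random.default_rng(0)

def fake_predict(X):
    p = np.array([[0.05, 0.95],    # confidently class 1
                  [0.50, 0.50]])   # ambiguous
    p = np.clip(p + rng.normal(0.0, 0.02, p.shape), 1e-6, None)
    return p / p.sum(axis=1, keepdims=True)

labels, keep = mc_dropout_pseudo_labels(fake_predict, np.zeros((2, 1)), T=50)
```

Only the confident sample survives the gate; the ambiguous one is left unlabeled and can be revisited in a later self-training round.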
Bank, D., Greenfeld, D., & Hyams, G. (2019). Improved training for self training by confidence assessments. In Advances in Intelligent Systems and Computing (Vol. 858, pp. 163–173). Springer Verlag. https://doi.org/10.1007/978-3-030-01174-1_13