Abstract
In supervised learning, we often face ambiguous (A) samples that are difficult even for domain experts to label. In this paper, we consider a binary classification problem in the presence of such A samples. This problem is substantially different from semi-supervised learning, since unlabeled samples are not necessarily difficult samples. It also differs from 3-class classification with positive (P), negative (N), and A classes, since we do not want to classify test samples into the A class. Our proposed method extends binary classification with a reject option, which trains a classifier and a rejector simultaneously from P and N samples based on the 0-1-c loss with rejection cost c. More specifically, we propose to train a classifier and a rejector under the 0-1-c-d loss using P, N, and A samples, where d is the misclassification penalty for ambiguous samples. In our practical implementation, we use a convex upper bound of the 0-1-c-d loss for computational tractability. Numerical experiments demonstrate that our method can successfully utilize the additional information brought by such A training data.
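To make the loss described above concrete, the following is a minimal Python sketch of a 0-1-c-d style loss and its empirical average. It is an illustration under stated assumptions, not the paper's implementation. The function names, the label encoding (+1 for P, -1 for N, 0 for A), and in particular the rule that an A sample costs d when the classifier commits to a class and 0 when it is rejected are assumptions; the abstract states only that d is the misclassification penalty for ambiguous samples. The convex upper bound used in the practical implementation is not shown.

```python
def zero_one_c_d_loss(label, prediction, rejected, c, d):
    """Loss for one sample.

    label:      +1 (P), -1 (N), or 0 (A); this encoding is hypothetical.
    prediction: +1 or -1, the classifier's output.
    rejected:   True if the rejector abstains on this sample.
    c:          rejection cost applied to P and N samples.
    d:          penalty applied to A samples (see assumption below).
    """
    if label == 0:
        # Assumption: an ambiguous sample costs d when the model commits
        # to P or N, and nothing when the model rejects it.
        return 0.0 if rejected else d
    if rejected:
        # Ordinary 0-1-c behavior: rejecting a P or N sample costs c.
        return c
    # Ordinary 0-1 behavior: 0 if correct, 1 if misclassified.
    return 0.0 if prediction == label else 1.0


def empirical_risk(samples, c, d):
    """Average loss over (label, prediction, rejected) triples."""
    return sum(zero_one_c_d_loss(y, p, r, c, d) for y, p, r in samples) / len(samples)


# Toy data: (label, prediction, rejected)
samples = [
    (+1, +1, False),  # P sample, classified correctly
    (-1, +1, False),  # N sample, misclassified
    (+1, -1, True),   # P sample, rejected
    (0, +1, False),   # A sample, classifier commits
    (0, +1, True),    # A sample, rejected
]
risk = empirical_risk(samples, c=0.3, d=0.5)
print(risk)
```

Under this reading, the A samples push the rejector toward abstaining in regions where experts could not decide, while the P and N samples are treated exactly as in the ordinary 0-1-c loss.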
Cite
Otani, N., Otsubo, Y., Koike, T., & Sugiyama, M. (2020). Binary classification with ambiguous training data. Machine Learning, 109(12), 2369–2388. https://doi.org/10.1007/s10994-020-05915-2