Within weakly supervised learning, the problem of learning from label proportions has received growing attention in the machine learning community. Learning from such data, where only the proportion of each label within a group of examples is known, requires specific learning methodologies. The state of the art now includes many techniques for learning the most common types of classifiers from label proportions. However, for the sake of simplicity, all these contributions validate the proposed learning techniques on synthetic data, and evaluation with real label proportions data has barely been explored. This paper proposes a complete framework for model validation and evaluation when only data labeled with label proportions are available. An approximation to the number of true positive examples is proposed, based on a reasonable assumption, which enables the completion of a confusion matrix and thus the calculation of all the standard evaluation metrics. Additionally, the performance of different approaches to cross-validation in this specific problem is discussed. Two large empirical studies have been carried out to support the discussion of both contributions.
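To make the confusion matrix idea concrete, here is a minimal sketch in Python. It does not reproduce the paper's approximation, which the abstract does not spell out. It only shows, for a single group of examples with known label counts, the range in which the number of true positives must lie and how any chosen value in that range fixes the other three cells. The names `Bag`, `tp_bounds`, and `complete_confusion_matrix` are hypothetical and used for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Bag:
    """A group of examples whose individual labels are unknown (hypothetical type)."""
    size: int            # number of examples in the bag
    actual_pos: int      # positives implied by the known label proportion
    predicted_pos: int   # examples the classifier predicts as positive


def tp_bounds(bag: Bag) -> tuple[int, int]:
    """Return the smallest and largest feasible number of true positives."""
    # Predicted and actual positives must overlap if together they exceed the bag size.
    lower = max(0, bag.actual_pos + bag.predicted_pos - bag.size)
    # True positives cannot exceed either the actual or the predicted positives.
    upper = min(bag.actual_pos, bag.predicted_pos)
    return lower, upper


def complete_confusion_matrix(bag: Bag, tp: int) -> dict[str, int]:
    """Fill in FP, FN and TN once a feasible true positive count is chosen."""
    lower, upper = tp_bounds(bag)
    if not lower <= tp <= upper:
        raise ValueError(f"tp must lie in [{lower}, {upper}]")
    fp = bag.predicted_pos - tp
    fn = bag.actual_pos - tp
    tn = bag.size - tp - fp - fn
    return {"TP": tp, "FP": fp, "FN": fn, "TN": tn}


def precision_recall(cm: dict[str, int]) -> tuple[float, float]:
    """Compute precision and recall from a completed confusion matrix."""
    predicted = cm["TP"] + cm["FP"]
    actual = cm["TP"] + cm["FN"]
    precision = cm["TP"] / predicted if predicted else 0.0
    recall = cm["TP"] / actual if actual else 0.0
    return precision, recall


# Illustrative values only: 10 examples, 6 actual positives, 7 predicted positives.
bag = Bag(size=10, actual_pos=6, predicted_pos=7)
bounds = tp_bounds(bag)
cm = complete_confusion_matrix(bag, tp=5)  # tp=5 is an arbitrary feasible choice
```

Once a true positive count is fixed, every standard metric follows from the four cells, so an approximation of that single quantity is enough for evaluation. The bounds also indicate how much uncertainty remains for each bag.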
CITATION STYLE
Hernández-González, J. (2019). A framework for evaluation in learning from label proportions. Progress in Artificial Intelligence, 8(3), 359–373. https://doi.org/10.1007/s13748-019-00187-x