Facial attribute prediction is a facial analysis task that describes images using natural language features. While many works have attempted to optimize prediction accuracy on CelebA, the largest and most widely used facial attribute dataset, few works have analyzed the accuracy of the dataset’s attribute labels. In this paper, we seek to do just that. Despite the popularity of CelebA, we find through quantitative analysis that there are widespread inconsistencies and inaccuracies in its attribute labeling. We estimate that at least one third of all images have one or more incorrect labels, and reliable predictions are impossible for several attributes due to inconsistent labeling. Our results demonstrate that classifiers struggle with many CelebA attributes not because they are difficult to predict, but because they are poorly labeled. This indicates that the CelebA dataset is flawed as a facial analysis tool and may not be suitable as a generic evaluation benchmark for imbalanced classification.
Lingenfelter, B., Davis, S. R., & Hand, E. M. (2022). A Quantitative Analysis of Labeling Issues in the CelebA Dataset. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13598 LNCS, pp. 129–141). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-20713-6_10