Learning to tag from open vocabulary labels

Edith Law; Burr Settles; Tom Mitchell

Conference Proceedings, Open Access

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2010) 6322 LNAI(PART 2) 211-226

DOI: 10.1007/978-3-642-15883-4_14

14 citations; 51 readers

Abstract

Most approaches to classifying media content assume a fixed, closed vocabulary of labels. In contrast, we advocate machine learning approaches which take advantage of the millions of free-form tags obtainable via online crowd-sourcing platforms and social tagging websites. The use of such open vocabularies presents learning challenges due to typographical errors, synonymy, and a potentially unbounded set of tag labels. In this work, we present a new approach that organizes these noisy tags into well-behaved semantic classes using topic modeling, and learns to predict tags accurately using a mixture of topic classes. This method can utilize an arbitrary open vocabulary of tags, reduces training time by 94% compared to learning from these tags directly, and achieves comparable performance for classification and superior performance for retrieval. We also demonstrate that on open vocabulary tasks, human evaluations are essential for measuring the true performance of tag classifiers, which traditional evaluation methods will consistently underestimate. We focus on the domain of tagging music clips, and demonstrate our results using data collected with a human computation game called TagATune. © 2010 Springer-Verlag Berlin Heidelberg.
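To make the "mixture of topic classes" idea concrete, the following is a minimal sketch of one plausible reading of the abstract, not the authors' implementation. It assumes that a classifier has already predicted a topic mixture for a clip and that each topic carries a probability distribution over tags, and it scores each tag as the mixture-weighted sum of its per-topic probabilities. The function name `rank_tags`, the two topics, and every number are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def rank_tags(
    topic_mixture: List[float],
    topic_tag_probs: List[Dict[str, float]],
    top_n: int = 3,
) -> List[Tuple[str, float]]:
    """Score tags as sum_k p(topic k | clip) * p(tag | topic k).

    Hypothetical helper: both inputs are assumed to come from earlier
    steps (a topic model over tags and a per-clip topic classifier).
    """
    if len(topic_mixture) != len(topic_tag_probs):
        raise ValueError("need one tag distribution per topic")

    scores: Dict[str, float] = {}
    for topic_weight, tag_probs in zip(topic_mixture, topic_tag_probs):
        for tag, prob in tag_probs.items():
            scores[tag] = scores.get(tag, 0.0) + topic_weight * prob

    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:top_n]


# Made-up topics. Note how a noisy variant ("classic") lands in the
# same topic as "classical" instead of becoming a separate label.
TOPIC_TAG_PROBS = [
    {"classical": 0.5, "violin": 0.3, "classic": 0.2},
    {"rock": 0.6, "guitar": 0.3, "drums": 0.1},
]

# Made-up classifier output for one clip: mostly the first topic.
clip_mixture = [0.8, 0.2]

ranked = rank_tags(clip_mixture, TOPIC_TAG_PROBS)
for tag, score in ranked:
    print(f"{tag}: {score:.2f}")
```

Under this reading, the classifier only has to predict a handful of topic weights rather than one output per distinct tag, which is consistent with the reduced training time the abstract reports, although the abstract itself does not specify the exact scoring rule.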

Citation (APA)

Law, E., Settles, B., & Mitchell, T. (2010). Learning to tag from open vocabulary labels. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6322 LNAI, pp. 211–226). https://doi.org/10.1007/978-3-642-15883-4_14
