Generating pseudo-ground truth for predicting new concepts in social streams

Abstract

The manual curation of knowledge bases is a bottleneck in fast-paced domains where new concepts constantly emerge. Identifying nascent concepts is important for improving early entity linking, content interpretation, and recommendation of new content in real-time applications. We present an unsupervised method for generating pseudo-ground truth for training a named entity recognizer to specifically identify entities that will become concepts in a knowledge base, in the setting of social streams. We show that our method is able to deal with missing labels, justifying the use of pseudo-ground truth generation in this task. Finally, we show that our method significantly outperforms a lexical-matching baseline by leveraging strategies for sampling pseudo-ground truth based on entity confidence scores and the textual quality of input documents. © 2014 Springer International Publishing Switzerland.
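The sampling idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Annotation` and `Post` types, the `text_quality` proxy (share of purely alphabetic tokens), and both thresholds are assumptions made here for illustration only. The sketch keeps a post as pseudo-ground truth when its text passes a quality filter and at least one automatically linked entity has a high enough confidence score.

```python
from dataclasses import dataclass


@dataclass
class Annotation:
    mention: str       # surface form matched by a (hypothetical) entity linker
    confidence: float  # linker confidence, assumed to lie in [0, 1]


@dataclass
class Post:
    text: str
    annotations: list


def text_quality(text: str) -> float:
    """Hypothetical quality proxy: share of purely alphabetic tokens."""
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(t.isalpha() for t in tokens) / len(tokens)


def sample_pseudo_ground_truth(posts, min_confidence=0.7, min_quality=0.6):
    """Keep posts that pass the quality filter and contain at least one
    annotation at or above the confidence threshold (both thresholds are
    illustrative assumptions)."""
    samples = []
    for post in posts:
        if text_quality(post.text) < min_quality:
            continue  # noisy text: skip the whole post
        mentions = [a.mention for a in post.annotations
                    if a.confidence >= min_confidence]
        if mentions:
            samples.append((post.text, mentions))
    return samples


posts = [
    Post("Watching the new Interstellar trailer tonight",
         [Annotation("Interstellar", 0.92)]),
    Post("omg!!! #lol @x http://t.co/abc", [Annotation("lol", 0.95)]),
    Post("Maybe going to Paris next week", [Annotation("Paris", 0.35)]),
]
training_samples = sample_pseudo_ground_truth(posts)
```

In this toy input only the first post survives: the second fails the quality filter and the third has no sufficiently confident annotation. The retained text and mention pairs would then serve as training examples for a named entity recognizer.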

Graus, D., Tsagkias, M., Buitinck, L., & De Rijke, M. (2014). Generating pseudo-ground truth for predicting new concepts in social streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8416 LNCS, pp. 286–298). Springer Verlag. https://doi.org/10.1007/978-3-319-06028-6_24
