Generating pseudo-ground truth for predicting new concepts in social streams

David Graus; Manos Tsagkias; Lars Buitinck; Maarten De Rijke

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014), vol. 8416 LNCS, pp. 286-298

DOI: 10.1007/978-3-319-06028-6_24

Abstract

The manual curation of knowledge bases is a bottleneck in fast-paced domains where new concepts constantly emerge. Identification of nascent concepts is important for improving early entity linking, content interpretation, and recommendation of new content in real-time applications. We present an unsupervised method for generating pseudo-ground truth for training a named entity recognizer to specifically identify entities that will become concepts in a knowledge base in the setting of social streams. We show that our method is able to deal with missing labels, justifying the use of pseudo-ground truth generation in this task. Finally, we show that our method significantly outperforms a lexical-matching baseline by leveraging strategies for sampling pseudo-ground truth based on entity confidence scores and textual quality of input documents. © 2014 Springer International Publishing Switzerland.

Citation (APA)

Graus, D., Tsagkias, M., Buitinck, L., & De Rijke, M. (2014). Generating pseudo-ground truth for predicting new concepts in social streams. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8416 LNCS, pp. 286–298). Springer Verlag. https://doi.org/10.1007/978-3-319-06028-6_24
