Sequential supervised learning for hypernym discovery from Wikipedia

Berenike Litz; Hagen Langer; Rainer Malaka

Conference Proceedings

Communications in Computer and Information Science (2011) 128 CCIS 68-80

DOI: 10.1007/978-3-642-19032-2_5

Abstract

Hypernym discovery is an essential task for building and extending ontologies automatically. Compared with the Web as a whole as a source for information extraction, online encyclopedias are far more structured and reliable. In this paper we propose a novel approach that combines syntactic and lexical-semantic information to identify hypernymic relationships. Using the first sentences from the German version of Wikipedia, we compiled training data, created both semi-automatically and manually, and a gold standard for evaluation. We trained a sequential supervised learner with a semantically enhanced tagset. The experiments showed that the cleanliness of the data matters far more than its amount. Furthermore, bootstrapping proved to be a viable way to improve the results. Our approach outperformed the competitive lexico-syntactic patterns by 7%, leading to an F1-measure of over .91. © 2011 Springer-Verlag.

Cite

Litz, B., Langer, H., & Malaka, R. (2011). Sequential supervised learning for hypernym discovery from Wikipedia. In Communications in Computer and Information Science (Vol. 128 CCIS, pp. 68–80). https://doi.org/10.1007/978-3-642-19032-2_5
