Hypernym discovery is an essential task for building and extending ontologies automatically. Compared with the whole Web as a source for information extraction, online encyclopedias are far more structured and reliable. In this paper we propose a novel approach that combines syntactic and lexical-semantic information to identify hypernymic relationships. Using the first sentences from the German version of Wikipedia, we compiled training data, created both semi-automatically and manually, and a gold standard for evaluation. We trained a sequential supervised learner with a semantically enhanced tagset. The experiments showed that the quality of the data is far more important than its quantity. Furthermore, bootstrapping proved to be a viable way to improve the results. Our approach outperformed competitive lexico-syntactic patterns by 7%, reaching an F1-measure of over .91. © 2011 Springer-Verlag.
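The F1-measure reported above is the harmonic mean of precision and recall. As a minimal sketch of how such a score could be computed for extracted hypernym pairs, the Python below compares a set of predicted (term, hypernym) pairs against a gold standard. The function name `precision_recall_f1` and the toy pairs are assumptions made for illustration only; they are not taken from the paper, and the paper's actual evaluation procedure may differ.

```python
from typing import Set, Tuple

# A hypernym pair is (term, hypernym); this representation is an assumption.
Pair = Tuple[str, str]


def precision_recall_f1(predicted: Set[Pair], gold: Set[Pair]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for predicted pairs against gold pairs."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented toy data, not results from the paper.
gold = {("Berlin", "Stadt"), ("Rhein", "Fluss"), ("Goethe", "Dichter")}
predicted = {("Berlin", "Stadt"), ("Rhein", "Fluss"), ("Goethe", "Mann")}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Scoring on sets of pairs treats each extracted relation as either fully right or fully wrong, so a partially correct hypernym counts as both a false positive and a false negative.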
CITATION STYLE
Litz, B., Langer, H., & Malaka, R. (2011). Sequential supervised learning for hypernym discovery from Wikipedia. In Communications in Computer and Information Science (Vol. 128 CCIS, pp. 68–80). https://doi.org/10.1007/978-3-642-19032-2_5