Spelling variants and word-sense ambiguity impose substantial costs on processes such as data integration, information search, and data pre-processing for data mining. To meet these demands, it is useful to construct relations between a word or phrase and a representative name of the entity it denotes. To reduce these costs, this paper discusses how to automatically discover "sameAs" and "meaningOf" links from Japanese Wikipedia. To do so, we gathered relevant features such as IDF, string similarity, and the number of hypernyms, among others. We identified a link-based score over the salient features, based on SVM results with 960,000 anchor link pairs. Case studies show that our link discovery method performs well, with precision and recall both above 70%. © Springer International Publishing 2014.
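As a rough illustration of two of the features named above, the sketch below computes an IDF value and a string similarity for one anchor-text/page-title pair. The abstract does not give the exact formulas, so the smoothed IDF, the use of `difflib.SequenceMatcher`, the function names, and the toy data are all assumptions made for illustration. The hypernym count and the SVM step are omitted.

```python
import math
from difflib import SequenceMatcher


def idf(term, documents):
    """Smoothed inverse document frequency of `term` (assumed formula).

    `documents` is a list of token sets; the smoothing avoids division by zero.
    """
    containing = sum(1 for doc in documents if term in doc)
    return math.log((1 + len(documents)) / (1 + containing)) + 1.0


def string_similarity(a, b):
    """Similarity ratio in [0, 1] between two strings (assumed measure)."""
    return SequenceMatcher(None, a, b).ratio()


def pair_features(anchor_text, page_title, documents):
    """Hypothetical feature vector for one (anchor text, page title) pair."""
    return [idf(anchor_text, documents), string_similarity(anchor_text, page_title)]


# Toy corpus: each document is represented as a set of tokens.
docs = [{"Tokyo", "capital"}, {"Kyoto", "temple"}, {"Tokyo", "tower"}]
features = pair_features("Tokyo", "Tokyo", docs)
```

Vectors of this kind, one per anchor link pair, are the sort of input a classifier such as an SVM could be trained on to separate "sameAs" candidates from other links.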
CITATION STYLE
Kagawa, K., Tamagawa, S., & Yamaguchi, T. (2014). An automatic sameAs link discovery from Wikipedia. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8388 LNCS, pp. 399–413). Springer Verlag. https://doi.org/10.1007/978-3-319-06826-8_29