A hybrid approach for robust multilingual toponym extraction and disambiguation

7Citations
Citations of this article
16Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Toponym extraction and disambiguation are key topics recently addressed by fields of Information Extraction and Geographical Information Retrieval. Toponym extraction and disambiguation are highly dependent processes. Not only toponym extraction effectiveness affects disambiguation, but also disambiguation results may help improving extraction accuracy. In this paper we propose a hybrid toponym extraction approach based on Hidden Markov Models (HMM) and Support Vector Machines (SVM). Hidden Markov Model is used for extraction with high recall and low precision. Then SVM is used to find false positives based on informativeness features and coherence features derived from the disambiguation results. Experimental results conducted with a set of descriptions of holiday homes with the aim to extract and disambiguate toponyms showed that the proposed approach outperform the state of the art methods of extraction and also proved to be robust. Robustness is proved on three aspects: language independence, high and low HMM threshold settings, and limited training data. © 2013 Springer-Verlag Berlin Heidelberg.

Cite

CITATION STYLE

APA

Habib, M. B., & Van Keulen, M. (2013). A hybrid approach for robust multilingual toponym extraction and disambiguation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7912 LNCS, pp. 1–15). https://doi.org/10.1007/978-3-642-38634-3_1

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free