Fine-grained geolocalization of user-generated short text based on a weight probability model


Abstract

Fine-grained geolocalization of User-Generated Short Texts (UGSTs) has recently become increasingly important. One challenge is that a UGST contains relatively little location-indicative information because of constraints such as text length. Extracting and effectively using that information is therefore the key to improving geolocalization performance. Existing work considers only the global weight of terms and does not distinguish the importance of the same term at different locations. In addition, the add-one smoothing used in existing work masks the differences between the features of different locations. In this paper, we propose FGST-WP, a fine-grained geolocalization method that predicts the PoI-level location of UGSTs based on a weight probability model. The method has three parts: 1) using the reverse maximum match algorithm to filter out UGSTs that contain no location-indicative information; 2) building the coupling between terms and locations and adopting a mixed weight strategy to assign weights to terms; 3) calculating the probability that a non-geotagged UGST was posted from each location and selecting the k locations with the top-k probabilities. The accuracy of FGST-WP on three ground-truth datasets reaches 45%, 68%, and 72%, respectively, which indicates the superior performance of FGST-WP.
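
The three parts named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything below is an assumption made for illustration: the function names, the token-level lexicon, the particular mixed weight (a term's share of occurrences at a location multiplied by an IDF-like global factor), and the normalized additive score. It is not the FGST-WP model itself, only one plausible shape for such a pipeline.

```python
import math
from collections import Counter, defaultdict


def reverse_max_match(tokens, lexicon, max_len=3):
    """Scan tokens right to left, greedily taking the longest lexicon phrase.

    Returns the matched location-indicative terms in reading order.
    `lexicon` is a set of space-joined phrases (hypothetical format).
    """
    found, end = [], len(tokens)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            candidate = " ".join(tokens[end - size:end])
            if candidate in lexicon:
                found.append(candidate)
                end -= size
                break
        else:
            end -= 1  # no lexicon phrase ends here; skip this token
    return found[::-1]


def train(geotagged, lexicon):
    """Count lexicon terms per location from (tokens, location) pairs."""
    counts = defaultdict(Counter)
    for tokens, loc in geotagged:
        counts[loc].update(reverse_max_match(tokens, lexicon))
    return counts


def mixed_weight(term, loc, counts):
    """Illustrative mixed weight: local share times an IDF-like global factor.

    This combination is an assumption, not the paper's weighting scheme.
    """
    per_loc = {l: c[term] for l, c in counts.items() if c[term] > 0}
    if loc not in per_loc:
        return 0.0
    local = per_loc[loc] / sum(per_loc.values())
    global_factor = math.log(1 + len(counts) / len(per_loc))
    return local * global_factor


def top_k_locations(tokens, counts, lexicon, k=3):
    """Return up to k (location, normalized score) pairs, best first.

    Texts with no lexicon term are filtered out (empty result), mirroring
    step 1 of the abstract.
    """
    terms = reverse_max_match(tokens, lexicon)
    if not terms:
        return []
    scores = {}
    for loc, c in counts.items():
        total = sum(c.values())
        scores[loc] = (
            sum(mixed_weight(t, loc, counts) * c[t] / total for t in terms)
            if total else 0.0
        )
    norm = sum(scores.values())
    if norm == 0:
        return []
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [(loc, score / norm) for loc, score in ranked[:k]]
```

In this sketch a term that never occurs at a location contributes nothing to that location's score, so no smoothing constant is needed. This is a simplification chosen for the example and should not be read as the way the paper replaces add-one smoothing.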

Citation

Gao, C., Li, Y., Yang, J., & Dong, W. (2019). Fine-grained geolocalization of user-generated short text based on a weight probability model. IEEE Access, 7, 153579–153591. https://doi.org/10.1109/ACCESS.2019.2948355
