Many solutions exist for coarse-grained geolocation of users at the time they post a message. However, many important applications, such as traffic monitoring and event detection, need finer geolocation at the level of city neighborhoods, i.e., at sub-city level. Data-driven approaches often cannot guarantee good accuracy and efficiency at this level, because the number of sub-city positions to be estimated is higher and large, balanced training sets are rarely available. We claim that external information sources overcome the limitations of data-driven approaches in achieving good accuracy for sub-city level geolocation, and we present a knowledge-driven approach that achieves good results once the reference area of a message is known. Our algorithm, called Sherloc, exploits the toponyms in the message, extracts their semantics from a geographic gazetteer, and embeds them into a metric space that captures the semantic distance among them. We identify the toponyms semantically closest to a message and then cluster them with respect to their spatial locations. Sherloc requires no prior training, can infer the location at sub-city level with high accuracy, and is not limited to geolocating on a fixed spatial grid.
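The pipeline outlined in the abstract (find toponyms, embed them, take the semantically closest gazetteer entries, cluster those spatially) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy gazetteer, the two-dimensional embedding values, the approximate coordinates, the function names, the parameters `k` and `max_km`, and the simple greedy spatial grouping are all assumptions chosen for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    name: str
    embedding: tuple  # hypothetical semantic vector, not real gazetteer data
    lat: float
    lon: float


# Toy gazetteer with made-up embeddings and approximate coordinates.
GAZETTEER = [
    Place("brick lane", (0.90, 0.10), 51.5220, -0.0717),
    Place("spitalfields market", (0.80, 0.20), 51.5195, -0.0755),
    Place("shoreditch", (0.85, 0.15), 51.5265, -0.0780),
    Place("wimbledon", (0.10, 0.90), 51.4214, -0.2064),
    Place("richmond park", (0.20, 0.80), 51.4425, -0.2740),
]


def find_toponyms(message, gazetteer):
    """Return gazetteer entries whose name occurs in the message."""
    text = message.lower()
    return [p for p in gazetteer if p.name in text]


def mean_vector(vectors):
    """Component-wise mean of equally sized vectors."""
    n = len(vectors)
    return tuple(sum(v[i] for v in vectors) / n for i in range(len(vectors[0])))


def semantic_neighbours(query, gazetteer, k):
    """The k entries closest to the query in the embedding space."""
    return sorted(gazetteer, key=lambda p: math.dist(p.embedding, query))[:k]


def haversine_km(a, b):
    """Great-circle distance between two places in kilometres."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a.lat, a.lon, b.lat, b.lon))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def spatial_clusters(places, max_km):
    """Greedy grouping: a place joins the first cluster with a member within max_km."""
    clusters = []
    for p in places:
        for cluster in clusters:
            if any(haversine_km(p, q) <= max_km for q in cluster):
                cluster.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def geolocate(message, gazetteer=GAZETTEER, k=3, max_km=2.0):
    """Estimate (lat, lon) as the centroid of the largest spatial cluster."""
    found = find_toponyms(message, gazetteer)
    if not found:
        return None  # no toponym in the message, nothing to anchor on
    query = mean_vector([p.embedding for p in found])
    neighbours = semantic_neighbours(query, gazetteer, k)
    largest = max(spatial_clusters(neighbours, max_km), key=len)
    lat = sum(p.lat for p in largest) / len(largest)
    lon = sum(p.lon for p in largest) / len(largest)
    return lat, lon


estimate = geolocate("Great curry on Brick Lane tonight")
```

Because the estimate is a centroid of gazetteer coordinates, it is a free point in space rather than a cell of a fixed grid, and no training step is involved; both properties match what the abstract states about Sherloc, although the clustering and embedding used here are only stand-ins.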
Di Rocco, L., Dassereto, F., Bertolotto, M., Buscaldi, D., Catania, B., & Guerrini, G. (2021). Sherloc: a knowledge-driven algorithm for geolocating microblog messages at sub-city level. International Journal of Geographical Information Science, 35(1), 84–115. https://doi.org/10.1080/13658816.2020.1764003