We investigate the influence of language on the accuracy of geolocating Twitter users. Our analysis, using a large corpus of tweets written in thirteen languages, provides a new understanding of the reasons behind reported performance disparities between languages. The results show that data imbalance has a greater impact on accuracy than geographical coverage. A comparison between micro and macro averaging demonstrates that existing evaluation approaches are less appropriate than previously thought. Our results suggest both averaging approaches should be used to effectively evaluate geolocation.
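The difference between micro and macro averaging can be illustrated with a minimal sketch. It assumes plain per-user accuracy as the metric and uses invented toy counts for two languages; the function name `micro_macro_accuracy` and the data are hypothetical and are not taken from the paper, which may use a different geolocation metric.

```python
from collections import defaultdict


def micro_macro_accuracy(records):
    """Compute micro- and macro-averaged accuracy over languages.

    records: iterable of (language, is_correct) pairs, one per user.
    Returns (micro, macro, per_language_accuracy).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, is_correct in records:
        total[lang] += 1
        correct[lang] += int(is_correct)
    if not total:
        raise ValueError("no records given")

    # Micro: pool all users, so large languages dominate the score.
    micro = sum(correct.values()) / sum(total.values())

    # Macro: average the per-language accuracies, so each language
    # counts equally regardless of how many users it has.
    per_lang = {lang: correct[lang] / total[lang] for lang in total}
    macro = sum(per_lang.values()) / len(per_lang)
    return micro, macro, per_lang


# Hypothetical, deliberately imbalanced toy data (not from the paper):
# 100 users of a well-represented language, 10 of a rare one.
records = (
    [("en", True)] * 90 + [("en", False)] * 10
    + [("fi", True)] * 2 + [("fi", False)] * 8
)
micro, macro, per_lang = micro_macro_accuracy(records)
print(f"micro={micro:.3f} macro={macro:.3f}")
```

In this toy data the micro average is about 0.84 while the macro average is 0.55, because the pooled score is dominated by the larger language. Reporting both figures makes that kind of imbalance visible.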
Mourad, A., Scholer, F., & Sanderson, M. (2017). Language influences on tweeter geolocation. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10193 LNCS, pp. 331–342). Springer Verlag. https://doi.org/10.1007/978-3-319-56608-5_26