Conference proceedings

Geotagging one hundred million Twitter accounts with total variation minimization

Compton R, Jurgens D, Allen D...(+3 more)

Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014 (2015) pp. 393-401 Published by Institute of Electrical and Electronics Engineers Inc.

Geographically annotated social media is extremely valuable for modern information retrieval. However, researchers who can access only publicly visible data quickly find that social media users rarely publish location information. In this work, we provide a method that can geolocate the overwhelming majority of active Twitter users, independent of their location-sharing preferences, using only publicly visible Twitter data. Our method infers an unknown user's location by examining their friends' locations. We frame the geotagging problem as an optimization over a social network with a total variation-based objective and provide a scalable, distributed algorithm for its solution. Furthermore, we show how a robust estimate of the geographic dispersion of each user's ego network can be used as a per-user accuracy measure that is effective at removing outlying errors. Leave-many-out evaluation shows that our method is able to infer location for 101,846,236 Twitter users at a median error of 6.38 km, allowing us to geotag over 80% of public tweets.
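
The abstract does not spell out the update rule, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that an unlocated user is placed at the friend location with the smallest total great-circle distance to the other friends (a medoid, used here as a simple stand-in for a robust spatial median), and that the median distance from that point to the friends serves as the dispersion measure. The function names, the 100 km threshold, and the number of rounds are all hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding error near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def estimate_location(friend_locations):
    """Return (estimate, dispersion_km) from a list of friend locations.

    Assumption: the estimate is the medoid of the friends' locations and the
    dispersion is the median distance from the estimate to each friend.
    """
    if not friend_locations:
        return None, None
    best = min(
        friend_locations,
        key=lambda p: sum(haversine_km(p, q) for q in friend_locations),
    )
    dispersion = median(haversine_km(best, q) for q in friend_locations)
    return best, dispersion


def propagate(graph, known, max_dispersion_km=100.0, rounds=3):
    """Spread locations from users with known positions to their neighbours.

    graph: dict mapping each user to a list of friends (hypothetical format).
    known: dict mapping users with ground-truth positions to (lat, lon).
    Estimates whose dispersion exceeds max_dispersion_km are discarded.
    """
    located = dict(known)
    for _ in range(rounds):
        updated = dict(known)  # ground-truth positions are never overwritten
        for user, friends in graph.items():
            if user in known:
                continue
            points = [located[f] for f in friends if f in located]
            estimate, dispersion = estimate_location(points)
            if estimate is not None and dispersion <= max_dispersion_km:
                updated[user] = estimate
        located = updated
    return located


if __name__ == "__main__":
    known = {"a": (34.05, -118.24), "b": (34.10, -118.30), "c": (40.71, -74.01)}
    graph = {"u": ["a", "b", "c"], "v": ["u"], "w": ["a", "c"]}
    print(propagate(graph, known))
```

In this toy graph, user `u` has two friends close together and one far away, so the estimate lands near the pair and the dispersion stays small. User `w` has friends split between two distant cities, so the dispersion is large and the estimate is dropped, which illustrates how a dispersion threshold can filter out unreliable inferences. User `v` only receives a location in the second round, once `u` has been placed.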

Author-supplied keywords

  • Data mining
  • Optimization
  • Social and Information Networks
