Conference proceedings

Geotagging one hundred million Twitter accounts with total variation minimization

Compton R, Jurgens D, Allen D

Proceedings - 2014 IEEE International Conference on Big Data, IEEE Big Data 2014 (2015) pp. 393-401 Published by Institute of Electrical and Electronics Engineers Inc.

Abstract

Geographically annotated social media is extremely valuable for modern information retrieval. However, researchers who can access only publicly visible data quickly find that social media users rarely publish location information. In this work, we provide a method that can geolocate the overwhelming majority of active Twitter users, independent of their location-sharing preferences, using only publicly visible Twitter data. Our method infers an unknown user's location by examining their friends' locations. We frame the geotagging problem as an optimization over a social network with a total variation-based objective and provide a scalable, distributed algorithm for its solution. Furthermore, we show how a robust estimate of the geographic dispersion of each user's ego network can be used as a per-user accuracy measure that is effective at removing outlying errors. Leave-many-out evaluation shows that our method is able to infer locations for 101,846,236 Twitter users at a median error of 6.38 km, allowing us to geotag over 80% of public tweets.
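
To make the idea of inferring a location from friends' locations concrete, the following is a minimal sketch and not the authors' implementation. It assumes that a single user's estimate can be taken as the friend location minimizing the summed great-circle distance to all friends (a medoid), and that dispersion can be summarized as the median distance from that estimate to the friends. The function names (`haversine_km`, `estimate_location`, `dispersion_km`) and the sample coordinates are hypothetical; the abstract does not specify these details.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def estimate_location(friend_locations):
    """Assumed estimator: the friend location with the smallest total
    distance to all friend locations (a medoid, robust to a few outliers)."""
    return min(
        friend_locations,
        key=lambda c: sum(haversine_km(c, f) for f in friend_locations),
    )


def dispersion_km(estimate, friend_locations):
    """Assumed dispersion measure: median distance from the estimate
    to the friend locations. Large values suggest an unreliable estimate."""
    return median(haversine_km(estimate, f) for f in friend_locations)


# Hypothetical ego network: three nearby friends and one distant outlier.
friends = [(0.0, 0.0), (0.0, 0.1), (0.0, 0.2), (50.0, 0.1)]
estimate = estimate_location(friends)
print(estimate, round(dispersion_km(estimate, friends), 2))
```

In this sketch a single distant friend does not pull the estimate away from the cluster, and an estimate whose dispersion exceeds some chosen threshold could be discarded, which is one way to read the per-user accuracy measure described above.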

Author-supplied keywords

  • Data mining
  • Optimization
  • Social and Information Networks

Authors

  • Ryan Compton

  • David Jurgens

  • David Allen
