A statistical approach to the subnational geolocation of event data


Abstract

Geolocation is the identification of the real-world geographic location of items such as coded news events. We have developed software to geolocate high volumes of event data, often to the subnational level, by combining existing entity extraction technologies with new statistical ranking algorithms. Our three-stage pipeline consists of: (1) named-entity recognition (identifying the text strings that represent named entities in the underlying text and classifying them by type); (2) entity resolution (matching location strings to specific real-world locations referenced in a gazetteer); and (3) location determination (selecting the most appropriate location for the event). We have used this software operationally to geolocate tens of millions of events and have formally evaluated both the accuracy and specificity of our results. In our latest formal evaluation, overall subnational accuracy was 78%, with 85% of all events geolocated at a subnational level.
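The three stages described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software: the gazetteer, its entries and population figures, the `Place` type, and every function name are hypothetical, the named-entity recognition stage is replaced by a plain dictionary lookup, and the final stage uses a simple hand-written ranking heuristic rather than the statistical ranking algorithms the abstract refers to.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Place:
    name: str
    admin_level: int  # 0 = country, 1 = state/province, 2 = city (assumed scale)
    country: str
    population: int   # rough illustrative values, not authoritative data


# Hypothetical toy gazetteer: lowercase surface string -> candidate places.
GAZETTEER = {
    "paris": [
        Place("Paris", 2, "France", 2_100_000),
        Place("Paris", 2, "United States", 25_000),
    ],
    "france": [Place("France", 0, "France", 67_000_000)],
    "lyon": [Place("Lyon", 2, "France", 500_000)],
}


def recognize_locations(text: str) -> List[str]:
    """Stage 1 stand-in: find location strings by dictionary lookup.

    A real system would use a trained named-entity recognizer here.
    """
    tokens = [tok.strip(".,;:()").lower() for tok in text.split()]
    return [tok for tok in tokens if tok in GAZETTEER]


def resolve(mentions: List[str]) -> List[Place]:
    """Stage 2: map each location string to its gazetteer candidates."""
    return [place for mention in mentions for place in GAZETTEER[mention]]


def determine_location(candidates: List[Place]) -> Optional[Place]:
    """Stage 3: pick one candidate for the event.

    Assumed heuristic: prefer the country supported by the most candidates,
    then the most specific administrative level, then the largest population.
    """
    if not candidates:
        return None
    country_votes = Counter(place.country for place in candidates)
    return max(
        candidates,
        key=lambda p: (country_votes[p.country], p.admin_level, p.population),
    )


def geolocate(text: str) -> Optional[Place]:
    """Run the three stages in order on one event description."""
    return determine_location(resolve(recognize_locations(text)))
```

For the sentence "Protesters marched in Paris, France on Monday." the sketch resolves the ambiguous "Paris" to the French city, because the co-occurring country mention adds support for France and the city is more specific than the country entry. Text with no gazetteer match yields `None`, which corresponds to an event that cannot be geolocated.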

Cite

Lautenschlager, J., Starz, J., & Warfield, I. (2017). A statistical approach to the subnational geolocation of event data. In Advances in Intelligent Systems and Computing (Vol. 480, pp. 333–343). Springer Verlag. https://doi.org/10.1007/978-3-319-41636-6_27
