Geolocation is the identification of the real-world geographic location of items such as coded news events. We have developed software to geolocate high volumes of event data, often to the subnational level, by combining existing entity extraction technologies with new statistical ranking algorithms. Our three-stage pipeline consists of (1) named-entity recognition (identifying the text strings that represent named entities in the underlying text and classifying them by type); (2) entity resolution (matching location strings to specific real-world locations referenced in a gazetteer); and (3) location determination (selecting the most appropriate location for the event). We have used this software operationally to geolocate tens of millions of events and have formally evaluated both the accuracy and specificity of our results. In our latest formal evaluation, overall subnational accuracy was 78%, with 85% of all events geolocated at a subnational level.
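The three stages described above can be illustrated with a minimal sketch. It is not the authors' software: the gazetteer is a hypothetical three-entry dictionary, stage 1 uses a naive capitalized-word pattern in place of a real entity extractor, and stage 3 ranks candidates by mention count and specificity as a simple stand-in for the statistical ranking algorithms, whose details the abstract does not give. All names (`Place`, `GAZETTEER`, `geolocate`, and so on) are assumptions made for illustration.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Place:
    name: str
    country: str
    admin_level: int  # 0 = country, 1 or more = subnational


# Hypothetical toy gazetteer; a real one holds millions of entries.
GAZETTEER = {
    "nigeria": Place("Nigeria", "NG", 0),
    "lagos": Place("Lagos", "NG", 1),
    "kano": Place("Kano", "NG", 1),
}


def extract_location_strings(text: str) -> List[str]:
    """Stage 1 stand-in: treat every capitalized word as a candidate entity."""
    return re.findall(r"\b[A-Z][a-z]+\b", text)


def resolve(strings: List[str]) -> List[Place]:
    """Stage 2: keep only the strings that match a gazetteer entry."""
    return [GAZETTEER[s.lower()] for s in strings if s.lower() in GAZETTEER]


def choose_location(candidates: List[Place]) -> Optional[Place]:
    """Stage 3 stand-in: prefer frequently mentioned, then more specific, places."""
    if not candidates:
        return None
    counts = Counter(candidates)
    return max(counts, key=lambda p: (counts[p], p.admin_level))


def geolocate(text: str) -> Optional[Place]:
    """Run the three stages in order on one event's source text."""
    return choose_location(resolve(extract_location_strings(text)))


event_text = (
    "Protesters gathered in Lagos on Monday. "
    "Police in Lagos said the Nigeria march was peaceful."
)
print(geolocate(event_text))
```

Keeping the stages as separate functions mirrors the pipeline's structure: the extractor or the ranking rule can be swapped out without touching the other two stages.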
Lautenschlager, J., Starz, J., & Warfield, I. (2017). A statistical approach to the subnational geolocation of event data. In Advances in Intelligent Systems and Computing (Vol. 480, pp. 333–343). Springer Verlag. https://doi.org/10.1007/978-3-319-41636-6_27