Predicting economic development using geolocated Wikipedia articles

Abstract

Progress on the UN Sustainable Development Goals (SDGs) is hampered by a persistent lack of data regarding key social, environmental, and economic indicators, particularly in developing countries. For example, data on poverty (the first of seventeen SDGs) is both spatially sparse and infrequently collected in Sub-Saharan Africa due to the high cost of surveys. Here we propose a novel method for estimating socioeconomic indicators using open-source, geolocated textual information from Wikipedia articles. We demonstrate that modern NLP techniques can be used to predict community-level asset wealth and education outcomes using nearby geolocated Wikipedia articles. When paired with nightlights satellite imagery, our method outperforms all previously published benchmarks for this prediction task, indicating the potential of Wikipedia to inform both research in the social sciences and future policy decisions.

Cite

Sheehan, E., Meng, C., Jean, N., Tan, M., Burke, M., Ermon, S., … Lobell, D. (2019). Predicting economic development using geolocated Wikipedia articles. In Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (pp. 2698–2706). Association for Computing Machinery. https://doi.org/10.1145/3292500.3330784
