This paper proposes a statistical methodology for mining Wikipedia to discover characteristics associated with life outcomes. The methodology is demonstrated using first names and childhood environment. By comparing over 35,000 Wikipedia biographies against spatially and temporally matched census data, we show that individuals with rare names are more than twice as likely to appear in Wikipedia (relative risk RR=2.43 for females; RR=2.30 for males). This result is consistent with past studies. Birth location also plays a role in success: individuals born in New York and California are ∼2x more likely to become entertainers, and those born in the South are ∼1.5x more likely to become athletes. These results validate the proposed methodology of using Wikipedia to study life outcomes.
Copyright © 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
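The relative risk (RR) figures quoted in the abstract can be read as a ratio of two proportions: the share of one group that appears in Wikipedia divided by the share of a comparison group that does. The sketch below shows only that standard textbook calculation. The function name `relative_risk` and all counts are hypothetical and chosen for illustration; they are not the paper's data, and the paper's actual procedure (including how names are classed as rare and how census data are matched) is not reproduced here.

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return the ratio of the event rate in the exposed group to the
    event rate in the unexposed (comparison) group."""
    if exposed_total <= 0 or unexposed_total <= 0:
        raise ValueError("group totals must be positive")
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    if risk_unexposed == 0:
        raise ValueError("relative risk is undefined when the comparison rate is zero")
    return risk_exposed / risk_unexposed


# Hypothetical counts, for illustration only (NOT taken from the paper):
# 30 of 100,000 people with rare names have a biography,
# versus 120 of 1,000,000 people with common names.
rr = relative_risk(30, 100_000, 120, 1_000_000)
print(f"RR = {rr:.2f}")
```

An RR above 1 means the first group is over-represented relative to the comparison group; an RR of 1 means no difference.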
CITATION STYLE
Ng, P. C. (2012). What Kobe Bryant and Britney Spears have in common: Mining Wikipedia for characteristics of notable individuals. In ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (pp. 523–526). https://doi.org/10.1609/icwsm.v6i1.14282