Predicting ethnicity with first names in online social media networks

Bas Hofstra; Niek C. de Schipper

Journal Article (Open Access)

Big Data and Society (2018) 5(1)

DOI: 10.1177/2053951718761141

21 Citations

49 Readers

Abstract

Social scientists increasingly use (big) social media data to illuminate long-standing substantive questions in social science research. However, a key challenge of analyzing such data is their lower level of individual detail compared to highly detailed survey data. This limits the scope of substantive questions that can be addressed with these data. In this study, we provide a method to upgrade individual detail in terms of ethnicity in data gathered from social media via the use of register data. Our research aim is twofold: first, we predict the most likely value of ethnicity, given one's first name, and second, we show how one can test hypotheses with the predicted values for ethnicity as an independent variable while simultaneously accounting for the uncertainty in these predictions. We apply our method to social network data collected from Facebook. We illustrate our approach and provide an example of hypothesis testing using our procedure, i.e., estimating the relation between predicted network ethnic homogeneity on Facebook and trust in institutions. In a comparison of our method with two other methods, we find that our method provides the most conservative tests of hypotheses. We discuss the promise of our approach and pinpoint future research directions.
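The first aim stated in the abstract, choosing the most likely ethnicity given a first name, can be illustrated with a minimal sketch. It assumes the register data can be reduced to counts of (first name, ethnic category) pairs; the names, categories, counts, and function names below are hypothetical placeholders, and this is not presented as the authors' actual procedure.

```python
from collections import defaultdict

# Toy, invented register counts: (first name, ethnic category) -> number of people.
# All names, categories, and counts are hypothetical, not data from the study.
REGISTER_COUNTS = {
    ("name_a", "group_1"): 90,
    ("name_a", "group_2"): 10,
    ("name_b", "group_1"): 20,
    ("name_b", "group_2"): 60,
    ("name_b", "group_3"): 20,
}


def conditional_probabilities(counts):
    """Turn joint counts into P(category | name) as nested dicts."""
    totals = defaultdict(int)
    for (name, _category), n in counts.items():
        totals[name] += n
    probs = defaultdict(dict)
    for (name, category), n in counts.items():
        probs[name][category] = n / totals[name]
    return dict(probs)


def most_likely_category(probs, name):
    """Return (category, probability) maximizing P(category | name).

    Returns None for a name that does not occur in the register counts.
    """
    if name not in probs:
        return None
    category = max(probs[name], key=probs[name].get)
    return category, probs[name][category]


PROBS = conditional_probabilities(REGISTER_COUNTS)
```

Keeping the probability alongside the predicted category matters for the second aim: a prediction made with probability 0.9 is more certain than one made with 0.6, and a hypothesis test that uses the predictions would need to carry that difference forward. The abstract does not say how the authors do this, so the sketch stops at the prediction step.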

Cite

Hofstra, B., & de Schipper, N. C. (2018). Predicting ethnicity with first names in online social media networks. Big Data and Society, 5(1). https://doi.org/10.1177/2053951718761141
