Simple Queries as Distant Labels for Predicting Gender on Twitter

Abstract

The majority of research on extracting missing user attributes from social media profiles uses costly hand-annotated labels for supervised learning. Distantly supervised methods exist, although these generally rely on knowledge gathered from external sources. This paper demonstrates the effectiveness of gathering distant labels for self-reported gender on Twitter using simple queries. We confirm the reliability of this query heuristic by comparing it with manual annotation. Moreover, using these labels for distant supervision, we demonstrate model performance on the same data that is competitive with models trained on manual annotations. As such, we offer a cheap, extensible, and fast alternative that can be employed beyond the task of gender classification.
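
To make the idea of a query heuristic concrete, the following is a minimal sketch of how self-reports in message text could be turned into distant labels. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not list the actual queries, so the term lists, the regular expression, and the function name `distant_label` are all hypothetical.

```python
import re

# Hypothetical term lists: the abstract does not enumerate the real query terms.
FEMALE_TERMS = ("woman", "girl", "female")
MALE_TERMS = ("man", "guy", "boy", "male")

# Matches self-reports such as "I'm a girl" or "I am a man".
# The trailing \b keeps "man" from matching inside "manager".
SELF_REPORT = re.compile(
    r"\b(?:i\s+am|i['’]?m)\s+(?:an?\s+)?(?P<term>{})\b".format(
        "|".join(FEMALE_TERMS + MALE_TERMS)
    ),
    re.IGNORECASE,
)


def distant_label(text):
    """Return 'f' or 'm' if the text contains a consistent self-report, else None."""
    labels = set()
    for match in SELF_REPORT.finditer(text):
        term = match.group("term").lower()
        labels.add("f" if term in FEMALE_TERMS else "m")
    # Conflicting self-reports are treated as unlabeled rather than guessed.
    return labels.pop() if len(labels) == 1 else None
```

A label produced this way is noisy by construction (quotes, jokes, and sarcasm all match), which is why checking such a heuristic against manual annotation, as the abstract describes, matters before using the labels for training.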

Cite

Emmery, C., Chrupała, G., & Daelemans, W. (2017). Simple Queries as Distant Labels for Predicting Gender on Twitter. In 3rd Workshop on Noisy User-Generated Text, W-NUT 2017 - Proceedings of the Workshop (pp. 50–55). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-4407
