Abstract
The Author Profiling (AP) task aims to determine specific demographic characteristics such as gender and age, by analyzing the language usage in groups of authors. Notwithstanding the recent advances in AP, this is still an unsolved problem, especially in the case of social media domains. According to the literature most of the work has been devoted to the analysis of useful textual features. The most prominent ones are those related with content and style. In spite of the success of using jointly both kinds of features, most of the authors agree in that content features are much more relevant than style, which suggest that some profiling aspects, like age or gender could be determined only by observing the thematic interests, concerns, moods, or others words related to events of daily life. Additionally, most of the research only uses traditional representations such as the BoW, rather than other more sophisticated representations to harness the content features. In this regard, this paper aims at evaluating the usefulness of some topic-based representations for the AP task. We mainly consider a representation based on Latent Semantic Analysis (LSA), which automatically discovers the topics from a given document collection, and a simplified version of the Linguistic Inquiry and Word Count (LIWC), which consists of 41 features representing manually predefined thematic categories. We report promising results in several corpora showing the effectiveness of the evaluated topic-based representations for AP in social media.
Cite
CITATION STYLE
Álvarez-Carmona, M. A., López-Monroy, A. P., Montes-Y-Gómez, M., Villaseñor-Pineda, L., & Meza, I. (2016). Evaluating topic-based representations for author profiling in social media. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10022 LNAI, pp. 151–162). Springer Verlag. https://doi.org/10.1007/978-3-319-47955-2_13
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.