Computational Measures for Language Similarity across Time in Online Communities

Abstract

This paper examines language similarity in messages over time in an online community of adolescents from around the world, using three computational measures: Spearman's correlation coefficient, Zipping, and Latent Semantic Analysis. Results suggest that the participants' language diverges over a six-week period, and that this divergence is not mediated by demographic variables such as leadership status or gender. The divergence may reflect the introduction of more unique words over time. It appears to be influenced by continual change in subtopics, as well as by community-wide historical events that introduce new vocabulary in later time periods. Our results highlight both the possibilities and the shortcomings of using document similarity measures to assess convergence in language use.
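
To make two of the measures named in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes whitespace tokenization, computes Spearman's rank correlation over word counts on the shared vocabulary of two texts, and stands in for the compression-based measure with normalized compression distance via `zlib`, which may differ from the paper's Zipping formulation. The function names (`spearman_similarity`, `zip_distance`) are hypothetical, and Latent Semantic Analysis is omitted because it needs a larger corpus and a matrix decomposition.

```python
import zlib
from collections import Counter


def word_counts(text):
    """Count lowercase whitespace-separated tokens (an assumed tokenization)."""
    return Counter(text.lower().split())


def _average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_similarity(text_a, text_b):
    """Spearman correlation of word-count ranks over the union vocabulary."""
    counts_a, counts_b = word_counts(text_a), word_counts(text_b)
    vocab = sorted(set(counts_a) | set(counts_b))
    n = len(vocab)
    if n == 0:
        return 0.0
    ranks_a = _average_ranks([counts_a[w] for w in vocab])
    ranks_b = _average_ranks([counts_b[w] for w in vocab])
    mean_a = sum(ranks_a) / n
    mean_b = sum(ranks_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(ranks_a, ranks_b))
    var_a = sum((a - mean_a) ** 2 for a in ranks_a)
    var_b = sum((b - mean_b) ** 2 for b in ranks_b)
    if var_a == 0 or var_b == 0:
        # Correlation is undefined when one ranking is constant.
        return 0.0
    return cov / (var_a * var_b) ** 0.5


def zip_distance(text_a, text_b):
    """Normalized compression distance (an assumed stand-in for Zipping)."""
    a, b = text_a.encode("utf-8"), text_b.encode("utf-8")
    size_a = len(zlib.compress(a))
    size_b = len(zlib.compress(b))
    size_ab = len(zlib.compress(a + b))
    return (size_ab - min(size_a, size_b)) / max(size_a, size_b)
```

In a setting like the one described, each text might be the concatenated messages from one time period, with higher Spearman values and lower compression distances both read as greater similarity between periods.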

Huffaker, D., Jorgensen, J., Iacobelli, F., Tepper, P., & Cassell, J. (2006). Computational Measures for Language Similarity across Time in Online Communities. In HLT-NAACL 2006 - Analyzing Conversations in Text and Speech, ACTS 2006 - Proceedings of the Workshop (pp. 15–22). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1564535.1564538
