Abstract
Community-level linguistic variation is a core concept in sociolinguistics. In this paper, we use conditioned neural language models to learn vector representations for 510 online communities. We use these representations to measure linguistic variation between communities and investigate the degree to which linguistic variation corresponds with social connections between communities. We find that our sociolinguistic embeddings are highly correlated with a social network-based representation that does not use any linguistic input.
Cite
Noble, B., & Bernardy, J. P. (2022). Conditional Language Models for Community-Level Linguistic Variation. In NLPCSS 2022 - 5th Workshop on Natural Language Processing and Computational Social Science, NLP+CSS, Held at the 2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022 (pp. 59–78). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.nlpcss-1.9