Context tailoring for text normalization

Abstract

Language processing tools suffer significant performance drops in the social media domain because its language evolves continuously. Transforming non-standard words into their standard forms has been studied as a step towards proper processing of ill-formed texts. This work describes a normalization system that removes noise from texts by considering both contextual and lexical similarities between standard and non-standard words. A bipartite graph representing the contexts shared by words in a large unlabeled text corpus is used to explore normalization candidates via random walks. When the input context of a non-standard word in a given sentence has no direct match among the shared contexts, that context is tailored. The system's performance was evaluated on Turkish social media texts.
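The candidate search described in the abstract can be illustrated with a small sketch. It is not the paper's implementation: the context definition (the immediate left and right neighbors), the two-step walk, the `difflib`-based lexical score, the product used to combine the two scores, and all function names are assumptions made for illustration. The toy corpus is English rather than Turkish, and the context tailoring step is left out because the abstract does not specify it.

```python
import random
from collections import Counter, defaultdict
from difflib import SequenceMatcher


def build_context_graph(sentences):
    """Build a word <-> context bipartite graph from raw sentences.

    Assumption: a context is the (left neighbor, right neighbor) pair.
    """
    word_to_ctx = defaultdict(Counter)
    ctx_to_word = defaultdict(Counter)
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            ctx = (tokens[i - 1], tokens[i + 1])
            word_to_ctx[tokens[i]][ctx] += 1
            ctx_to_word[ctx][tokens[i]] += 1
    return word_to_ctx, ctx_to_word


def weighted_choice(counter, rng):
    """Pick a key with probability proportional to its count."""
    items = sorted(counter.items())  # sorted so a fixed seed is reproducible
    keys = [key for key, _ in items]
    weights = [count for _, count in items]
    return rng.choices(keys, weights=weights, k=1)[0]


def normalize(word, word_to_ctx, ctx_to_word, vocabulary, walks=500, seed=0):
    """Return the best standard-form candidate for `word`, or None.

    Each walk goes word -> shared context -> word. Walks that end on a
    standard word (one in `vocabulary`) count as contextual evidence.
    """
    if word not in word_to_ctx:
        return None  # no shared context; the paper tailors the context here
    rng = random.Random(seed)
    hits = Counter()
    for _ in range(walks):
        ctx = weighted_choice(word_to_ctx[word], rng)
        other = weighted_choice(ctx_to_word[ctx], rng)
        if other in vocabulary:
            hits[other] += 1
    if not hits:
        return None

    def score(candidate):
        contextual = hits[candidate] / walks
        lexical = SequenceMatcher(None, word, candidate).ratio()
        return contextual * lexical  # assumed combination of the two scores

    return max(sorted(hits), key=score)


corpus = [
    "see you tomorrow morning",
    "see you tmrw morning",
    "see you today morning",
    "see you tomorrow night",
    "see you tmrw night",
]
vocabulary = {"see", "you", "tomorrow", "today", "morning", "night"}
word_to_ctx, ctx_to_word = build_context_graph(corpus)
print(normalize("tmrw", word_to_ctx, ctx_to_word, vocabulary))
```

In this toy corpus "tmrw" shares contexts with both "tomorrow" and "today", so contextual evidence alone leaves two candidates. The lexical score is what separates them, which mirrors the abstract's point that both kinds of similarity are considered.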

Cite

Demir, S. (2016). Context tailoring for text normalization. In Proceedings of TextGraphs@NAACL-HLT 2016: The 10th Workshop on Graph-Based Methods for Natural Language Processing (pp. 6–14). Association for Computational Linguistics. https://doi.org/10.18653/v1/w16-1402
