Context tailoring for text normalization

Seniz Demir

Conference Proceedings (Open Access)

Proceedings of TextGraphs@NAACL-HLT 2016: The 10th Workshop on Graph-Based Methods for Natural Language Processing (2016) 6-14

DOI: 10.18653/v1/w16-1402

Abstract

Language processing tools suffer significant performance drops in the social media domain because of its continuously evolving language. Transforming non-standard words into their standard forms has been studied as a step towards proper processing of ill-formed texts. This work describes a normalization system that considers contextual and lexical similarities between standard and non-standard words to remove noise from texts. A bipartite graph that represents the contexts shared by words in a large unlabeled text corpus is used to explore normalization candidates via random walks. The input context of a non-standard word in a given sentence is tailored in cases where a direct match to the shared contexts is not possible. The performance of the system was evaluated on Turkish social media texts.
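The general idea of finding normalization candidates through a word-context bipartite graph can be illustrated with a minimal sketch. This is not the paper's implementation: the context definition (left and right neighbor), the exact two-step walk used in place of sampled random walks, the `alpha` weighting, the function names, and the English toy corpus are all assumptions made for illustration. The context tailoring step mentioned in the abstract is not shown, since the abstract does not specify it.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def build_bipartite_graph(sentences):
    """Link each word to the (left, right) contexts it occurs in, with counts."""
    word_to_ctx = defaultdict(lambda: defaultdict(int))
    ctx_to_word = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        tokens = ["<s>"] + sent.split() + ["</s>"]
        for i in range(1, len(tokens) - 1):
            ctx = (tokens[i - 1], tokens[i + 1])
            word_to_ctx[tokens[i]][ctx] += 1
            ctx_to_word[ctx][tokens[i]] += 1
    return word_to_ctx, ctx_to_word


def two_step_walk(word, word_to_ctx, ctx_to_word):
    """Exact distribution of a word -> context -> word walk (assumed walk length)."""
    probs = defaultdict(float)
    ctxs = word_to_ctx.get(word, {})
    total_ctx = sum(ctxs.values())
    for ctx, ctx_count in ctxs.items():
        words = ctx_to_word[ctx]
        total_words = sum(words.values())
        for w, w_count in words.items():
            probs[w] += (ctx_count / total_ctx) * (w_count / total_words)
    return dict(probs)


def normalization_candidates(word, vocab, word_to_ctx, ctx_to_word, alpha=0.5):
    """Rank in-vocabulary words by contextual (walk) and lexical similarity."""
    walk = two_step_walk(word, word_to_ctx, ctx_to_word)
    scored = []
    for cand, p in walk.items():
        if cand == word or cand not in vocab:
            continue
        lexical = SequenceMatcher(None, word, cand).ratio()
        # alpha is a hypothetical mixing weight, not taken from the paper
        scored.append((alpha * p + (1 - alpha) * lexical, cand))
    return [cand for _, cand in sorted(scored, reverse=True)]


# Toy corpus (hypothetical); "u" is the non-standard word
corpus = [
    "see you tomorrow",
    "see you tomorrow morning",
    "see u tomorrow",
    "see them tomorrow",
]
vocab = {"see", "you", "them", "tomorrow", "morning"}
w2c, c2w = build_bipartite_graph(corpus)
candidates = normalization_candidates("u", vocab, w2c, c2w)
```

Here "u" shares the context (see, tomorrow) with "you" and "them", so both become candidates, and the lexical similarity term favors "you".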

Cite

Demir, S. (2016). Context tailoring for text normalization. In Proceedings of TextGraphs@NAACL-HLT 2016: The 10th Workshop on Graph-Based Methods for Natural Language Processing (pp. 6–14). Association for Computational Linguistics. https://doi.org/10.18653/v1/w16-1402
