Character based pattern mining for neology detection


Abstract

Detecting neologisms is essential in real-time natural language processing applications. Not only does it make it possible to follow the lexical evolution of languages, but it is also essential for updating linguistic resources and parsers. In this paper, neology detection is treated as a classification task in which a system has to assess whether a given lexical item is an actual neologism or not. We propose a combination of an unsupervised data mining technique and a supervised machine learning approach. It is inspired by current research in stylometry and on token-level and character-level patterns. We train and evaluate our system on a manually designed reference dataset in French and Russian. We show that this approach is able to outperform state-of-the-art neology detection systems. Furthermore, character-level patterns exhibit good properties for multilingual extensions of the system.
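The abstract does not spell out the mining algorithm or the classifier, so the following is only a minimal sketch of the general idea, under stated assumptions: frequent character n-grams stand in for the mined character-level patterns (unsupervised step), and a simple perceptron stands in for the supervised learner. All function names, parameters, and the toy words and labels are hypothetical illustrations and are not taken from the paper.

```python
from collections import Counter


def char_ngrams(word, n_min=2, n_max=4):
    """Return the set of character n-grams of a word.

    '^' and '$' mark word boundaries so prefixes and suffixes are distinguishable.
    """
    padded = f"^{word}$"
    return {
        padded[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(padded) - n + 1)
    }


def mine_patterns(words, min_support=2):
    """Unsupervised step (assumed): keep n-grams found in >= min_support words."""
    support = Counter()
    for word in words:
        support.update(char_ngrams(word))  # each word counts once per n-gram
    return sorted(p for p, count in support.items() if count >= min_support)


def featurize(word, patterns):
    """Binary feature vector: 1 if the pattern occurs in the word, else 0."""
    grams = char_ngrams(word)
    return [1 if p in grams else 0 for p in patterns]


def train_perceptron(X, y, epochs=20):
    """Supervised step (assumed): a plain perceptron on binary features."""
    weights = [0.0] * len(X[0])
    bias = 0.0
    for _ in range(epochs):
        for x, label in zip(X, y):
            score = bias + sum(w * v for w, v in zip(weights, x))
            error = label - (1 if score > 0 else 0)
            if error:
                weights = [w + error * v for w, v in zip(weights, x)]
                bias += error
    return weights, bias


def predict(word, patterns, weights, bias):
    """Return 1 if the candidate is classified as a neologism, else 0."""
    x = featurize(word, patterns)
    return 1 if bias + sum(w * v for w, v in zip(weights, x)) > 0 else 0


if __name__ == "__main__":
    # Hypothetical toy candidates and labels (1 = neologism), for illustration only.
    candidates = ["googliser", "twitter", "maison", "raison", "liker", "parler"]
    labels = [1, 1, 0, 0, 1, 0]
    patterns = mine_patterns(candidates, min_support=2)
    X = [featurize(w, patterns) for w in candidates]
    weights, bias = train_perceptron(X, labels)
    print(predict("googliser", patterns, weights, bias))
```

Because the features are built from characters rather than from a language-specific lexicon, the same sketch runs unchanged on Cyrillic input, which is one way to read the abstract's remark about multilingual extensions.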

Cite

Lejeune, G., & Cartier, E. (2017). Character based pattern mining for neology detection. In EMNLP 2017 - 1st Workshop on Subword and Character Level Models in NLP, SCLeM 2017 - Proceedings of the Workshop (pp. 25–30). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w17-4103
