We describe the work carried out by the DCU-ADAPT team on the Lexical Normalisation shared task at W-NUT 2015. We train a generalised perceptron to annotate noisy text with edit operations that, when executed, normalise the text. Features include character n-grams, hidden-layer activations of a recurrent neural network language model, character class, and eligibility for editing under the task rules. We combine the predictions of 25 models trained on subsets of the training data by selecting the most likely normalisation according to a character language model. We compare the generalised perceptron to conditional random fields, which we restrict to smaller amounts of training data due to memory constraints. Furthermore, we make a first attempt to verify Chrupała's (2014) hypothesis that a noisy channel model would not be useful here because of the limited amount of training data available for the source language model, i.e. the language model over normalised text.
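To make the edit-operation framing and the character-language-model selection concrete, here is a minimal sketch, not the authors' implementation: a hypothetical inventory of per-character operations (COPY, SUB, DEL, APPEND) is executed to normalise a noisy token, and a simple add-one-smoothed character bigram model stands in for the character language model used to choose among candidate normalisations from the ensemble. All names (apply_edits, CharBigramLM, select_normalisation) and the operation labels are assumptions for illustration.

```python
import math
from collections import Counter

# Hypothetical edit-operation labels, one per input character:
#   ("COPY",)      keep the character as-is
#   ("SUB", s)     replace the character with the string s
#   ("DEL",)       delete the character
#   ("APPEND", s)  keep the character, then insert the string s after it
def apply_edits(token, ops):
    """Execute a sequence of per-character edit operations on a token."""
    out = []
    for ch, op in zip(token, ops):
        if op[0] == "COPY":
            out.append(ch)
        elif op[0] == "SUB":
            out.append(op[1])
        elif op[0] == "DEL":
            pass
        elif op[0] == "APPEND":
            out.append(ch + op[1])
    return "".join(out)

# e.g. a single SUB operation expands "u" into "you"
assert apply_edits("u", [("SUB", "you")]) == "you"

class CharBigramLM:
    """Tiny character bigram LM with add-one smoothing (illustrative only)."""
    def __init__(self, corpus):
        self.bigrams = Counter()
        self.unigrams = Counter()
        for text in corpus:
            padded = "^" + text + "$"
            for a, b in zip(padded, padded[1:]):
                self.bigrams[(a, b)] += 1
                self.unigrams[a] += 1
        self.vocab = len(set(self.unigrams)) + 1

    def logprob(self, text):
        padded = "^" + text + "$"
        return sum(
            math.log((self.bigrams[(a, b)] + 1)
                     / (self.unigrams[a] + self.vocab))
            for a, b in zip(padded, padded[1:])
        )

def select_normalisation(candidates, lm):
    """Pick the candidate normalisation the character LM scores highest."""
    return max(candidates, key=lm.logprob)

lm = CharBigramLM(["you are here", "see you tomorrow"])
print(select_normalisation(["you", "yu", "uu"], lm))  # prints "you"
```

The selection step mirrors the ensembling described above: each of the 25 models proposes a normalisation, and the character language model acts as the tie-breaker across those candidates.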
Wagner, J., & Foster, J. (2015). DCU-ADAPT: Learning Edit Operations for Microblog Normalisation with the Generalised Perceptron. In Proceedings of the Workshop on Noisy User-generated Text (WNUT 2015) (pp. 93–98). Association for Computational Linguistics. https://doi.org/10.18653/v1/W15-4314