DCU-ADAPT: Learning Edit Operations for Microblog Normalisation with the Generalised Perceptron


Abstract

We describe the work carried out by the DCU-ADAPT team on the Lexical Normalisation shared task at W-NUT 2015. We train a generalised perceptron to annotate noisy text with edit operations that, when executed, normalise the text. Features are character n-grams, hidden layer activations of a recurrent neural network language model, character class, and eligibility for editing according to the task rules. We combine the predictions of 25 models trained on subsets of the training data by selecting the most likely normalisation according to a character language model. We compare the generalised perceptron to conditional random fields, which memory constraints restrict to smaller amounts of training data. Furthermore, we make a first attempt to verify Chrupała's (2014) hypothesis that a noisy channel model would not be useful because of the limited amount of training data for the source language model, i.e. the language model over normalised text.
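The abstract does not spell out the edit operation inventory, so the following Python sketch assumes a conventional character-level set of copy, substitute, delete and insert operations; the function name apply_edits and the (operation, argument) encoding are illustrative and not the authors' implementation.

def apply_edits(token, ops):
    """Execute a sequence of (operation, argument) pairs over the
    characters of a noisy token and return the normalised string.
    The operation set here is an assumption for illustration."""
    out = []
    i = 0
    for op, arg in ops:
        if op == "COPY":      # keep the current input character
            out.append(token[i])
            i += 1
        elif op == "SUB":     # replace the current input character with arg
            out.append(arg)
            i += 1
        elif op == "DEL":     # drop the current input character
            i += 1
        elif op == "INS":     # emit arg without consuming any input
            out.append(arg)
    return "".join(out)

# Example: normalising "u" to "you"
print(apply_edits("u", [("INS", "y"), ("INS", "o"), ("COPY", None)]))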
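The ensemble step, as described, reduces to picking whichever of the 25 models' candidate normalisations a character language model prefers. Below is a self-contained sketch in which a toy add-one-smoothed character bigram model stands in for the paper's (unspecified) character LM; the per-bigram length normalisation is a choice of this sketch so that candidates of different lengths compare fairly.

import math
from collections import Counter

class CharBigramLM:
    """Toy add-one-smoothed character bigram model; a stand-in for
    the character language model used in the paper."""
    def __init__(self, corpus, alphabet):
        self.alphabet = alphabet
        self.bigrams = Counter(zip(corpus, corpus[1:]))
        self.unigrams = Counter(corpus)

    def avg_logprob(self, text):
        # Average per-bigram log-probability so that candidates of
        # different lengths are comparable (a choice of this sketch).
        lps = [math.log((self.bigrams[(a, b)] + 1)
                        / (self.unigrams[a] + len(self.alphabet)))
               for a, b in zip(text, text[1:])]
        return sum(lps) / len(lps) if lps else float("-inf")

def select_normalisation(candidates, lm):
    """Pick the candidate normalisation (one per ensemble model)
    that the character LM scores highest."""
    return max(candidates, key=lm.avg_logprob)

lm = CharBigramLM("see you tomorrow morning", "abcdefghijklmnopqrstuvwxyz ")
print(select_normalisation(["tmrw", "tomorrow", "tommorow"], lm))  # tomorrow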
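The noisy channel hypothesis examined at the end of the abstract can be stated compactly: choose the clean candidate c for noisy input n that maximises P(c) * P(n | c), where the source model P(c) is estimated from normalised text. The sketch below shows that decision rule in log space; both scoring functions are hypothetical stand-ins, since the abstract does not describe the actual models.

import math

def noisy_channel_decode(noisy, candidates, source_logprob, channel_logprob):
    """Return argmax_c [ log P(c) + log P(noisy | c) ]. The source model
    P(c) is the component trained on scarce normalised text, which is
    why Chrupala (2014) expected this approach to underperform."""
    return max(candidates,
               key=lambda c: source_logprob(c) + channel_logprob(noisy, c))

# Toy stand-ins: a frequency-based source model and a channel that
# prefers candidates containing the noisy form as a subsequence.
def is_subseq(n, c):
    it = iter(c)
    return all(ch in it for ch in n)

source_counts = {"you": 10, "your": 5}
total = sum(source_counts.values())
source_logprob = lambda c: math.log(source_counts.get(c, 1) / total)
channel_logprob = lambda n, c: math.log(0.5) if is_subseq(n, c) else math.log(1e-6)

print(noisy_channel_decode("u", ["you", "your"],
                           source_logprob, channel_logprob))  # you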

Citation (APA)

Wagner, J., & Foster, J. (2015). DCU-ADAPT: Learning Edit Operations for Microblog Normalisation with the Generalised Perceptron. In ACL-IJCNLP 2015 - Workshop on Noisy User-Generated Text, WNUT 2015 - Proceedings of the Workshop (pp. 93–98). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-4314
