Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling

  • Berend, G.

Abstract

In this paper we propose and carefully evaluate a sequence labeling framework that relies solely on sparse indicator features derived from dense distributed word representations. The proposed model obtains (near) state-of-the-art performance for both part-of-speech tagging and named entity recognition across a variety of languages. It uses only a few thousand sparse coding-derived features and requires no task-specific modification of the word representations. The model also has favorable generalization properties: it retains over 89.8% of its average POS tagging accuracy when trained on just 1.2% of the available training data, i.e. 150 sentences per language.
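The core idea — sparse-coding dense embeddings and using the indices of the nonzero coefficients as discrete indicator features — can be illustrated with a minimal sketch. This is not the paper's implementation (which has its own dictionary-learning setup); it uses scikit-learn's `MiniBatchDictionaryLearning` as a stand-in, with random vectors in place of real word embeddings, and the feature-naming scheme is purely illustrative:

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

rng = np.random.default_rng(0)
# Stand-in for dense word embeddings: 200 "words", 50 dimensions.
embeddings = rng.normal(size=(200, 50))

# Learn an overcomplete dictionary; the L1 penalty (alpha) drives
# most coefficients of each word's code to exactly zero.
learner = MiniBatchDictionaryLearning(
    n_components=100,            # overcomplete: more atoms than dimensions
    alpha=0.5,
    transform_algorithm="lasso_lars",
    transform_alpha=0.5,
    random_state=0,
)
codes = learner.fit_transform(embeddings)  # shape (200, 100), mostly zeros

def indicator_features(code, prefix="F"):
    """Turn a sparse code into discrete indicator features:
    one symbolic feature per nonzero dictionary atom."""
    return [f"{prefix}{i}" for i in np.flatnonzero(code)]

# Each word is now described by a handful of symbolic features that a
# linear sequence labeler (e.g. a CRF) can consume directly.
word0_features = indicator_features(codes[0])
```

A downstream tagger would then use these symbolic features (optionally combined over a context window) in place of the raw dense vectors.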

Citation (APA)

Berend, G. (2017). Sparse Coding of Neural Word Embeddings for Multilingual Sequence Labeling. Transactions of the Association for Computational Linguistics, 5, 247–261. https://doi.org/10.1162/tacl_a_00059
