This paper presents a recurrent neural network (RNN) for part-of-speech (POS) tagging. The RNN variant used is a bidirectional Long Short-Term Memory (BiLSTM) architecture, which addresses two crucial problems: the vanishing gradient phenomenon, which is architecture-specific, and the dependence of POS labels on sequential information both preceding and following them, which is task-specific. The approach is attractive compared to other machine learning approaches in that it requires neither hand-crafted features nor purpose-built resources such as a morphological dictionary. The study presents preliminary results on the BulTreeBank corpus, with a tagset of 153 labels. One of its main contributions is the training of distributed word representations (word embeddings) on a large corpus of Bulgarian text. Another is the complementing of the word-embedding input vectors with distributed morphological representations (suffix embeddings), which are shown to significantly improve the system's accuracy.
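The pipeline the abstract describes (concatenated word and suffix embeddings fed to a bidirectional recurrent layer that scores each token over the 153-label tagset) can be sketched as follows. This is an illustrative sketch only, with made-up dimensions and randomly initialized, untrained weights; a plain tanh recurrence stands in for the paper's LSTM cells, and none of the names below come from the paper itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes; the paper's actual dimensions are not given here.
VOCAB, SUFFIXES, N_TAGS = 100, 20, 153   # 153 matches the BulTreeBank tagset size
D_WORD, D_SUF, D_HID = 8, 4, 16

# Embedding tables: one for words, one for suffixes (the paper's two input types).
E_word = rng.normal(size=(VOCAB, D_WORD))
E_suf = rng.normal(size=(SUFFIXES, D_SUF))

D_IN = D_WORD + D_SUF
# Forward and backward recurrences; a real system would use LSTM cells here.
W_f, U_f = rng.normal(size=(D_HID, D_IN)), rng.normal(size=(D_HID, D_HID))
W_b, U_b = rng.normal(size=(D_HID, D_IN)), rng.normal(size=(D_HID, D_HID))
W_out = rng.normal(size=(N_TAGS, 2 * D_HID))

def tag_probs(word_ids, suffix_ids):
    """Return per-token tag probabilities for one sentence."""
    # Each token's input concatenates its word embedding and its suffix embedding.
    x = np.concatenate([E_word[word_ids], E_suf[suffix_ids]], axis=1)
    T = len(word_ids)
    h_f, h_b = np.zeros((T, D_HID)), np.zeros((T, D_HID))
    h = np.zeros(D_HID)
    for t in range(T):                       # left-to-right (preceding context)
        h = np.tanh(W_f @ x[t] + U_f @ h)
        h_f[t] = h
    h = np.zeros(D_HID)
    for t in reversed(range(T)):             # right-to-left (subsequent context)
        h = np.tanh(W_b @ x[t] + U_b @ h)
        h_b[t] = h
    # Both directions are concatenated before the output layer, so every
    # tag decision sees context on both sides of the token.
    logits = np.concatenate([h_f, h_b], axis=1) @ W_out.T
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

# A three-token "sentence" of arbitrary word and suffix ids.
probs = tag_probs(np.array([3, 14, 15]), np.array([1, 5, 9]))
```

Each row of `probs` is a distribution over the 153 tags for one token; in the trained system these weights would be learned, with the embedding tables pre-trained on the large Bulgarian corpus.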
Popov, A. (2016). Deep learning architecture for part-of-speech tagging with word and suffix embeddings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9883 LNAI, pp. 68–77). Springer Verlag. https://doi.org/10.1007/978-3-319-44748-3_7