Part-of-speech tagging using multiview learning

Kyungtae Lim; Jungyeul Park

Journal Article (Open Access)

IEEE Access (2020) 8 195184-195196

DOI: 10.1109/ACCESS.2020.3033979

Abstract

In natural language processing, character-level representations are vector representations of individual characters. Recent work on character-level representations has focused on enriching subword information by stacking deep neural models. Ideally, combining several character-level representations helps capture different aspects of subword information. However, this approach has often failed in the past, mainly because the representations were traditionally combined by simple concatenation. In this study, we explore different character-level modeling techniques. During the learning process, long short-term memory (LSTM)-based character representations can introduce different views for a part-of-speech (POS) tagger. After investigating two previously reported techniques, we propose two additional extended methods: (1) a multihead-attention character-level representation for capturing several aspects of subword information, and (2) an optimal structure for training two different character-level embeddings based on joint learning. We evaluate our results on the POS tagging dataset of the Conference on Computational Natural Language Learning (CoNLL) 2018 shared task on Universal Dependencies. We show that our method substantially improves POS tagging results for many morphologically rich languages, where character information matters more. Moreover, we compare the performance of our model with recent state-of-the-art POS taggers trained with language models such as Bidirectional Encoder Representations from Transformers (BERT) and Embeddings from Language Models (ELMo); our multiview tagger shows better results for nine languages. The proposed character model shows significant improvements in Ancient Greek, with average gains of 8.89 points in accuracy compared to the previous word representation model. Therefore, our empirical experiments indicate that, in terms of performance, character-level representations are more important than word representations for morphologically rich languages.
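To make the first proposed method more concrete, the following is a minimal sketch of multihead attention pooling over the characters of a word. It is not the authors' architecture: the abstract does not give the details, so everything here is an assumption made for illustration, including the function name `multihead_char_representation`, the use of one query vector per head, the random vectors standing in for learned LSTM character states, and the toy dimensions.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def multihead_char_representation(word, char_vectors, head_queries):
    """Pool a non-empty word's character vectors once per head, then concatenate.

    Hypothetical helper: each head owns one query vector and produces its own
    attention-weighted average of the character vectors, so different heads
    can emphasize different characters (e.g. a prefix versus a suffix).
    """
    chars = [char_vectors[c] for c in word]
    dim = len(chars[0])
    pooled = []
    for query in head_queries:
        # Scaled dot-product score of every character against this head's query.
        weights = softmax([dot(query, c) / math.sqrt(dim) for c in chars])
        # Attention-weighted sum of the character vectors for this head.
        head = [sum(w * c[i] for w, c in zip(weights, chars)) for i in range(dim)]
        pooled.extend(head)
    return pooled


# Toy setup (assumed values): random vectors replace trained parameters.
rng = random.Random(0)
DIM, HEADS = 4, 2
alphabet = "abcdefghijklmnopqrstuvwxyz"
char_vectors = {c: [rng.uniform(-1, 1) for _ in range(DIM)] for c in alphabet}
head_queries = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(HEADS)]

vec = multihead_char_representation("tagging", char_vectors, head_queries)
print(len(vec))  # HEADS * DIM = 8
```

In a real tagger the resulting vector would typically be combined with a word embedding before being passed to the sequence model; how the paper combines its views is described only as joint learning, so that step is left out here.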

Cite (APA)

Lim, K., & Park, J. (2020). Part-of-speech tagging using multiview learning. IEEE Access, 8, 195184–195196. https://doi.org/10.1109/ACCESS.2020.3033979
