Named entity recognition for Vietnamese spoken texts and its application in smart mobile voice interaction

Abstract

Named entity recognition (NER) for written documents has been studied intensively over the past decades. However, NER for spoken texts is still at an early stage. Spoken texts pose several challenges: they are usually less grammatical, entirely in lowercase, and often lack punctuation; continuous text chunks such as email addresses and hyperlinks are transcribed as discrete tokens; and numeric expressions are sometimes transcribed in alphabetic form. These characteristics are real obstacles to spoken text understanding. In this paper, we propose a lightweight machine learning model for NER on Vietnamese spoken texts that aims to overcome those problems. To make the model robust, we incorporated a variety of rich features, including sophisticated regular expressions and various look-up dictionaries. Unlike previous work on NER, our model does not rely on word boundary or part-of-speech information, which is expensive and time-consuming to prepare. We conducted a careful evaluation on a medium-sized dataset of mobile voice interactions and achieved an average F1 of 92.06, a strong result for such a difficult task. In addition, we kept the model compact and fast so that it can be integrated into a mobile virtual assistant for Vietnamese.
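
To make the feature idea in the abstract concrete, the following is a minimal sketch of how regular-expression and look-up-dictionary features might be computed per token without word segmentation or part-of-speech tags. It is not the authors' implementation: the abstract does not list the actual patterns, dictionaries, or feature names, so `PATTERNS`, `DICTIONARIES`, `token_features`, and `sentence_features` are all hypothetical names chosen for illustration.

```python
import re

# Hypothetical patterns; the paper's actual regular expressions
# are not given in the abstract.
PATTERNS = {
    "is_digits": re.compile(r"^\d+$"),
    "is_time_like": re.compile(r"^\d{1,2}(h|g|:)\d{0,2}$"),
}

# Hypothetical look-up dictionaries (tiny stand-ins for real gazetteers).
DICTIONARIES = {
    "number_word": {"một", "hai", "ba", "bốn", "năm", "sáu", "bảy", "tám", "chín", "mười"},
    "domain_part": {"www", "com", "vn", "org", "net"},
}


def token_features(tokens, i):
    """Build a feature dict for tokens[i] from surface cues only.

    No word-boundary or part-of-speech information is used.
    """
    tok = tokens[i]
    feats = {"w": tok}
    # Regular-expression features: one boolean flag per matching pattern.
    for name, pattern in PATTERNS.items():
        if pattern.match(tok):
            feats[name] = True
    # Dictionary features: one boolean flag per dictionary containing the token.
    for name, words in DICTIONARIES.items():
        if tok in words:
            feats["in_" + name] = True
    # Neighbouring tokens as context, with sentence-boundary markers.
    feats["w-1"] = tokens[i - 1] if i > 0 else "<s>"
    feats["w+1"] = tokens[i + 1] if i + 1 < len(tokens) else "</s>"
    return feats


def sentence_features(text):
    """Lowercase and whitespace-split the text, then featurize each token."""
    tokens = text.lower().split()
    return [token_features(tokens, i) for i in range(len(tokens))]
```

Feature dictionaries of this shape could be passed to a sequence labeller of one's choice; the abstract does not say which learning model the authors used, so that step is left out here.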

Cite

Tran, P. N., Ta, V. D., Truong, Q. T., Duong, Q. V., Nguyen, T. T., & Phan, X. H. (2016). Named entity recognition for Vietnamese spoken texts and its application in smart mobile voice interaction. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9621, pp. 170–180). Springer Verlag. https://doi.org/10.1007/978-3-662-49381-6_17
