Supervised learning in contemporary research on Vietnamese named entity recognition requires a large hand-annotated corpus, which is challenging to obtain. We therefore propose a hybrid approach that combines pattern extraction with semi-supervised learning. A rule-based method generates patterns automatically. Part-of-speech tagging, lexical diversity, and chunking are explored to define the pattern-extraction rules, which are used to identify potential named entities. Semi-supervised learning then trains on a small number of seed named entities to categorize the named entities in the extracted patterns. In experiments, our approach shows a good increase in system accuracy compared with other approaches for Vietnamese. © 2012 Springer-Verlag.
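The two stages described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the proper-noun tag `Np`, the single left-context-word feature, the majority-vote rule, and the toy sentences are all assumptions made for illustration, and the sketch uses only part-of-speech tags, leaving out lexical diversity and chunking.

```python
from collections import Counter, defaultdict

# Hypothetical proper-noun tag; a real Vietnamese tagger may use another tagset.
PROPER_NOUN_TAG = "Np"


def extract_candidates(tagged_sentence):
    """Rule-based step: return (left_context, entity) pairs, one for each
    maximal run of proper-noun tokens in a list of (word, tag) pairs."""
    candidates = []
    i, n = 0, len(tagged_sentence)
    while i < n:
        if tagged_sentence[i][1] == PROPER_NOUN_TAG:
            j = i
            while j < n and tagged_sentence[j][1] == PROPER_NOUN_TAG:
                j += 1
            entity = " ".join(word for word, _ in tagged_sentence[i:j])
            # "<s>" marks a candidate at the start of the sentence.
            left = tagged_sentence[i - 1][0].lower() if i > 0 else "<s>"
            candidates.append((left, entity))
            i = j
        else:
            i += 1
    return candidates


def bootstrap(candidates, seeds, rounds=3):
    """Semi-supervised step: grow an {entity: label} map from a few seeds
    by letting each left-context word vote for the labels it co-occurs with."""
    labels = dict(seeds)
    for _ in range(rounds):
        context_votes = defaultdict(Counter)
        for left, entity in candidates:
            if entity in labels:
                context_votes[left][labels[entity]] += 1
        new_labels = {}
        for left, entity in candidates:
            if entity not in labels and left in context_votes:
                new_labels[entity] = context_votes[left].most_common(1)[0][0]
        if not new_labels:
            break  # nothing new was learned in this round
        labels.update(new_labels)
    return labels


# Toy, hand-tagged sentences (illustrative data, not from the paper).
sentences = [
    [("ông", "N"), ("Nam", "Np"), ("đến", "V"), ("Huế", "Np")],
    [("ông", "N"), ("Minh", "Np"), ("đến", "V"), ("Vinh", "Np")],
]
candidates = [c for s in sentences for c in extract_candidates(s)]
labels = bootstrap(candidates, seeds={"Nam": "PER", "Huế": "LOC"})
```

With the two seeds above, the context words "ông" and "đến" become associated with the person and location labels, so the unlabeled candidates "Minh" and "Vinh" inherit those categories. A realistic system would need richer patterns and a confidence threshold to limit the drift that simple bootstrapping is prone to.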
CITATION STYLE
Vo, D. T., & Ock, C. Y. (2012). A hybrid approach of pattern extraction and semi-supervised learning for Vietnamese named entity recognition. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7653 LNAI, pp. 83–93). https://doi.org/10.1007/978-3-642-34630-9_9