In this paper, we first build a manually annotated named entity corpus for Cyrillic Mongolian. The annotation types cover person names, location names, organization names, and other proper names. We then use Conditional Random Fields (CRF) as the classifier and design several categories of features for Mongolian, including orthographic, morphological, gazetteer, syllable, and word cluster features. Experimental results show that every proposed feature improves overall system performance, with stem features contributing the largest gain. Finally, the model achieves its best performance when all features are combined.
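As a rough illustration of how the feature categories named above might be encoded per token for a CRF, the following Python sketch builds one feature dictionary per word. It is not the authors' implementation: the function names, the suffix features (a crude stand-in for morphological and stem analysis), the vowel count (a crude stand-in for syllable features), the toy gazetteer, and the cluster map are all assumptions made for this example.

```python
# Illustrative sketch only. Feature names, the gazetteer, and the cluster
# map are hypothetical and are not taken from the paper.

# Cyrillic Mongolian vowel letters, used as a crude proxy for syllable count.
CYRILLIC_VOWELS = set("аэиоуөүяеёюы")


def token_features(tokens, i, gazetteer, clusters):
    """Build a feature dict for tokens[i], with one-word left/right context."""
    word = tokens[i]
    lower = word.lower()
    feats = {
        "bias": 1.0,
        "lower": lower,
        # Orthographic features
        "is_capitalized": word[:1].isupper(),
        "is_all_caps": word.isupper(),
        "has_digit": any(ch.isdigit() for ch in word),
        "has_hyphen": "-" in word,
        # Suffix features (assumed stand-in for morphological analysis)
        "suffix2": lower[-2:],
        "suffix3": lower[-3:],
        # Gazetteer feature: membership in a known-name list
        "in_gazetteer": lower in gazetteer,
        # Syllable proxy: number of vowel letters
        "vowel_count": sum(ch in CYRILLIC_VOWELS for ch in lower),
        # Word cluster feature: cluster id, or "UNK" if unseen
        "cluster": clusters.get(lower, "UNK"),
    }
    if i == 0:
        feats["BOS"] = True  # beginning of sentence
    else:
        feats["prev_lower"] = tokens[i - 1].lower()
    if i == len(tokens) - 1:
        feats["EOS"] = True  # end of sentence
    else:
        feats["next_lower"] = tokens[i + 1].lower()
    return feats


def sentence_features(tokens, gazetteer, clusters):
    """Return the list of per-token feature dicts for one sentence."""
    return [token_features(tokens, i, gazetteer, clusters)
            for i in range(len(tokens))]


# Toy inputs, invented for this example.
example_tokens = ["Бат", "Улаанбаатар", "хотод", "ажилладаг"]
example_gazetteer = {"улаанбаатар"}
example_clusters = {"хотод": "0110"}
example_feats = sentence_features(example_tokens, example_gazetteer,
                                  example_clusters)
```

A list of such dictionaries per sentence, paired with a label sequence, is the kind of input that CRF toolkits such as sklearn-crfsuite accept. Comparing models trained with and without one feature group is one way to measure each group's contribution, as the abstract reports.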
CITATION STYLE
Wang, W., Bao, F., & Gao, G. (2016). Cyrillic Mongolian named entity recognition with rich features. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10102, pp. 497–505). Springer Verlag. https://doi.org/10.1007/978-3-319-50496-4_42