The paper presents the development of a dependency treebank for the Serbian language, intended for various applications in natural language processing, primarily natural language understanding within human-machine dialogue. The treebank is built by adding syntactic annotation to the part-of-speech (POS) tagged AlfaNum Text Corpus of Serbian. The annotation follows the standards set by the Prague Dependency Treebank, which has been adopted as a starting point for the development of treebanks for several related languages in the region. Initial dependency parsing experiments on the currently annotated portion of the corpus, which contains 1,148 sentences (7,117 words), yielded relatively low parsing accuracy, as was expected from a preliminary experiment on a treebank of this size.
Jakovljević, B., Kovačević, A., Sečujski, M., & Marković, M. (2014). A dependency treebank for Serbian: Initial experiments. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8773, pp. 42–49). Springer Verlag. https://doi.org/10.1007/978-3-319-11581-8_5