Persian in MULTEXT-east framework

11Citations
Citations of this article
12Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Farsi, also known as Persian, is the official language of Iran, Tajikistan and one of the two main languages spoken in Afghanistan. It is an Indo-European agglutinating language, written in Arabic script. This paper presents the first step in creating Farsi basic language resources kit. This Step comprises the specifications for morphosyntactic encoding, which is based on the EAGLES/MULTEXT model and specific resources of MULTEXT-East. This paper introduces the language i.e. Farsi, with an emphasis on its writing system and morphological properties, and its specifications. Two other important issues introduced in this paper are; one, a novel Part of Speech (PoS) categorization and, the other, a unified orthography of Farsi in digital environment. A lexicon and an annotated corpus are under preparation. © Springer-Verlag Berlin Heidelberg 2006.

Cite

CITATION STYLE

APA

QasemiZadeh, B., & Rahimi, S. (2006). Persian in MULTEXT-east framework. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4139 LNAI, pp. 541–551). Springer Verlag. https://doi.org/10.1007/11816508_54

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free