Identifying the correct root of an ambiguous Hebrew word

Yaakov Hacohen-Kerner; Ofir Tzvi Erlich

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8003 36-53

DOI: 10.1007/978-3-642-45327-4_3

Citations: 1

Readers: 4

Abstract

Stemming is useful for various natural language processing tasks, such as document indexing and text classification, so identifying the correct root of a given word is important. For Hebrew this is not a trivial task, because of the complexity of Hebrew morphology and orthography. Many Hebrew words are ambiguous in the sense that each can be derived from several possible roots. In a specific context, however, a given word has exactly one correct root or no root at all. We have developed a variety of features for finding the correct root of an ambiguous Hebrew word. These features fall into three distinct groups: root-based features, conjugation-based features, and statistical features. Several common machine learning methods were tested in order to find a successful integration of the features. The best result was achieved by Naïve Bayes, with about 87% accuracy.
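
The abstract does not list the individual features or describe the training setup, so the following is only an illustrative sketch of the general idea: score each candidate root of an ambiguous word with a Naive Bayes model and keep the most probable one. The feature names (root_frequency, pattern_match, context_hit), the toy training rows, the placeholder root labels, and the 0.5 threshold for "no root at all" are all assumptions made for this example and are not taken from the paper.

```python
import math
from collections import Counter, defaultdict


class CategoricalNaiveBayes:
    """Tiny Naive Bayes over categorical features with add-one smoothing."""

    def fit(self, rows, labels):
        self.labels = sorted(set(labels))
        self.label_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[(feature, label)][value] -> number of training rows
        self.value_counts = defaultdict(Counter)
        self.values = defaultdict(set)  # distinct values seen per feature
        for row, label in zip(rows, labels):
            for name, value in row.items():
                self.value_counts[(name, label)][value] += 1
                self.values[name].add(value)
        return self

    def log_scores(self, row):
        """Unnormalized log P(label) + sum of log P(value | label)."""
        scores = {}
        for label in self.labels:
            score = math.log(self.label_counts[label] / self.total)
            for name, value in row.items():
                num = self.value_counts[(name, label)][value] + 1
                den = self.label_counts[label] + len(self.values[name])
                score += math.log(num / den)
            scores[label] = score
        return scores

    def prob(self, row, label):
        """Posterior probability of `label`, normalized over all labels."""
        scores = self.log_scores(row)
        top = max(scores.values())
        norm = sum(math.exp(s - top) for s in scores.values())
        return math.exp(scores[label] - top) / norm


# Hypothetical training data: one row per (word occurrence, candidate root),
# labeled True when the candidate was the correct root in that context.
TRAIN = [
    ({"root_frequency": "high", "pattern_match": True, "context_hit": True}, True),
    ({"root_frequency": "high", "pattern_match": True, "context_hit": False}, True),
    ({"root_frequency": "low", "pattern_match": True, "context_hit": True}, True),
    ({"root_frequency": "low", "pattern_match": False, "context_hit": False}, False),
    ({"root_frequency": "high", "pattern_match": False, "context_hit": False}, False),
    ({"root_frequency": "low", "pattern_match": True, "context_hit": False}, False),
]


def pick_root(model, candidates, threshold=0.5):
    """Return the most probable candidate root, or None if none is likely.

    The threshold is an assumption used here to model the "no root at all"
    case mentioned in the abstract.
    """
    best = max(candidates, key=lambda root: model.prob(candidates[root], True))
    if model.prob(candidates[best], True) < threshold:
        return None
    return best


model = CategoricalNaiveBayes().fit(
    [row for row, _ in TRAIN], [label for _, label in TRAIN]
)

# Two placeholder candidate roots for one ambiguous word in one context.
candidates = {
    "ROOT_A": {"root_frequency": "high", "pattern_match": True, "context_hit": True},
    "ROOT_B": {"root_frequency": "low", "pattern_match": False, "context_hit": False},
}
best = pick_root(model, candidates)
print(best)
```

Scoring each candidate separately and taking the maximum is one simple way to turn a binary correct/incorrect classifier into a choice among roots; the paper may frame the classification problem differently.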

Cite

Hacohen-Kerner, Y., & Erlich, O. T. (2014). Identifying the correct root of an ambiguous Hebrew word. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8003, 36–53. https://doi.org/10.1007/978-3-642-45327-4_3
