Morphemes as Necessary Concept for Structures Discovery from Untagged Corpora

Herve Dejean

Conference Proceedings

Proceedings of the Joint Conference on New Methods in Language Processing and Computational Natural Language Learning, NeMLaP/CoNLL 1998 (1998) 295-298

DOI: 10.3115/1603899.1603952

Abstract

This paper gives an overview of a method for discovering syntactic structures from untagged corpora. The method has three main steps: first, the discovery of the grammatical morphemes of the language; then, the construction of chunks, which are a multilingual conceptual level that bypasses the limping notion of words; and finally, the discovery of the relations between chunks. We give an overview of the different procedures and describe the discovery of morphemes in particular. This operation is itself divided into three steps: the discovery of the most frequent morphemes of the language, then the discovery of the other morphemes, and finally the segmentation of the words of the corpus. We conclude with the correction procedure, which requires the chunk level. The concepts and algorithms were tested on twenty natural languages, including English, German, Turkish, Vietnamese, Swahili, Finnish, Latin, and Indonesian.
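
The abstract names the morpheme-discovery steps without giving their details. As a rough illustration only, the sketch below shows one way the first and third steps (finding frequent morphemes, then segmenting words) could look for suffixes. It is not the paper's algorithm: the function names `frequent_suffixes` and `segment`, the parameters `max_len`, `min_stem_len`, and `top_k`, and the heuristic itself (count an ending only when the remaining stem is also an attested word) are all assumptions made for this example.

```python
from collections import Counter


def frequent_suffixes(words, max_len=4, min_stem_len=3, top_k=5):
    """Rank candidate suffixes from an untagged word list.

    Assumed heuristic (not from the paper): an ending counts as a
    candidate suffix each time stripping it from a word type leaves
    a stem that is itself an attested word type.
    """
    vocab = set(words)
    counts = Counter()
    for word in vocab:
        for n in range(1, max_len + 1):
            stem = word[:-n]
            if len(stem) >= min_stem_len and stem in vocab:
                counts[word[-n:]] += 1
    # Keep the top_k most frequently attested endings.
    return [suffix for suffix, _ in counts.most_common(top_k)]


def segment(word, suffixes, min_stem_len=3):
    """Split off the longest known suffix, if the stem stays long enough.

    Returns a (stem, suffix) pair; the suffix is "" when nothing matches.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)], suffix
    return word, ""
```

With a toy word list such as `walk`, `walks`, `walked`, `walking` (plus the same forms of `talk` and `jump`), `frequent_suffixes` finds the endings `s`, `ed`, and `ing`, and `segment("walking", ...)` then splits the word into `walk` and `ing`. A real system would also need prefixes, less frequent morphemes, and the chunk-level correction step the abstract mentions, none of which this sketch attempts.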

Cite

Dejean, H. (1998). Morphemes as Necessary Concept for Structures Discovery from Untagged Corpora. In Proceedings of the Joint Conference on New Methods in Language Processing and Computational Natural Language Learning, NeMLaP/CoNLL 1998 (pp. 295–298). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1603899.1603952
