Semi-automatic term extraction for the African languages, with special reference to Northern Sotho

Elsabé Taljard; Gilles Maurice De Schryver

Journal Article (Open Access)

Lexikos (2002) 12: 44-74

DOI: 10.5788/12-0-760

Abstract

Worldwide, the semi-automatic extraction of terms from corpora is becoming the norm for the compilation of terminology lists, term banks or dictionaries for special purposes. If African-language terminologists are willing to take their rightful place in the new millennium, they must not only take cognisance of this trend but also be ready to implement the new technology. This article advocates that the best way to do both at this stage is to opt for computationally straightforward alternatives (i.e. use 'raw corpora') and to make use of widely available software tools (e.g. WordSmith Tools). The main aim is therefore to discover whether the semi-automatic extraction of terminology from untagged and unmarked running text by means of basic corpus query software is feasible for the African languages. To answer this question, a full-blown case study revolving around Northern Sotho linguistic texts is discussed in great detail. The computational results are compared throughout with the outcome of a manual excerption, and vice versa. Attention is given to the concepts 'recall' and 'precision'; different approaches are suggested for the treatment of single-word terms versus multi-word terms; and the various findings are summarised in a Linguistics Terminology lexicon presented as an Appendix.
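The abstract compares the computational results with a manual excerption in terms of 'recall' and 'precision'. As a minimal sketch of how such a comparison is commonly computed (not the authors' own procedure), the Python snippet below treats the manually excerpted terms as the reference list and the software output as the candidate list. The function name `precision_recall` and the placeholder items such as `term_a` are hypothetical and purely illustrative; they are not data from the article.

```python
def precision_recall(extracted, reference):
    """Compare extracted candidate terms with a manually excerpted reference list.

    precision = correct candidates / all extracted candidates
    recall    = correct candidates / all reference terms
    """
    extracted_set = set(extracted)
    reference_set = set(reference)
    correct = extracted_set & reference_set  # candidates that are genuine terms

    # Guard against empty lists to avoid division by zero.
    precision = len(correct) / len(extracted_set) if extracted_set else 0.0
    recall = len(correct) / len(reference_set) if reference_set else 0.0
    return precision, recall


# Hypothetical placeholder data, not taken from the article.
manual_terms = ["term_a", "term_b", "term_c", "term_d"]
extracted_candidates = ["term_a", "term_b", "word_x", "term_d", "word_y"]

precision, recall = precision_recall(extracted_candidates, manual_terms)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this toy example three of the five candidates are genuine terms and three of the four reference terms are found. High precision means little noise in the extracted list, while high recall means few terms were missed.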

Citation (APA)

Taljard, E., & De Schryver, G. M. (2002). Semi-automatic term extraction for the African languages, with special reference to Northern Sotho. Lexikos, 12, 44–74. https://doi.org/10.5788/12-0-760
