This paper describes the development of Lia, a memory-based lemmatiser for Afrikaans. It begins with a brief overview of Afrikaans lemmatisation, which is treated here as a simplified form of morphological analysis. This overview is followed by an introduction to memory-based learning, the machine learning technique used to develop the lemmatiser. The deployment of Lia is then discussed, with specific emphasis on the format of the training and testing data. An evaluation shows that Lia achieves a linguistic accuracy of over 90%. The paper concludes with ideas for future work to improve the linguistic accuracy of the lemmatiser.
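To make the idea of memory-based lemmatisation concrete, the following is a minimal sketch and not Lia's actual implementation. The abstract does not specify the instance format, feature set, or classifier settings, so everything here is an assumption made for illustration: the six-character right-aligned window, the "strip N characters, append a string" class labels, the simple overlap similarity, and all names (`features`, `make_rule`, `MemoryBasedLemmatiser`). The sketch handles suffix changes only.

```python
from collections import Counter

WINDOW = 6  # assumed number of word-final characters used as features


def features(word, window=WINDOW):
    """Right-align the word in a fixed-width tuple of characters, padded with '_'."""
    padded = "_" * window + word.lower()
    return tuple(padded[-window:])


def make_rule(word, lemma):
    """Encode the word-to-lemma change as (chars to strip from the end, chars to append)."""
    i = 0
    while i < min(len(word), len(lemma)) and word[i] == lemma[i]:
        i += 1
    return (len(word) - i, lemma[i:])


def apply_rule(word, rule):
    """Apply a (strip, append) rule to a word."""
    strip, append = rule
    return word[:max(0, len(word) - strip)] + append


class MemoryBasedLemmatiser:
    """Hypothetical k-nearest-neighbour lemmatiser: training only stores instances."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []  # list of (feature tuple, rule) pairs

    def train(self, pairs):
        for word, lemma in pairs:
            rule = make_rule(word.lower(), lemma.lower())
            self.memory.append((features(word), rule))

    def lemmatise(self, word):
        target = features(word)

        def overlap(instance):
            # Similarity = number of feature positions with identical characters.
            return sum(a == b for a, b in zip(target, instance[0]))

        nearest = sorted(self.memory, key=overlap, reverse=True)[: self.k]
        votes = Counter(rule for _, rule in nearest)
        best_rule = votes.most_common(1)[0][0]
        return apply_rule(word.lower(), best_rule)


# Tiny illustrative training set of Afrikaans plural/singular pairs.
lia_sketch = MemoryBasedLemmatiser(k=1)
lia_sketch.train([
    ("katte", "kat"),
    ("honde", "hond"),
    ("boeke", "boek"),
    ("tafels", "tafel"),
])
print(lia_sketch.lemmatise("perde"))
print(lia_sketch.lemmatise("lepels"))
```

The design choice worth noting is that no model is abstracted from the data: classification of an unseen word is deferred until lookup time, when the most similar stored instances vote on which transformation to apply.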
CITATION STYLE
Groenewald, H. J. (2006). Educating Lia: The development of a linguistically accurate memory-based lemmatiser for Afrikaans. IFIP International Federation for Information Processing, 228, 431–440. https://doi.org/10.1007/978-0-387-44641-7_45