This paper presents a part-of-speech tagger based on a genetic algorithm which, after evolving a population of candidate tag sequences for the words in the text, selects the best individual as the solution. The paper describes the main issues arising in the algorithm: the chromosome representation, the evaluation of individuals, and the design of the genetic operators for crossover and mutation. A probabilistic model based on the context of each word (the tags of the surrounding words) has been devised to define the fitness function. The model has been implemented and several issues have been investigated: the size of the training corpus, the effect of the context size, and the parameters of the evolutionary algorithm, such as population size and crossover and mutation rates. The accuracy obtained with this method is comparable to that of other probabilistic approaches, but the evolutionary algorithm obtains its results more efficiently.
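The approach summarized above can be illustrated with a minimal sketch. It is not the paper's implementation: the lexicon, the probability table, the left-context-only fitness, the elitism step, and every function name (`fitness`, `crossover`, `mutate`, `evolve`) are assumptions made for illustration. In the paper the context model is estimated from a training corpus and may also use the tags to the right of each word.

```python
import math
import random

# Hypothetical toy lexicon: the candidate tags allowed for each word.
LEXICON = {
    "the": ["DET"],
    "can": ["NOUN", "VERB", "AUX"],
    "rusts": ["VERB", "NOUN"],
}

# Made-up probabilities of a tag given the tag to its left ("<s>" = start).
# A real tagger would estimate these from a tagged training corpus.
LEFT_CONTEXT = {
    ("<s>", "DET"): 0.6,
    ("DET", "NOUN"): 0.7,
    ("DET", "VERB"): 0.05,
    ("DET", "AUX"): 0.05,
    ("NOUN", "VERB"): 0.5,
    ("NOUN", "NOUN"): 0.2,
    ("VERB", "NOUN"): 0.3,
    ("VERB", "VERB"): 0.05,
    ("AUX", "VERB"): 0.6,
    ("AUX", "NOUN"): 0.05,
}
FLOOR = 1e-3  # assumed smoothing value for unseen contexts


def fitness(tags):
    """Sum of log-probabilities of each tag given its left neighbour."""
    score, prev = 0.0, "<s>"
    for tag in tags:
        score += math.log(LEFT_CONTEXT.get((prev, tag), FLOOR))
        prev = tag
    return score


def random_individual(words, rng):
    """A chromosome is one tag per word, drawn from the word's candidates."""
    return [rng.choice(LEXICON[w]) for w in words]


def crossover(a, b, rng):
    """One-point crossover: swap the tails of two tag sequences."""
    if len(a) < 2:
        return a[:], b[:]
    point = rng.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(words, tags, rate, rng):
    """Re-draw each gene from its word's candidate tags with probability rate."""
    return [rng.choice(LEXICON[w]) if rng.random() < rate else t
            for w, t in zip(words, tags)]


def evolve(words, pop_size=20, generations=30,
           crossover_rate=0.5, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    population = [random_individual(words, rng) for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0][:]]          # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            if rng.random() < crossover_rate:
                a, b = crossover(a, b, rng)
            children.append(mutate(words, a, mutation_rate, rng))
            if len(children) < pop_size:
                children.append(mutate(words, b, mutation_rate, rng))
        population = children
    return max(population, key=fitness)
```

Calling `evolve(["the", "can", "rusts"])` returns one tag per input word. The population size and the crossover and mutation rates are exposed as parameters because these are the settings the paper reports investigating; the default values here are arbitrary.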
Citation
Araujo, L. (2002). Part-of-speech tagging with evolutionary algorithms. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2276, pp. 230–239). Springer Verlag. https://doi.org/10.1007/3-540-45715-1_21