Feature selection analysis for maximum entropy-based WSD

Abstract

Supervised, corpus-based Word Sense Disambiguation (WSD) systems learn from a set of previously classified linguistic contexts. To train such a system, it is usual to define a set of functions that report the linguistic features present in each example. It is also usual to look for the same kind of information for every word, or at least for all words of the same part of speech. This paper presents a study of feature selection for a supervised, corpus-based WSD method built on Maximum Entropy conditional probability models. For a few words selected from the DSO corpus, the behaviour of several types of features has been analyzed in order to identify their contribution to gains in accuracy and to determine the influence of sense frequency in that corpus. The paper shows that not all words are best disambiguated with the same combination of features. An improved definition of features that increases efficiency is also presented.
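
The idea of per-word feature selection can be illustrated with a minimal sketch. It is not the authors' implementation: the training examples are invented (they are not taken from the DSO corpus), the feature type names `word_window` and `bag_of_words` are hypothetical, and the weights are fitted with plain gradient ascent on the conditional log-likelihood, which is only one of several ways to estimate a maximum entropy model. Each feature is a binary function of the context and a sense, and the model scores a sense by summing the weights of its active features before normalizing.

```python
import math
from collections import defaultdict

# Toy, invented examples (NOT from the DSO corpus): (tokens, target index, sense).
TRAIN = [
    (["bank", "pays", "interest", "rates"], 2, "money"),
    (["low", "interest", "rates", "today"], 1, "money"),
    (["shows", "interest", "in", "music"], 1, "attention"),
    (["great", "interest", "in", "art"], 1, "attention"),
]
SENSES = ["money", "attention"]


def extract_features(tokens, position, feature_types):
    """Return the set of active binary features for the target word."""
    feats = set()
    if "word_window" in feature_types:  # hypothetical feature type name
        for offset in (-1, 1):
            i = position + offset
            if 0 <= i < len(tokens):
                feats.add(f"w{offset:+d}={tokens[i]}")
    if "bag_of_words" in feature_types:  # hypothetical feature type name
        for i, word in enumerate(tokens):
            if i != position:
                feats.add(f"bow={word}")
    return feats


def predict_proba(weights, feats, senses):
    """p(sense | context) = exp(sum of active weights) / normalizer."""
    scores = {s: sum(weights.get((f, s), 0.0) for f in feats) for s in senses}
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {s: math.exp(v - top) for s, v in scores.items()}
    z = sum(exps.values())
    return {s: e / z for s, e in exps.items()}


def train(data, feature_types, senses, epochs=200, lr=0.5):
    """Fit weights by batch gradient ascent on the conditional log-likelihood."""
    examples = [(extract_features(t, p, feature_types), g) for t, p, g in data]
    weights = defaultdict(float)
    for _ in range(epochs):
        grad = defaultdict(float)
        for feats, gold in examples:
            probs = predict_proba(weights, feats, senses)
            for f in feats:
                for s in senses:
                    # observed feature count minus expected feature count
                    grad[(f, s)] += (1.0 if s == gold else 0.0) - probs[s]
        for key, g in grad.items():
            weights[key] += lr * g / len(examples)
    return weights


def accuracy(weights, data, feature_types, senses):
    """Fraction of examples whose most probable sense matches the label."""
    hits = 0
    for tokens, position, gold in data:
        probs = predict_proba(
            weights, extract_features(tokens, position, feature_types), senses
        )
        hits += max(probs, key=probs.get) == gold
    return hits / len(data)


weights = train(TRAIN, ("word_window",), SENSES)
probs = predict_proba(
    weights,
    extract_features(["high", "interest", "rates"], 1, ("word_window",)),
    SENSES,
)
best = max(probs, key=probs.get)
```

Training one model per candidate feature combination, for example `("word_window",)`, `("bag_of_words",)`, and both together, and comparing `accuracy` on held-out examples for each target word is one way to check whether different words prefer different feature sets.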

Cite

Suárez, A., & Palomar, M. (2002). Feature selection analysis for maximum entropy-based WSD. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 2276, pp. 146–155). Springer Verlag. https://doi.org/10.1007/3-540-45715-1_12
