Exploring extensive linguistic feature sets in near-synonym lexical choice

Mari Sanna Paukkeri; Jaakko Väyrynen; Antti Arppe

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012) 7182 LNCS(PART 2) 1-12

DOI: 10.1007/978-3-642-28601-8_1

Abstract

In the near-synonym lexical choice task, the best alternative out of a set of near-synonyms is selected to fill a lexical gap in a text. We experiment with an extensive set of over 650 linguistic features to represent the context of a word, and with a range of machine learning approaches to the lexical choice task. We extend previous work by experimenting with unsupervised and semi-supervised methods, and use automatic feature selection to cope with the problems arising from the rich feature set. It is natural to think that linguistic analysis of the word context would yield almost perfect performance in the task, but we show that too many features, even linguistic ones, introduce noise and make the task difficult for unsupervised and semi-supervised methods. We also show that purely syntactic features play the biggest role in the performance, but that certain semantic and morphological features are also needed. © 2012 Springer-Verlag.
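
The abstract does not say which feature selection algorithm or classifiers were used, so the following is only a minimal sketch of the general idea, assuming greedy forward selection over binary context features and a simple co-occurrence count classifier. The candidate words, feature names, and toy data are all hypothetical and are not taken from the paper.

```python
from collections import Counter, defaultdict


def train_counts(samples, features):
    """Count how often each selected feature co-occurs with each candidate word."""
    counts = defaultdict(Counter)
    for feats, word in samples:
        for f in feats & features:
            counts[word][f] += 1
    return counts


def predict(counts, feats, features, candidates):
    """Pick the candidate whose training contexts share the most selected features.

    Ties (including the empty feature set) fall back to the first candidate.
    """
    active = feats & features
    return max(candidates, key=lambda w: sum(counts[w][f] for f in active))


def accuracy(train, dev, features, candidates):
    counts = train_counts(train, features)
    hits = sum(predict(counts, feats, features, candidates) == word
               for feats, word in dev)
    return hits / len(dev)


def forward_select(train, dev, all_features, candidates):
    """Greedy forward selection: repeatedly add the single feature that
    improves held-out accuracy the most, and stop when none helps."""
    selected = set()
    best = accuracy(train, dev, selected, candidates)
    improved = True
    while improved:
        improved = False
        best_f = None
        for f in sorted(all_features - selected):
            score = accuracy(train, dev, selected | {f}, candidates)
            if score > best:
                best, best_f, improved = score, f, True
        if improved:
            selected.add(best_f)
    return selected, best


# Hypothetical toy data: (set of context features, correct near-synonym).
candidates = ["think", "ponder"]
train = [
    ({"obj=clause", "noise=a"}, "think"),
    ({"obj=clause", "noise=b"}, "think"),
    ({"obj=question", "noise=a"}, "ponder"),
    ({"obj=question", "noise=b"}, "ponder"),
]
dev = [
    ({"obj=clause", "noise=b"}, "think"),
    ({"obj=question", "noise=a"}, "ponder"),
]
all_features = {"obj=clause", "obj=question", "noise=a", "noise=b"}

selected, score = forward_select(train, dev, all_features, candidates)
print(sorted(selected), score)
```

In this toy setting the uninformative `noise=` features are never selected, which illustrates on a very small scale why pruning a rich feature set can help; it says nothing about the paper's actual results.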

Citation (APA)

Paukkeri, M. S., Väyrynen, J., & Arppe, A. (2012). Exploring extensive linguistic feature sets in near-synonym lexical choice. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7182 LNCS, pp. 1–12). https://doi.org/10.1007/978-3-642-28601-8_1
