Prepositional phrase (PP) attachment can be addressed using frequency counts of dependency triples observed in an unannotated corpus. However, even very large corpora do not contain every triple, and several techniques have been proposed to handle this sparsity. We evaluate two backoff methods for Spanish: one based on WordNet and the other on a distributional (automatically created) thesaurus. The thesaurus is built from the dependency triples found in the same corpus that is used to count unambiguous triples. Both methods are trained on an encyclopaedia. The distributional-thesaurus method achieves higher coverage but lower precision than the WordNet method. © Springer-Verlag Berlin Heidelberg 2005.
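To make the idea of thesaurus backoff concrete, here is a minimal Python sketch. It is not the authors' implementation: the abstract does not give the scoring formula, so the comparison rule below (raw counts, with the counts of similar nouns summed when the exact triple is unseen) is an assumption. The triple counts, the `THESAURUS` mapping, and the function names `backoff_count` and `attach` are all hypothetical and exist only for illustration. The thesaurus entries could come from either WordNet or a distributional thesaurus.

```python
from collections import Counter

# Toy counts of unambiguous (head, preposition, noun) triples.
# These numbers are invented for illustration, not taken from the paper.
TRIPLE_COUNTS = Counter({
    ("comer", "con", "tenedor"): 5,   # "eat with fork"
    ("pasta", "con", "salsa"): 4,     # "pasta with sauce"
})

# Hypothetical thesaurus: each word maps to a list of similar words.
THESAURUS = {
    "cuchara": ["tenedor"],  # spoon ~ fork
    "crema": ["salsa"],      # cream ~ sauce
}


def backoff_count(head, prep, noun, counts, thesaurus):
    """Return the count of the exact triple, or, if it is unseen,
    the summed counts of triples where `noun` is replaced by a similar word."""
    direct = counts[(head, prep, noun)]
    if direct > 0:
        return direct
    return sum(counts[(head, prep, similar)] for similar in thesaurus.get(noun, []))


def attach(verb, noun1, prep, noun2, counts=TRIPLE_COUNTS, thesaurus=THESAURUS):
    """Decide whether the PP (prep, noun2) attaches to the verb or to noun1.
    Returns "verb", "noun", or None when the evidence is missing or tied."""
    verb_score = backoff_count(verb, prep, noun2, counts, thesaurus)
    noun_score = backoff_count(noun1, prep, noun2, counts, thesaurus)
    if verb_score == noun_score:
        return None  # no decision: this case lowers coverage
    return "verb" if verb_score > noun_score else "noun"
```

For example, `attach("comer", "pasta", "con", "cuchara")` finds no exact triple but backs off through the similar word "tenedor" and chooses verb attachment, while a noun with no thesaurus entry and no counts yields no decision. A thesaurus with more entries makes a decision in more cases, and noisier entries can make more of those decisions wrong, which is the coverage and precision trade-off the abstract reports.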
CITATION
Calvo, H., Gelbukh, A., & Kilgarriff, A. (2005). Distributional thesaurus versus WordNet: A comparison of backoff techniques for unsupervised PP attachment. In Lecture Notes in Computer Science (Vol. 3406, pp. 177–188). Springer Verlag. https://doi.org/10.1007/978-3-540-30586-6_17