Deriving an ambiguous word’s part-of-speech distribution from unannotated text

Reinhard Rapp

Conference Proceedings

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2007) 53-56

DOI: 10.3115/1557769.1557787

Abstract

A distributional method for part-of-speech induction is presented which, in contrast to most previous work, determines the part-of-speech distribution of syntactically ambiguous words without explicitly tagging the underlying text corpus. This is achieved by assuming that the word pair consisting of the left and right neighbor of a particular token is characteristic of the part of speech at this position, and by clustering the neighbor pairs on the basis of their middle words as observed in a large corpus. The results obtained in this way are evaluated by comparing them to the part-of-speech distributions as found in the manually tagged Brown corpus.
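
The core idea in the abstract can be illustrated with a minimal sketch: collect, for every (left neighbor, right neighbor) pair, the middle words seen between them, group pairs with similar middle-word profiles, and read off how often an ambiguous word falls into each group. The abstract does not say which clustering algorithm or similarity measure the paper uses, so the cosine measure, the single-link grouping, the `threshold` value, the function names, and the toy corpus below are all assumptions made for illustration only.

```python
from collections import Counter, defaultdict
from math import sqrt


def context_vectors(tokens):
    """Map each (left, right) neighbor pair to a Counter of its middle words."""
    vectors = defaultdict(Counter)
    for left, middle, right in zip(tokens, tokens[1:], tokens[2:]):
        vectors[(left, right)][middle] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_pairs(vectors, threshold=0.5):
    """Assumed clustering step: single-link grouping of neighbor pairs whose
    middle-word vectors have cosine similarity >= threshold (union-find)."""
    pairs = sorted(vectors)
    parent = {p: p for p in pairs}

    def find(p):
        while parent[p] != p:
            parent[p] = parent[parent[p]]  # path halving
            p = parent[p]
        return p

    for i, p in enumerate(pairs):
        for q in pairs[i + 1:]:
            if cosine(vectors[p], vectors[q]) >= threshold:
                parent[find(q)] = find(p)
    return {p: find(p) for p in pairs}


def cluster_distribution(word, vectors, clusters):
    """Relative frequency with which `word` occurs as the middle word of each
    cluster of neighbor pairs; no token in the corpus is ever tagged."""
    counts = Counter()
    for pair, middles in vectors.items():
        if middles[word]:
            counts[clusters[pair]] += middles[word]
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()} if total else {}


# Hypothetical toy corpus: "walk" appears in a noun-like and a verb-like context.
tokens = ("the dog was here . the cat was here . the walk was here . "
          "they eat to live . they sleep to live . they walk to live .").split()
vectors = context_vectors(tokens)
clusters = cluster_pairs(vectors)
walk_dist = cluster_distribution("walk", vectors, clusters)
```

In this toy corpus the pair ("the", "was") is dominated by unambiguous nouns and ("they", "to") by unambiguous verbs, so the two pairs stay in separate clusters and "walk" is split between them. The clusters are unlabeled; mapping them to part-of-speech names, or comparing the resulting distributions against a tagged corpus such as Brown, would be a separate step.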

Citation (APA)

Rapp, R. (2007). Deriving an ambiguous word’s part-of-speech distribution from unannotated text. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 53–56). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1557769.1557787
