Abstract
This paper presents a distributional method for part-of-speech induction that, in contrast to most previous work, determines the part-of-speech distribution of syntactically ambiguous words without explicitly tagging the underlying text corpus. It does so by assuming that the word pair formed by the left and right neighbors of a token is characteristic of the part of speech at that position, and by clustering these neighbor pairs according to the middle words observed between them in a large corpus. The resulting distributions are evaluated against the part-of-speech distributions found in the manually tagged Brown corpus.
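The core idea can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not name a clustering algorithm or similarity measure, so the greedy threshold clustering, the cosine similarity, the threshold value, and all function names below are assumptions chosen only for illustration.

```python
from collections import Counter, defaultdict
from math import sqrt


def neighbor_pair_profiles(tokens):
    """Map each (left, right) neighbor pair to counts of the middle words seen between them."""
    profiles = defaultdict(Counter)
    for left, middle, right in zip(tokens, tokens[1:], tokens[2:]):
        profiles[(left, right)][middle] += 1
    return profiles


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def cluster_pairs(profiles, threshold=0.5):
    """Greedy clustering (an assumption, not the paper's algorithm): a pair joins the
    first cluster whose representative is similar enough, else it starts a new cluster."""
    representatives = []  # first member of each cluster, indexed by cluster id
    assignment = {}
    for pair in sorted(profiles):
        for cluster_id, rep in enumerate(representatives):
            if cosine(profiles[pair], profiles[rep]) >= threshold:
                assignment[pair] = cluster_id
                break
        else:
            assignment[pair] = len(representatives)
            representatives.append(pair)
    return assignment


def word_distribution(word, profiles, assignment):
    """Relative frequency with which `word` occurs inside neighbor pairs of each cluster."""
    counts = Counter()
    for pair, middles in profiles.items():
        if word in middles:
            counts[assignment[pair]] += middles[word]
    total = sum(counts.values())
    return {cluster: n / total for cluster, n in counts.items()} if total else {}


# Toy usage on a tiny invented token sequence (not data from the paper).
tokens = "the dog barks the cat barks the dog sleeps".split()
profiles = neighbor_pair_profiles(tokens)
assignment = cluster_pairs(profiles)
dog_distribution = word_distribution("dog", profiles, assignment)
```

Here each cluster of neighbor pairs stands in for an induced part of speech, and a word's distribution over clusters approximates its part-of-speech distribution without any individual token being tagged. On a realistic corpus, an ambiguous word would spread its occurrences over several clusters.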
Cite
Rapp, R. (2007). Deriving an ambiguous word’s part-of-speech distribution from unannotated text. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 53–56). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1557769.1557787