This paper investigates the automatic identification of aspects of Information Structure (IS) in texts. The experiments use the Prague Dependency Treebank, which is annotated with IS following the Praguian approach of Topic-Focus Articulation. We automatically detect t(opic) and f(ocus), using node attributes from the treebank as basic features, together with derived features inspired by the annotation guidelines. We report the performance of C4.5, Bagging, and Ripper classifiers on several classes of instances: nouns and pronouns together, nouns only, and pronouns only. A baseline system that always assigns f(ocus) has an F-score of 42.5%. Our best system obtains 82.04%. © 2005 Association for Computational Linguistics.
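To make the baseline concrete, the following is a minimal sketch of an "always f(ocus)" labeler and a per-label F-score computation. The function names and the toy gold labels are hypothetical and are not taken from the Prague Dependency Treebank. The abstract does not say how the reported F-score is averaged over the two labels, so the per-label F1 used here is an assumption, and the sketch is not expected to reproduce the 42.5% figure.

```python
def always_focus_baseline(instances):
    """Assign the label 'f' (focus) to every instance, ignoring its features."""
    return ["f" for _ in instances]


def f_score(gold, predicted, label):
    """Per-label F1: harmonic mean of precision and recall for `label`."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == label and p == label)
    fp = sum(1 for g, p in pairs if g != label and p == label)
    fn = sum(1 for g, p in pairs if g == label and p != label)
    if tp == 0:
        # No correct predictions for this label, so F1 is defined as 0 here.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold annotations: 't' = topic, 'f' = focus.
gold = ["t", "f", "f", "t", "f"]
pred = always_focus_baseline(gold)

focus_f = f_score(gold, pred, "f")
topic_f = f_score(gold, pred, "t")
print(f"focus F1 = {focus_f:.2f}, topic F1 = {topic_f:.2f}")
```

Because the baseline never predicts t(opic), its score on that label is zero. Any classifier trained on the treebank features has to recover topic instances to improve on it.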
CITATION STYLE
Postolache, O. (2005). Learning information structure in the Prague Treebank. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 115–120). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1628960.1628982