We report on an investigation of the pragmatic category of topic in Danish dialog and its correlation to surface features of NPs. Using a corpus of 444 utterances, we trained a decision tree system on 16 features. The system achieved near-human performance with success rates of 84-89% and F1-scores of 0.63-0.72 in 10-fold cross-validation tests (human performance: 89% and 0.78). The most important features turned out to be preverbal position, definiteness, pronominalisation, and non-subordination. We discovered that NPs in epistemic matrix clauses (e.g. "I think ...") were seldom topics and we suspect that this holds for other interpersonal matrix clauses as well. © 2005 Association for Computational Linguistics.
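As a reading aid for the evaluation figures above, here is a minimal sketch of how a success rate (accuracy) and an F1-score are computed from confusion counts, and how items can be partitioned for 10-fold cross-validation. The function names and the example counts are hypothetical and are not taken from the paper. Splitting at the level of the 444 utterances is likewise an assumption made only for illustration.

```python
def accuracy_and_f1(tp, fp, fn, tn):
    """Return (accuracy, F1) for a binary topic / non-topic decision.

    tp, fp, fn, tn are confusion counts; the values used below are
    hypothetical and do not come from the paper.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return accuracy, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, f1


def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    # The first (n_items % k) folds get one extra item.
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_items))
        yield train, test
        start += size


if __name__ == "__main__":
    # Hypothetical counts, chosen only to exercise the function.
    acc, f1 = accuracy_and_f1(tp=30, fp=10, fn=10, tn=50)
    print(f"accuracy={acc:.2f} F1={f1:.2f}")

    # Assumption: folds are drawn over 444 items (the corpus size in utterances).
    folds = list(k_fold_indices(444, k=10))
    print([len(test) for _, test in folds])
```

Accuracy counts true negatives while F1 does not, so when non-topic NPs outnumber topic NPs the success rate can sit well above the F1-score, which is one plausible reading of the gap between the two figures reported above.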
CITATION STYLE
Diderichsen, P., & Elming, J. (2005). A corpus-based approach to topic in Danish dialog. In ACL-05 - 43rd Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference (pp. 109–114). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1628960.1628981