Identifying events in unstructured text is difficult, largely because a single event can be expressed across several sentences. In this paper, we investigate the use of clustering methods for grouping the text spans in a news article that refer to the same event. The key idea is to cluster the sentences using a novel distance metric that exploits regularities in the sequential structure of events within a document. Compared with a simple bag-of-words baseline, this approach yields a statistically significant increase in performance.
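The general idea of clustering sentences with a structure-aware distance can be illustrated with a minimal sketch. The abstract does not specify the metric, so everything here is an assumption made for illustration: the function names, the `alpha` weight, the `threshold` value, and in particular the positional term, which simply penalizes sentences that are far apart in the article as a crude stand-in for "sequential structure". It should not be read as the paper's method.

```python
import math
from collections import Counter


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 if norm == 0 else 1.0 - dot / norm


def blended_distance(i, j, bags, alpha=0.5):
    """Hypothetical metric: mix lexical distance with normalized positional gap.

    alpha=1.0 reduces to a plain bag-of-words distance (the baseline idea);
    smaller alpha gives more weight to how far apart the sentences are.
    """
    positional = abs(i - j) / max(len(bags) - 1, 1)
    return alpha * cosine_distance(bags[i], bags[j]) + (1 - alpha) * positional


def cluster_sentences(sentences, threshold=0.5, alpha=0.5):
    """Average-link agglomerative clustering over sentence indices.

    Merging stops once the closest pair of clusters is farther apart than
    `threshold` (an assumed stopping rule, chosen for simplicity).
    """
    bags = [Counter(s.lower().split()) for s in sentences]
    clusters = [[i] for i in range(len(sentences))]

    def linkage(c1, c2):
        total = sum(blended_distance(i, j, bags, alpha) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        dist, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if dist > threshold:
            break
        clusters[a] = sorted(clusters[a] + clusters[b])
        del clusters[b]
    return clusters


if __name__ == "__main__":
    article = [
        "bomb exploded in market",
        "the bomb exploded near market",
        "minister resigned on tuesday",
        "the minister resigned after scandal",
    ]
    print(cluster_sentences(article))  # [[0, 1], [2, 3]]
```

Setting `alpha=1.0` drops the positional term, which makes it easy to compare this toy structure-aware variant against a pure bag-of-words distance on the same input.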
Naughton, M. (2007). Exploiting structure for event discovery using the MDI algorithm. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (Vol. 2007-June, pp. 31–36). Association for Computational Linguistics (ACL). https://doi.org/10.3115/1557835.1557842