A methodology to enhance spatial understanding of disease outbreak events reported in news articles

  • Chanlekha H
  • Collier N

Abstract

Purpose: The emergence and re-emergence of disease outbreaks of international concern in the last several years have raised the importance of health surveillance systems that exploit the open media for timely and precise detection of events. However, one of the key barriers faced by current event-based health surveillance systems is identifying fine-grained terms for an outbreak's geographical location. In this article, we present a method to tackle this problem by associating each reported event with the most specific spatial information available in a news report. This would be useful not only for health surveillance systems, but also for other event-centered processing systems.
Methods: To develop an automated spatial attribute annotation system, we first created a gold standard corpus for training a machine learning model. Since qualitative analysis of the data suggested that the event class might affect spatial attribute annotation, we also developed an event classification system to incorporate event class information into the spatial attribute annotation model. To automatically recognize the spatial attribute of events, several approaches, ranging from a simple heuristic technique to a more sophisticated approach based on a state-of-the-art Conditional Random Fields (CRF) model, were explored. Different feature sets were incorporated into the model and compared.
Results: The evaluations were conducted on 100 outbreak news articles. Spatial attribute recognition performance was evaluated using three metrics: precision, recall, and the harmonic mean of precision and recall (F-score). Among the three strategies proposed in this article, the CRF model appeared to be the most promising for spatial attribute recognition, with a best performance of 85.5% F-score (86.3% precision and 84.7% recall).
Conclusion: We presented a methodology for associating each event in media outbreak reports with its spatial attribute at the finest level of granularity. Our goal has been to provide a means for enhancing the spatial understanding of outbreak-related events. Evaluation studies showed promising results for automatic spatial attribute annotation. In the future, we plan to explore more features, such as semantic correlation between words, that may be useful for the spatial attribute annotation task. © 2010 Elsevier Ireland Ltd. All rights reserved.
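
The F-score reported in the Results is the harmonic mean of precision and recall. As a minimal sketch of that arithmetic (the function name `f_score` is chosen here for illustration and is not taken from the article), the reported precision and recall can be combined as follows:

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (balanced F-score).

    Both inputs must be on the same scale (here, percentages).
    Returns 0.0 when both are zero to avoid division by zero.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Best CRF figures reported in the abstract, in percent.
precision = 86.3
recall = 84.7

print(f"{f_score(precision, recall):.1f}")  # prints 85.5
```

The result agrees with the 85.5% F-score stated in the abstract, so the three reported figures are mutually consistent.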

Author-supplied keywords

  • Geographical information
  • Information system
  • Natural language processing
  • Public health surveillance

Authors

  • Hutchatai Chanlekha

  • Nigel Collier
