Twitter, used in 200 countries with over 250 million tweets a day, is a rich source of local news from around the world. Many events of local importance are first reported on Twitter, including many that never reach news channels. Further, there are often only a few tweets reporting each such event, in contrast with the larger volumes that follow events of wider significance. Even though such events may be primarily of local importance, they can also be of critical interest to specific but possibly far-flung entities: for example, a fire in a supplier's factory halfway around the world may matter to a distant customer. In this paper we describe how this 'long tail' of events can be detected in spite of their sparsity. We then extract and correlate information from multiple tweets describing the same event. Our generic architecture for converting a tweet stream into event objects uses locality-sensitive hashing (LSH), classification, boosting, information extraction and clustering. Our results, based on millions of tweets monitored over many months, appear to validate our approach and architecture: we achieved success rates in the 80% range for event detection and 76% on event correlation; we also reduced tweet comparisons by 80% using LSH.

Copyright © 2012, Association for the Advancement of Artificial Intelligence (www.aaai.org). All rights reserved.
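The abstract credits LSH with cutting tweet comparisons but does not say which hash family or parameters were used. As a rough illustration only, the sketch below assumes MinHash signatures with banding, a common LSH scheme for near-duplicate text; the function names, the whitespace tokenizer, and the values of `num_hashes` and `rows_per_band` are all hypothetical and not taken from the paper.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def tokens(text):
    """Lowercased set of whitespace-separated words (a deliberately crude tokenizer)."""
    return set(text.lower().split())


def minhash_signature(token_set, num_hashes=16):
    """MinHash signature: for each seed, the minimum 64-bit hash over all tokens."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{tok}".encode("utf-8"), digest_size=8).digest(),
                "big",
            )
            for tok in token_set
        ))
    return tuple(signature)


def candidate_pairs(texts, num_hashes=16, rows_per_band=4):
    """Return index pairs (i, j), i < j, whose signatures agree on at least one band.

    Only these pairs need a full comparison; all other pairs are skipped.
    """
    buckets = defaultdict(list)
    for idx, text in enumerate(texts):
        toks = tokens(text)
        if not toks:
            continue  # an empty tweet has nothing to hash
        sig = minhash_signature(toks, num_hashes)
        for start in range(0, num_hashes, rows_per_band):
            # The band offset is part of the key so different bands never share a bucket.
            buckets[(start, sig[start:start + rows_per_band])].append(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(members, 2))
    return pairs


# Hypothetical example tweets (assumptions, not data from the paper).
tweets = [
    "Fire breaks out at textile factory in Dhaka",
    "fire breaks out at textile factory in dhaka",
    "Heavy rain floods streets across Mumbai tonight",
]
pairs = candidate_pairs(tweets)
```

Here the first two tweets share the same token set, so they fall into the same buckets and become a candidate pair, while the third shares no tokens with them and is never compared. Fewer rows per band makes the filter more permissive; more rows per band makes it stricter.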
CITATION STYLE
Agarwal, P., Vaithiyanathan, R., Sharma, S., & Shroff, G. (2012). Catching the long-tail: Extracting local news events from Twitter. In ICWSM 2012 - Proceedings of the 6th International AAAI Conference on Weblogs and Social Media (pp. 379–382). https://doi.org/10.1609/icwsm.v6i1.14317