The Web is a great resource and archive of news articles for the world. We present a framework, based on probabilistic topic modeling, for uncovering the meaningful structure and trends of important topics and issues hidden within the news archives on the Web. Central in the framework is a topic chain, a temporal organization of similar topics. We experimented with various topic similarity metrics and present our insights on how best to construct topic chains. We discuss how to interpret the topic chains to understand the news corpus by looking at long-term topics, temporary issues, and shifts of focus in the topic chains. We applied our framework to nine months of Korean Web news corpus and present our findings. © 2011 Springer-Verlag.
CITATION STYLE
Kim, D., & Oh, A. (2011). Topic chains for understanding a news corpus. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 6609 LNCS, pp. 163–176). https://doi.org/10.1007/978-3-642-19437-5_13
Mendeley helps you to discover research relevant for your work.