Abstract
We present KWB, an automated quick news system. KWB crawls news items around the clock from over 120 news websites in mainland China, eliminates duplicates, and obtains a summary of up to 600 characters for each news article using a proprietary summary engine. It then uses a Labeled-LDA classifier to sort the remaining news items into 19 categories, computes a popularity rank, called PopuRank, for the newly collected news items in each category, and displays the summaries in each category sorted by PopuRank, together with a picture when one is available, at http://www.kuaiwenbao.com and in mobile apps. This paper describes the system architecture of KWB, the structure of the data crawler, the functions of the central database, and the definition of PopuRank. We report experiments on the running time of computing PopuRank and demonstrate the use of KWB.
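The processing order described in the abstract (deduplicate, summarize, group by category, sort by rank) can be illustrated with a minimal sketch. This is not the KWB implementation: `NewsItem`, `fingerprint`, `summarize`, and `build_category_pages` are hypothetical names, the hash-based exact-duplicate check and the truncation-based summary are stand-ins for components the abstract does not specify, the `category` field is assumed to come from an upstream classifier, and the `score` field is only a placeholder for PopuRank, whose definition is given in the paper.

```python
import hashlib
from collections import defaultdict
from dataclasses import dataclass

MAX_SUMMARY_CHARS = 600  # summary length cap stated in the abstract


@dataclass
class NewsItem:
    url: str
    text: str
    category: str  # assumed to be assigned upstream (Labeled-LDA in KWB)
    score: float   # placeholder for PopuRank; not the paper's definition


def fingerprint(text: str) -> str:
    """Hash of the text with all whitespace removed (exact duplicates only)."""
    normalized = "".join(text.split())
    return hashlib.sha1(normalized.encode("utf-8")).hexdigest()


def deduplicate(items):
    """Keep the first item seen for each fingerprint, preserving input order."""
    seen = set()
    unique = []
    for item in items:
        key = fingerprint(item.text)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


def summarize(text: str) -> str:
    """Stand-in summary: plain truncation, not the proprietary engine."""
    return text[:MAX_SUMMARY_CHARS]


def build_category_pages(items):
    """Group unique items by category, each list sorted by descending score."""
    pages = defaultdict(list)
    for item in deduplicate(items):
        pages[item.category].append((item.score, item.url, summarize(item.text)))
    return {
        category: sorted(rows, key=lambda row: row[0], reverse=True)
        for category, rows in pages.items()
    }
```

Each value in the returned dictionary is a list of `(score, url, summary)` tuples, which corresponds to one category page ordered by the placeholder score.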
Cite
Bai, Y., Yang, W., Zhang, H., Wang, J., Jia, M., Tong, R., & Wang, J. (2015). KWB: An automated quick news system for Chinese readers. In Proceedings of the 8th SIGHAN Workshop on Chinese Language Processing, SIGHAN 2015 - co-located with 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing, ACL IJCNLP 2015 (pp. 110–119). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/w15-3118