This paper presents SoCRFSum, a summarization model that integrates user-generated content, such as comments, and third-party sources, such as relevant articles, for a Web document in order to generate a high-quality summary. Summarization was formulated as a sequence labeling problem that exploits this external information to model both sentences and comments. After modeling, Conditional Random Fields were adopted for sentence selection. SoCRFSum was validated on a dataset collected from Yahoo News. Promising results indicate that, by integrating user-generated and third-party information, our method improves ROUGE scores over state-of-the-art baselines.
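To make the sequence labeling formulation concrete, the following is a minimal sketch of how a document's sentences might be turned into a feature sequence that a linear-chain CRF could label as summary or non-summary. The feature names (`position`, `length`, `max_comment_sim`, `max_related_sim`) and the bag-of-words cosine similarity are illustrative assumptions, not the feature set used by SoCRFSum.

```python
import re
from collections import Counter
from math import sqrt


def tokens(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Bag-of-words cosine similarity between two strings (0.0 if either is empty)."""
    ca, cb = Counter(tokens(a)), Counter(tokens(b))
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def sentence_features(sentences, comments, related):
    """Build one feature dict per sentence, in document order.

    Hypothetical features: each sentence is described by local cues plus
    its best match against user comments and third-party (related) text.
    """
    last = max(len(sentences) - 1, 1)
    return [
        {
            "position": i / last,
            "length": len(tokens(s)),
            "max_comment_sim": max((cosine(s, c) for c in comments), default=0.0),
            "max_related_sim": max((cosine(s, r) for r in related), default=0.0),
        }
        for i, s in enumerate(sentences)
    ]


# Toy input, invented for illustration only.
sentences = ["The storm hit the coast on Monday.", "Officials opened shelters."]
comments = ["The storm hit the coast hard"]
related = ["Shelters were opened by local officials after the storm."]

for feats in sentence_features(sentences, comments, related):
    print(feats)
```

The resulting list of feature dicts, paired with one binary label per sentence, is the kind of input a CRF toolkit typically expects. The same construction could be applied to the comment sequence, with document sentences acting as the external signal.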
CITATION STYLE
Nguyen, M. T., Tran, D. V., Tran, C. X., & Nguyen, M. L. (2017). Summarizing web documents using sequence labeling with user-generated content and third-party sources. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 10260 LNCS, pp. 454–467). Springer Verlag. https://doi.org/10.1007/978-3-319-59569-6_54