Content models for survey generation: A factoid-based evaluation


Abstract

We present a new factoid-annotated dataset for evaluating content models for scientific survey article generation, containing 3,425 sentences from 7 topics in natural language processing. We also introduce a novel HITS-based content model for automated survey article generation called HITSUM, which exploits the lexical network structure between sentences from citing and cited papers. Using the factoid-annotated data, we conduct a pyramid evaluation and compare HITSUM with two previous state-of-the-art content models: C-LexRank, a network-based content model, and TopicSum, a Bayesian content model. Our experiments show that our new content model captures useful survey-worthy information and outperforms C-LexRank by 4% and TopicSum by 7% in pyramid evaluation.
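The HITS algorithm at the core of HITSUM assigns each node in a directed graph a hub score (how well it points to good sources) and an authority score (how well it is pointed to), computed by mutually reinforcing power iteration. A minimal sketch of that scoring, on a toy directed graph standing in for links from citing sentences to cited sentences (the graph, sentences, and sizes here are illustrative assumptions, not the paper's actual data or method):

```python
# Hedged sketch: HITS-style hub/authority scoring by power iteration.
# adj[i][j] == 1 means node i links to node j.

def hits(adj, iterations=50):
    """Return (hub_scores, authority_scores) after power iteration."""
    n = len(adj)
    hubs = [1.0] * n
    auths = [1.0] * n
    for _ in range(iterations):
        # Authority score: sum of hub scores of nodes linking in.
        auths = [sum(hubs[i] * adj[i][j] for i in range(n)) for j in range(n)]
        # Hub score: sum of authority scores of nodes linked to.
        hubs = [sum(adj[i][j] * auths[j] for j in range(n)) for i in range(n)]
        # L1-normalize to keep the values bounded across iterations.
        sa, sh = sum(auths) or 1.0, sum(hubs) or 1.0
        auths = [a / sa for a in auths]
        hubs = [h / sh for h in hubs]
    return hubs, auths

# Toy graph: "citing" nodes 0 and 1 link to "cited" nodes 2 and 3.
adj = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
hubs, auths = hits(adj)
```

In this toy graph, node 2 receives two in-links and so ends up with the highest authority score, while node 0, which links to both cited nodes, gets the highest hub score.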

Citation (APA)

Jha, R., Finegan-Dollak, C., Coke, R., King, B., & Radev, D. (2015). Content models for survey generation: A factoid-based evaluation. In ACL-IJCNLP 2015 - 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing of the Asian Federation of Natural Language Processing, Proceedings of the Conference (Vol. 1, pp. 441–450). Association for Computational Linguistics (ACL). https://doi.org/10.3115/v1/p15-1043
