While research in natural language processing has progressed significantly in creative language generation, the question of whether language models can interpret the intended meaning of creative language remains largely unanswered. Poetry has existed as a creative art form for generations, and summarizing it requires deciphering its figurative patterns to uncover the poet's actual intent and message. This task therefore gives researchers an opportunity to evaluate how well language models interpret creative language. Summarizing poems is more challenging than summarizing typical text because poems carry a deeper meaning that is easily lost if only the literal meaning is considered. To this end, we propose a new task in the field of natural language understanding called 'Poem Summarization'. As a starting point, we propose the first-ever dataset for this task, named 'PoemSum', consisting of 3011 poems, each paired with a summarized interpretation in English. We have benchmarked the performance of different state-of-the-art summarization models and provided observations on their limitations. The dataset and all relevant code used in this work have been made publicly available.
CITATION STYLE
Mahbub, R., Khan, I. T., Anuva, S. S., Shahriar, S., Laskar, T. R., & Ahmed, S. (2023). Unveiling the Essence of Poetry: Introducing a Comprehensive Dataset and Benchmark for Poem Summarization. In EMNLP 2023 - 2023 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 14878–14886). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.emnlp-main.920