BPoMP: The Benchmark of Poetic Minimal Pairs - Limericks, Rhyme, and Narrative Coherence

Abstract

We adapt the BLiMP (Benchmark of Linguistic Minimal Pairs) language model evaluation framework to the context of poetry, introducing the first of a series of tasks titled Benchmark of Poetic Minimal Pairs (BPoMP). The tasks presented herein use one genre of English-language poetry, the limerick (five lines, rhyme scheme AABBA). Following the BLiMP schema, the BPoMP tasks use 10,000 minimal pairs, each consisting of a limerick and a corrupted version of it. The corrupted version is created by (1) shuffling two rhyming end-of-the-line words, (2) shuffling two rhyming lines, or (3) replacing an end-of-the-line word with a non-rhyming synonym. Our general task is detection of the original limerick, which we believe tests a language model's capacity to utilize “end rhymes”, a common feature of poetry. We evaluate Transformer-based models by checking whether they assign a higher probability to the non-corrupted limerick in each minimal pair. We find that the models identify the original limerick at rates better than chance, but with a nontrivial gap relative to human accuracy (average of 98.3% across tasks). The publicly available curated set of limericks accompanying this paper is an additional contribution. In general, we see this as a first step toward creating a community of NLP activity around the rigorous computational study of poetry.
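The minimal-pair protocol described in the abstract can be illustrated with a short Python sketch. This is not the authors' code: the function names (`swap_end_words`, `swap_lines`, `pair_accuracy`), the toy five-line text, and the `score` callable are all hypothetical, and the sketch assumes lines are whitespace-separated with no trailing punctuation. In practice `score` would be a language model's log-probability of the full text; corruption (3), synonym replacement, needs a thesaurus and is omitted.

```python
from typing import Callable, List, Tuple


def swap_end_words(lines: List[str], i: int, j: int) -> List[str]:
    """Corruption (1): exchange the final words of lines i and j.

    Assumes whitespace-separated words and no trailing punctuation.
    """
    out = list(lines)
    head_i, _, last_i = out[i].rpartition(" ")
    head_j, _, last_j = out[j].rpartition(" ")
    out[i] = f"{head_i} {last_j}".strip()
    out[j] = f"{head_j} {last_i}".strip()
    return out


def swap_lines(lines: List[str], i: int, j: int) -> List[str]:
    """Corruption (2): exchange whole lines i and j."""
    out = list(lines)
    out[i], out[j] = out[j], out[i]
    return out


def pair_accuracy(
    pairs: List[Tuple[str, str]],
    score: Callable[[str], float],
) -> float:
    """Fraction of (original, corrupted) pairs where the original scores higher.

    `score` is a stand-in for a model's log-probability of the text.
    """
    hits = sum(score(original) > score(corrupted) for original, corrupted in pairs)
    return hits / len(pairs)


# Toy AABBA text made of placeholder tokens, not a real limerick.
toy = ["a b day", "c d play", "e f sun", "g h run", "i j stay"]
corrupted_words = swap_end_words(toy, 2, 3)  # swap the two B-rhyme end words
corrupted_lines = swap_lines(toy, 2, 3)      # swap the two B-rhyme lines

original_text = "\n".join(toy)
pairs = [
    (original_text, "\n".join(corrupted_words)),
    (original_text, "\n".join(corrupted_lines)),
]


def toy_score(text: str) -> float:
    """Stand-in scorer that prefers the original text; a real run would use a model."""
    return 1.0 if text == original_text else 0.0


accuracy = pair_accuracy(pairs, toy_score)
```

With a real model, chance performance under this comparison is 50%, which is the baseline the reported better-than-chance results are measured against.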

Cite

Abdibayev, A., Riddell, A., & Rockmore, D. (2021). BPoMP: The Benchmark of Poetic Minimal Pairs - Limericks, Rhyme, and Narrative Coherence. In International Conference Recent Advances in Natural Language Processing, RANLP (pp. 1–9). Incoma Ltd. https://doi.org/10.26615/978-954-452-072-4_001
