Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language

Avia Efrat; Uri Shaham; Dan Kilman; Omer Levy

Conference Proceedings (Open Access)

EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (2021) 4186-4192

DOI: 10.18653/v1/2021.emnlp-main.344

Abstract

Current NLP datasets targeting ambiguity can be solved by a native speaker with relative ease. We present Cryptonite, a large-scale dataset based on cryptic crosswords, which is both linguistically complex and naturally sourced. Each example in Cryptonite is a cryptic clue, a short phrase or sentence with a misleading surface reading, whose solving requires disambiguating semantic, syntactic, and phonetic wordplays, as well as world knowledge. Cryptic clues pose a challenge even for experienced solvers, though top-tier experts can solve them with almost 100% accuracy. Cryptonite is a challenging task for current models; fine-tuning T5-Large on 470k cryptic clues achieves only 7.6% accuracy, on par with the accuracy of a rule-based clue solver (8.6%).

Cite

Efrat, A., Shaham, U., Kilman, D., & Levy, O. (2021). Cryptonite: A Cryptic Crossword Benchmark for Extreme Ambiguity in Language. In EMNLP 2021 - 2021 Conference on Empirical Methods in Natural Language Processing, Proceedings (pp. 4186–4192). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2021.emnlp-main.344
