Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts. To facilitate the evaluation of the compositional abilities of language processing architectures, we introduce COGS, a semantic parsing dataset based on a fragment of English. The evaluation portion of COGS contains multiple systematic gaps that can only be addressed by compositional generalization; these include new combinations of familiar syntactic structures, or new combinations of familiar words and familiar structures. In experiments with Transformers and LSTMs, we found that in-distribution accuracy on the COGS test set was near-perfect (96-99%), but generalization accuracy was substantially lower (16-35%) and showed high sensitivity to random seed (±6-8%). These findings indicate that contemporary standard NLP models are limited in their compositional generalization capacity, and position COGS as a useful benchmark for measuring progress on compositional generalization.
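The abstract reports accuracy separately for the in-distribution test set and the generalization set, which implies an exact-match comparison between predicted and gold logical forms on each split. The sketch below illustrates how such scoring could be done; it is not the authors' code, and it assumes (beyond what the abstract states) a tab-separated file with columns <sentence, logical form, generalization type> and a placeholder predict() callable standing in for a trained Transformer or LSTM sequence-to-sequence model.

# Minimal sketch of exact-match evaluation on a COGS-style generalization file.
# Assumptions (not from the abstract): the data is a tab-separated file with
# columns <sentence, logical form, generalization type>, and `predict` is a
# placeholder for whatever seq2seq model (Transformer or LSTM) is being scored.
import csv
from collections import defaultdict

def evaluate(tsv_path, predict):
    """Return overall and per-category exact-match accuracy."""
    correct, total = 0, 0
    per_type = defaultdict(lambda: [0, 0])  # generalization type -> [correct, total]
    with open(tsv_path, encoding="utf-8") as f:
        for sentence, gold_lf, gen_type in csv.reader(f, delimiter="\t"):
            pred_lf = predict(sentence)
            hit = int(pred_lf.strip() == gold_lf.strip())
            correct += hit
            total += 1
            per_type[gen_type][0] += hit
            per_type[gen_type][1] += 1
    overall = correct / total if total else 0.0
    breakdown = {t: c / n for t, (c, n) in per_type.items()}
    return overall, breakdown

# Hypothetical usage: overall, by_type = evaluate("gen.tsv", my_model.predict)

Averaging the overall score over several random seeds, as the abstract's ±6-8% figure suggests, would give the seed-sensitivity estimate reported for the generalization set.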
CITATION
Kim, N., & Linzen, T. (2020). COGS: A compositional generalization challenge based on semantic interpretation. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 9087–9105). Association for Computational Linguistics. https://doi.org/10.18653/v1/2020.emnlp-main.731