COMAVE: Contrastive Pre-training with Multi-scale Masking for Attribute Value Extraction

Abstract

Attribute Value Extraction (AVE) aims to automatically extract attribute-value pairs from product descriptions to support e-commerce applications. Although existing approaches have steadily improved performance on e-commerce platforms, they still face two challenges: 1) they struggle to identify values at different scales simultaneously; 2) they are easily confused by highly similar fine-grained attributes. This paper proposes a pre-training technique for AVE to address these issues. First, we improve the conventional token-level masking strategy, guiding the language model to understand multi-scale values by recovering spans at the phrase and sentence levels. Second, we apply clustering to build a challenging negative set for each example and design a contrastive pre-training objective that forces the model to discriminate between similar attributes. Comprehensive experiments show that our solution significantly improves over traditional pre-trained models on the AVE task and achieves state-of-the-art results on four benchmarks.
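
The two ideas named in the abstract, masking spans of several lengths and contrasting an example against hard negatives, can be illustrated with a small sketch. This is not the authors' implementation. The function names (`mask_span`, `multi_scale_mask`, `info_nce`), the span lengths, the temperature, and the choice of an InfoNCE-style loss are all assumptions made for illustration, and the negatives are passed in directly rather than built by clustering.

```python
import math
import random

MASK = "[MASK]"  # hypothetical mask symbol, chosen for illustration


def mask_span(tokens, start, length):
    """Replace tokens[start:start+length] with MASK; return (masked, target)."""
    end = min(start + length, len(tokens))
    masked = tokens[:start] + [MASK] * (end - start) + tokens[end:]
    return masked, tokens[start:end]


def multi_scale_mask(tokens, scales=(1, 3), seed=0):
    """Produce one masked copy of `tokens` per span length in `scales`.

    Short lengths stand in for token-level masking and longer ones for
    phrase-level masking (assumed values, not taken from the paper).
    """
    rng = random.Random(seed)
    examples = []
    for length in scales:
        length = min(length, len(tokens))
        start = rng.randrange(len(tokens) - length + 1)
        examples.append(mask_span(tokens, start, length))
    return examples


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style loss: negative log-softmax of the positive similarity."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_denominator - logits[0]


if __name__ == "__main__":
    tokens = "soft cotton crew neck t-shirt in navy blue".split()
    for masked, target in multi_scale_mask(tokens):
        print(" ".join(masked), "| target:", target)
```

In this sketch a negative that lies close to the anchor yields a larger loss than a distant one, which is the reason for preferring hard negatives when the goal is to separate similar attributes.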

Cite (APA)

Guo, X., Deng, W., Chen, Y., Li, Y., Zhou, M., Qi, G., … Pan, Y. (2023). COMAVE: Contrastive Pre-training with Multi-scale Masking for Attribute Value Extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 6007–6018). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.373
