Attribute Value Extraction (AVE) aims to automatically obtain attribute-value pairs from product descriptions to aid e-commerce. Despite the strong performance of existing approaches on e-commerce platforms, they still face two challenges: 1) difficulty in identifying values at different scales simultaneously; 2) susceptibility to confusion between highly similar fine-grained attributes. This paper proposes a pre-training technique for AVE to address these issues. In particular, we first improve the conventional token-level masking strategy, guiding the language model to understand multi-scale values by recovering masked spans at the phrase and sentence levels. Second, we apply clustering to build a challenging negative set for each example and design a pre-training objective based on contrastive learning that forces the model to discriminate between similar attributes. Comprehensive experiments show that our solution yields a significant improvement over traditional pre-trained models on the AVE task and achieves state-of-the-art results on four benchmarks.
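The abstract only outlines the two pre-training objectives, so the Python sketch below is an illustration of the general ideas rather than the authors' implementation: a multi-scale span-masking step for the denoising objective, and an InfoNCE-style contrastive loss over cluster-mined hard negatives. All function names, the mask token id, and the span-length parameters are assumptions made for the example.

```python
# Hypothetical sketch of the two pre-training signals described in the abstract.
# Not the COMAVE code; names and hyperparameters are illustrative only.
import random
import torch
import torch.nn.functional as F

MASK_ID = 103  # e.g. BERT's [MASK] token id (assumption)

def mask_multiscale_spans(input_ids, phrase_len=(2, 5), sent_ratio=0.3, p_phrase=0.5):
    """Mask either a short phrase-level span or a longer sentence-level span,
    so the model learns to recover values at different scales."""
    ids = input_ids.clone()
    n = ids.size(0)
    if random.random() < p_phrase:
        span = random.randint(*phrase_len)       # phrase-level span
    else:
        span = max(1, int(n * sent_ratio))       # sentence-level span
    start = random.randint(0, max(0, n - span))
    labels = torch.full_like(ids, -100)          # ignore unmasked positions in the loss
    labels[start:start + span] = ids[start:start + span]
    ids[start:start + span] = MASK_ID
    return ids, labels

def contrastive_loss(anchor, positive, hard_negatives, tau=0.07):
    """InfoNCE-style loss: pull the anchor toward its positive and push it away
    from highly similar attributes mined by clustering (hard negatives)."""
    anchor = F.normalize(anchor, dim=-1)
    cands = F.normalize(torch.cat([positive.unsqueeze(0), hard_negatives]), dim=-1)
    logits = anchor @ cands.T / tau              # similarity to positive + negatives
    target = torch.zeros(1, dtype=torch.long)    # index 0 is the positive
    return F.cross_entropy(logits.unsqueeze(0), target)

# Toy usage with random tensors, just to show the expected shapes.
ids = torch.randint(1000, 30000, (32,))
masked_ids, mlm_labels = mask_multiscale_spans(ids)
loss = contrastive_loss(torch.randn(768), torch.randn(768), torch.randn(8, 768))
```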
Guo, X., Deng, W., Chen, Y., Li, Y., Zhou, M., Qi, G., … Pan, Y. (2023). COMAVE: Contrastive Pre-training with Multi-scale Masking for Attribute Value Extraction. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 6007–6018). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2023.findings-acl.373