Utilizing Cross-Modal Contrastive Learning to Improve Item Categorization BERT Model

Abstract

Item categorization (IC) is a core natural language processing (NLP) task in e-commerce. As a special text classification task, fine-tuning pre-trained models such as BERT has become the mainstream solution. To further improve IC performance, other product metadata, e.g., product images, have been used. Although multimodal IC (MIC) systems achieve higher performance, expanding from processing text to more resource-demanding images incurs large engineering costs and hinders the deployment of such dual-input MIC systems. In this paper, we propose a new way of using product images to improve a text-only IC model: leveraging cross-modal signals between products’ titles and associated images to adapt BERT models in a self-supervised learning (SSL) way. Our experiments on three genres of the public Amazon product dataset show that the proposed method yields higher prediction accuracy and macro-F1 than simply using the original BERT. Moreover, the proposed method can keep using an existing text-only IC inference implementation, and shows a resource advantage over deploying a dual-input MIC system.
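
The paper itself does not ship code with this abstract; as a rough illustration of the described title-image contrastive adaptation, below is a minimal sketch of a CLIP-style InfoNCE objective pairing BERT title embeddings with product image features. All specifics here, including the image feature dimension, the [CLS] pooling choice, the projection layer, and the temperature, are assumptions for illustration, not the authors' implementation.

```python
import torch
import torch.nn.functional as F
from transformers import BertModel, BertTokenizer

# Hypothetical sketch of the cross-modal contrastive (InfoNCE) adaptation
# step described in the abstract; not the authors' released code.

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
text_encoder = BertModel.from_pretrained("bert-base-uncased")

# Assumed: image features come from a frozen vision backbone that outputs
# one 2048-dim vector per product image (e.g., pooled ResNet features).
image_proj = torch.nn.Linear(2048, 768)  # project into BERT's hidden size

def contrastive_loss(titles, image_feats, temperature=0.07):
    """InfoNCE loss that pairs each product title with its own image.

    titles:      list of str, length B
    image_feats: tensor of shape (B, 2048) from the frozen image encoder
    """
    enc = tokenizer(titles, padding=True, truncation=True, return_tensors="pt")
    # Use the [CLS] token embedding as the title representation (an assumption).
    text_emb = text_encoder(**enc).last_hidden_state[:, 0]
    img_emb = image_proj(image_feats)

    text_emb = F.normalize(text_emb, dim=-1)
    img_emb = F.normalize(img_emb, dim=-1)

    # (B, B) similarity matrix; matched title-image pairs lie on the diagonal.
    logits = text_emb @ img_emb.t() / temperature
    labels = torch.arange(logits.size(0))
    # Symmetric loss over the title->image and image->title directions.
    return (F.cross_entropy(logits, labels) +
            F.cross_entropy(logits.t(), labels)) / 2
```

After this SSL adaptation step, the adapted BERT encoder would be fine-tuned on the labeled IC data as an ordinary text classifier, which is what lets inference remain text-only, as the abstract emphasizes.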

Citation (APA)

Chen, L., & Chou, H. W. (2022). Utilizing Cross-Modal Contrastive Learning to Improve Item Categorization BERT Model. In ECNLP 2022 - 5th Workshop on e-Commerce and NLP, Proceedings of the Workshop (pp. 217–223). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2022.ecnlp-1.25
