TXtract: Taxonomy-aware knowledge extraction for thousands of product categories

Abstract

Extracting structured knowledge from product profiles is crucial for various applications in e-Commerce. State-of-the-art approaches for knowledge extraction were each designed for a single category of product, and thus do not apply to real-life e-Commerce scenarios, which often contain thousands of diverse categories. This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. Through category conditional self-attention and multi-task learning, our approach is both scalable, as it trains a single model for thousands of categories, and effective, as it extracts category-specific attribute values. Experiments on products from a taxonomy with 4,000 categories show that TXtract outperforms state-of-the-art approaches by up to 10% in F1 and 15% in coverage across all categories.

Cite

Karamanolakis, G., Ma, J., & Dong, X. L. (2020). TXtract: Taxonomy-aware knowledge extraction for thousands of product categories. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 8489–8502). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.751
