TXtract: Taxonomy-aware knowledge extraction for thousands of product categories

Giannis Karamanolakis; Jun Ma; Xin Luna Dong

Conference Proceedings, Open Access

Proceedings of the Annual Meeting of the Association for Computational Linguistics (2020) 8489-8502

DOI: 10.18653/v1/2020.acl-main.751

32 Citations

153 Readers

Abstract

Extracting structured knowledge from product profiles is crucial for various applications in e-Commerce. State-of-the-art approaches for knowledge extraction were each designed for a single category of product, and thus do not apply to real-life e-Commerce scenarios, which often contain thousands of diverse categories. This paper proposes TXtract, a taxonomy-aware knowledge extraction model that applies to thousands of product categories organized in a hierarchical taxonomy. Through category conditional self-attention and multi-task learning, our approach is both scalable, as it trains a single model for thousands of categories, and effective, as it extracts category-specific attribute values. Experiments on products from a taxonomy with 4,000 categories show that TXtract outperforms state-of-the-art approaches by up to 10% in F1 and 15% in coverage across all categories.

Cite

Karamanolakis, G., Ma, J., & Dong, X. L. (2020). TXtract: Taxonomy-aware knowledge extraction for thousands of product categories. In Proceedings of the Annual Meeting of the Association for Computational Linguistics (pp. 8489–8502). Association for Computational Linguistics (ACL). https://doi.org/10.18653/v1/2020.acl-main.751
