Expanding Taxonomies with Implicit Edge Semantics

Abstract

Curated taxonomies enhance the performance of machine-learning systems via high-quality structured knowledge. However, manually curating a large and rapidly evolving taxonomy is infeasible. In this work, we propose Arborist, an approach to automatically expand textual taxonomies by predicting the parents of new taxonomy nodes. Unlike previous work, Arborist handles the more challenging scenario of taxonomies with heterogeneous edge semantics that are unobserved. Arborist learns latent representations of the edge semantics along with embeddings of the taxonomy nodes to measure taxonomic relatedness between node pairs. Arborist is then trained by optimizing a large-margin ranking loss with a dynamic margin function. We propose a principled formulation of the margin function, which theoretically guarantees that Arborist minimizes an upper bound on the shortest-path distance between the predicted parents and actual parents in the taxonomy. Via extensive evaluation on a curated taxonomy at Pinterest and several public datasets, we demonstrate that Arborist outperforms the state of the art, achieving up to 59% in mean reciprocal rank and 83% in recall at 15. We also explore the ability of Arborist to infer nodes' taxonomic roles without explicit supervision on this task.
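To make the idea of a large-margin ranking loss with a dynamic margin concrete, here is a minimal Python sketch. It is not the authors' implementation. The abstract does not give the form of the margin function, so the sketch assumes the margin for a candidate parent is its shortest-path distance to the true parent in the undirected view of the taxonomy. The relatedness scores, node names, and the function names `shortest_path_lengths` and `dynamic_margin_ranking_loss` are all hypothetical.

```python
from collections import deque


def shortest_path_lengths(edges, source):
    """BFS hop counts from `source`, treating taxonomy edges as undirected."""
    neighbors = {}
    for parent, child in edges:
        neighbors.setdefault(parent, set()).add(child)
        neighbors.setdefault(child, set()).add(parent)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in neighbors.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def dynamic_margin_ranking_loss(scores, true_parent, edges):
    """Sum of hinge terms over all candidate parents other than the true one.

    ASSUMPTION (not stated in the abstract): the margin for a candidate is its
    shortest-path distance to the true parent, so candidates far from the true
    parent must be scored lower by a larger gap. Every candidate is assumed to
    be reachable from the true parent.
    """
    dist = shortest_path_lengths(edges, true_parent)
    loss = 0.0
    for candidate, score in scores.items():
        if candidate == true_parent:
            continue
        margin = dist[candidate]
        loss += max(0.0, score - scores[true_parent] + margin)
    return loss


# Hypothetical toy taxonomy as (parent, child) edges.
edges = [("root", "food"), ("root", "travel"), ("food", "desserts")]
# Hypothetical relatedness scores of one query node against each candidate parent.
scores = {"root": 0.2, "food": 0.9, "travel": 0.1, "desserts": 0.5}
example_loss = dynamic_margin_ranking_loss(scores, "food", edges)
```

Under this assumption, a candidate two hops from the true parent ("travel") contributes a larger penalty than one a single hop away ("root") at a comparable score, which is the intuition behind tying the margin to distance in the taxonomy.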

Cite

Manzoor, E., Li, R., Shrouty, D., & Leskovec, J. (2020). Expanding Taxonomies with Implicit Edge Semantics. In The Web Conference 2020 - Proceedings of the World Wide Web Conference, WWW 2020 (pp. 2044–2054). Association for Computing Machinery, Inc. https://doi.org/10.1145/3366423.3380271
