CESI: Canonicalizing open knowledge bases using embeddings and side information

Abstract

Open Information Extraction (OpenIE) methods extract (noun phrase, relation phrase, noun phrase) triples from text, resulting in the construction of large Open Knowledge Bases (Open KBs). The noun phrases (NPs) and relation phrases in such Open KBs are not canonicalized, so the same fact may be stored redundantly under different surface forms, and a single surface form may be ambiguous. Recent research has posed canonicalization of Open KBs as clustering over manually defined feature spaces. Manual feature engineering is expensive and often sub-optimal. To overcome this challenge, we propose Canonicalization using Embeddings and Side Information (CESI), a novel approach that performs canonicalization over learned embeddings of Open KBs. CESI extends recent advances in KB embedding by incorporating relevant NP and relation phrase side information in a principled manner. Through extensive experiments on multiple real-world datasets, we demonstrate CESI's effectiveness.
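
The idea of canonicalizing by clustering over embeddings can be illustrated with a minimal sketch. It is not CESI's implementation: the vectors in `TOY_EMBEDDINGS` are hand-written stand-ins for learned NP embeddings, the helper names `cosine_distance` and `canonicalize` are hypothetical, and the choice of complete-linkage agglomerative clustering with a cosine-distance threshold of 0.1 is an assumption made for illustration only.

```python
import math
from itertools import combinations

# Hand-made toy vectors standing in for learned NP embeddings (an assumption;
# a real system would learn these from the Open KB and its side information).
TOY_EMBEDDINGS = {
    "Barack Obama":  [0.90, 0.10, 0.00],
    "Obama":         [0.85, 0.15, 0.05],
    "Mr. Obama":     [0.88, 0.12, 0.02],
    "New York City": [0.10, 0.90, 0.10],
    "NYC":           [0.12, 0.88, 0.08],
}


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def canonicalize(embeddings, threshold):
    """Group phrases by complete-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters until the closest pair is
    farther apart than `threshold`. Returns a list of sorted phrase lists.
    """
    clusters = [[phrase] for phrase in embeddings]

    def linkage(c1, c2):
        # Complete linkage: distance between the two farthest members.
        return max(
            cosine_distance(embeddings[a], embeddings[b])
            for a in c1
            for b in c2
        )

    while len(clusters) > 1:
        # Find the closest pair of clusters (i < j by construction).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: linkage(clusters[pair[0]], clusters[pair[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]

    return [sorted(cluster) for cluster in clusters]


clusters = canonicalize(TOY_EMBEDDINGS, threshold=0.1)
for cluster in clusters:
    print(cluster)
```

Each resulting cluster is treated as one canonical entity, so phrases that land in the same cluster would be stored as a single node in the knowledge base. The quality of the grouping depends entirely on the embeddings, which is where the learned representations and side information described in the abstract come in.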

Cite

Vashishth, S., Jain, P., & Talukdar, P. (2018). CESI: Canonicalizing open knowledge bases using embeddings and side information. In The Web Conference 2018 - Proceedings of the World Wide Web Conference, WWW 2018 (pp. 1317–1327). Association for Computing Machinery, Inc. https://doi.org/10.1145/3178876.3186030
