CASIA-KB: A multi-source Chinese semantic knowledge base built from structured and unstructured web data

Abstract

Knowledge bases play a crucial role in intelligent systems, especially in the Web age. Many domain-dependent and general-purpose knowledge bases have been developed to support various kinds of applications. In this paper, we propose CASIA-KB, a Chinese semantic knowledge base built from various Web resources. CASIA-KB uses Semantic Web and Natural Language Processing techniques and focuses mainly on declarative knowledge. Most of its content is textual knowledge extracted from structured and unstructured sources, such as Web-based encyclopedias (the source of more formal and static knowledge) and microblog posts and news (the source of the most up-to-date factual knowledge). CASIA-KB also aims to bring in images and videos as non-textual knowledge relevant to specific instances and concepts, since they add interpretation and understanding to the textual knowledge. For knowledge base organization, we briefly discuss the current ontology of CASIA-KB and the entity linking efforts that link semantically equivalent entities together. In addition, we build a SPARQL endpoint for query processing and result presentation, which can produce query output in different formats and supports result visualization. Analysis of the entity degree distributions of each individual knowledge source and of the whole CASIA-KB shows that each branch knowledge base follows a power-law distribution, and that when entities from different resources are linked together into a merged knowledge base, the whole knowledge base retains this structural property. © 2014 Springer International Publishing.
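The degree-distribution analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the triple format, the function names, and the toy data are all assumptions made for illustration, and a least-squares slope on a log-log plot is only a rough diagnostic of power-law behaviour, not a rigorous test.

```python
import math
from collections import Counter

def entity_degrees(triples):
    """Count how many triples each entity appears in, as subject or object.

    Assumption: every object is treated as an entity; a real knowledge base
    would likely skip literal values.
    """
    degree = Counter()
    for subj, _pred, obj in triples:
        degree[subj] += 1
        degree[obj] += 1
    return degree

def degree_distribution(degree):
    """Map each degree value k to the number of entities with that degree."""
    return Counter(degree.values())

def loglog_slope(dist):
    """Least-squares slope of log(entity count) against log(degree).

    For a power law P(k) ~ k^(-gamma), the slope approximates -gamma.
    Needs at least two distinct degree values.
    """
    xs = [math.log(k) for k in dist]
    ys = [math.log(c) for c in dist.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

# Hypothetical toy triples: one hub entity linked to four others.
toy_triples = [
    ("A", "relatedTo", "B"),
    ("A", "relatedTo", "C"),
    ("A", "relatedTo", "D"),
    ("A", "relatedTo", "E"),
]
toy_degrees = entity_degrees(toy_triples)
toy_dist = degree_distribution(toy_degrees)
toy_slope = loglog_slope(toy_dist)
```

Running the same functions on each source separately and then on the merged triples would allow the kind of comparison the abstract describes, under the assumptions stated above.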

Cite

Zeng, Y., Wang, D., Zhang, T., Wang, H., Hao, H., & Xu, B. (2014). CASIA-KB: A multi-source Chinese semantic knowledge base built from structured and unstructured web data. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 8388 LNCS, pp. 75–88). Springer Verlag. https://doi.org/10.1007/978-3-319-06826-8_7
