WCD-new approach combining words, concepts and documents based on ontology

Abstract

In a traditional information retrieval (IR) system, a document is represented by a set of words or terms. When these words or terms are treated as the components of a vector, the model is called the vector space model (VSM). VSM has been widely used in IR systems in recent decades. As new words appear rapidly in the Internet era, the amount of computation becomes very large and degrades the performance of IR systems. This paper puts forward a new approach based on the relations among words, concepts and documents, using the notion of an ontology. The approach has two levels: the Word-Concept (WC) level and the Concept-Document (CD) level. At the WC level, a transition probability matrix is constructed from the word-word pairs that appear in the same paragraph, and the principal eigenvector of this matrix (the one associated with its largest eigenvalue) is computed. This eigenvector reflects the importance of each word to the concept. At the CD level, a distance matrix is constructed from the distances between the words in a concept, and the average variance of its elements is computed. This value determines the relevance of the document to the concept. To expand the query sentence, a Personal Information Profile (PIP) of the user is defined from the user's query history. The approach is shown to be more effective than the previous one. © 2012 Springer-Verlag.
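
The WC-level step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact construction, so everything below is an assumption made for illustration: words are split on whitespace, each unordered pair of distinct words in a paragraph counts once, the count matrix is row-normalized into transition probabilities, and the "importance" vector is taken to be the dominant left eigenvector (the stationary distribution), found by a lazy power iteration. The function names are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from itertools import combinations


def transition_matrix(paragraphs):
    """Build a row-stochastic matrix from same-paragraph word-word pairs.

    Assumption: whitespace tokenization, one count per unordered pair
    of distinct words per paragraph.
    """
    counts = defaultdict(float)
    for paragraph in paragraphs:
        words = sorted(set(paragraph.lower().split()))
        for a, b in combinations(words, 2):
            counts[(a, b)] += 1.0
            counts[(b, a)] += 1.0

    vocab = sorted({word for pair in counts for word in pair})
    index = {word: i for i, word in enumerate(vocab)}
    n = len(vocab)
    matrix = [[0.0] * n for _ in range(n)]
    for (a, b), count in counts.items():
        matrix[index[a]][index[b]] = count

    # Normalize each row so that it sums to 1 (transition probabilities).
    for row in matrix:
        total = sum(row)
        if total > 0:
            for j in range(n):
                row[j] /= total
    return vocab, matrix


def word_importance(paragraphs, tol=1e-12, max_iter=10_000):
    """Approximate the dominant left eigenvector by lazy power iteration.

    Assumption: the lazy step (averaging with the previous vector) is
    added here only to avoid oscillation on periodic graphs; it does
    not change the stationary distribution.
    """
    vocab, P = transition_matrix(paragraphs)
    n = len(vocab)
    if n == 0:
        return {}
    v = [1.0 / n] * n
    for _ in range(max_iter):
        stepped = [sum(v[i] * P[i][j] for i in range(n)) for j in range(n)]
        new_v = [0.5 * old + 0.5 * new for old, new in zip(v, stepped)]
        change = sum(abs(old - new) for old, new in zip(v, new_v))
        v = new_v
        if change < tol:
            break
    return dict(zip(vocab, v))


if __name__ == "__main__":
    sample = ["ontology concept word", "ontology concept", "ontology document"]
    scores = word_importance(sample)
    for word, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{word}: {score:.3f}")
```

Under these assumptions a word scores higher the more often it co-occurs with other words, so a term that appears in every paragraph of the sample ranks first. If the co-occurrence graph is disconnected, the result depends on the starting vector, which this sketch does not handle.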

Cite

Wang, H., Guo, Y., & Shi, X. (2012). WCD-new approach combining words, concepts and documents based on ontology. In Communications in Computer and Information Science (Vol. 316 CCIS, pp. 449–458). https://doi.org/10.1007/978-3-642-34289-9_50
