Text document classification employs either a text-based or a semantic-based approach to index and retrieve text documents. The former relies on keywords and therefore has limited ability to capture and exploit the conceptualization involved in user information needs and content meanings. The latter aims to overcome these limitations by using content meanings rather than keywords. More specifically, the semantic-based approach uses a domain ontology to exploit the content meanings of a particular domain. This approach has drawbacks, however: it does not enrich ontology concepts with new lexical resources, and it does not evaluate the importance of those concepts, as indicated by their weights. To address these issues, this paper proposes a new ontology-based text document classification framework. The proposed framework incorporates a newly developed objective metric called SEMCON, which enriches the domain ontology with new concepts by combining the contextual and semantic information of a term within a text document. The framework also introduces a new approach that automatically estimates the importance of ontology concepts, expressed as concept weights, and uses these automatically estimated weights to enhance the concept vector space model.
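As a rough illustration of what a weighted concept vector space model might look like, the following minimal Python sketch maps a document to a vector of ontology concepts and scales each concept frequency by a concept weight. It is not the SEMCON metric or the paper's weight estimation procedure, neither of which is specified in this abstract. The toy ontology, the weights, and the names `CONCEPT_TERMS`, `CONCEPT_WEIGHTS`, and `concept_vector` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy ontology: concept -> terms that express it.
# In the paper, enrichment adds new terms to concepts; here the sets are hand-made.
CONCEPT_TERMS = {
    "funding": {"grant", "funding", "budget"},
    "research": {"research", "study", "experiment"},
}

# Hypothetical concept weights. The paper estimates these automatically;
# these values are placeholders chosen by hand.
CONCEPT_WEIGHTS = {"funding": 0.8, "research": 0.5}


def concept_vector(text, concept_terms=CONCEPT_TERMS, weights=CONCEPT_WEIGHTS):
    """Return a dict mapping each concept to weight * concept frequency."""
    # Naive whitespace tokenization; a real system would normalize and lemmatize.
    tokens = Counter(text.lower().split())
    vector = {}
    for concept, terms in concept_terms.items():
        # Concept frequency: total occurrences of any term linked to the concept.
        frequency = sum(tokens[term] for term in terms)
        # Concepts without an explicit weight default to 1.0.
        vector[concept] = weights.get(concept, 1.0) * frequency
    return vector


if __name__ == "__main__":
    print(concept_vector("The grant funds a research study"))
```

In this sketch, enriching a concept amounts to adding a term to its set in `CONCEPT_TERMS`, after which documents that mention the new term contribute to that concept's component of the vector.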
CITATION STYLE
Kastrati, Z., Imran, A. S., & Yayilgan, S. Y. (2015). A general framework for text document classification using SEMCON and ACVSR. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 9172, pp. 310–319). Springer Verlag. https://doi.org/10.1007/978-3-319-20612-7_30