Classification of text documents using B-tree

B. S. Harish; D. S. Guru; S. Manjunath

Conference Proceedings

Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (2012) 85(PART 2) 627-636

DOI: 10.1007/978-3-642-27308-7_66

Abstract

In this paper, we propose an unconventional method of representing and classifying text documents that preserves the sequence of term occurrence in a test document. The term sequence is preserved with the help of a novel data structure called the 'Status Matrix'. In addition, to avoid sequential matching during classification, we propose indexing the terms in a B-tree, an efficient indexing scheme. Each term in the B-tree is associated with a list of class labels of the documents that contain the term. A corresponding classification technique is also proposed. To corroborate the efficacy of the proposed representation and the status-matrix-based classification, we have conducted extensive experiments on various datasets. © Institute for Computer Sciences, Social Informatics and Telecommunications Engineering 2012.
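The abstract gives no implementation details, so the following is only a minimal sketch of how a term index with class labels and a status matrix might fit together. Several points are assumptions rather than the authors' method: a sorted key list searched with `bisect` stands in for a real B-tree (both give logarithmic lookups on ordered keys), the status matrix is taken to be a binary class-by-term-position matrix, and the decision rule (pick the class with the longest run of consecutive matching terms) is one plausible way to exploit the preserved term sequence. All names (`TermIndex`, `status_matrix`, `classify`) and the toy data are hypothetical.

```python
from bisect import bisect_left, insort


class TermIndex:
    """Ordered term index; a sorted list stands in for a B-tree (assumption)."""

    def __init__(self):
        self._keys = []    # terms kept in sorted order
        self._labels = {}  # term -> set of class labels of documents containing it

    def add_document(self, terms, label):
        for term in terms:
            if term not in self._labels:
                insort(self._keys, term)
                self._labels[term] = set()
            self._labels[term].add(label)

    def lookup(self, term):
        # Binary search over the sorted keys, as a B-tree lookup would do per node.
        i = bisect_left(self._keys, term)
        if i < len(self._keys) and self._keys[i] == term:
            return self._labels[term]
        return set()


def status_matrix(index, classes, test_terms):
    # Assumed layout: one row per class, one column per term position
    # in the test document, so column order preserves the term sequence.
    return [[1 if c in index.lookup(t) else 0 for t in test_terms]
            for c in classes]


def longest_run(row):
    # Length of the longest run of consecutive 1s in a row.
    best = current = 0
    for value in row:
        current = current + 1 if value else 0
        best = max(best, current)
    return best


def classify(index, classes, test_terms):
    # Assumed decision rule: the class with the longest consecutive match wins;
    # ties go to the class listed first.
    scores = [longest_run(row) for row in status_matrix(index, classes, test_terms)]
    return classes[scores.index(max(scores))]


# Toy training data (hypothetical).
index = TermIndex()
index.add_document(["stock", "market", "price"], "finance")
index.add_document(["match", "goal", "score"], "sport")

classes = ["finance", "sport"]
test_terms = ["stock", "market", "price", "goal"]
predicted = classify(index, classes, test_terms)
```

Here the first three terms of the test document all map to the "finance" label, so that row of the matrix has the longest unbroken run. Swapping the sorted list for a true B-tree would change only `TermIndex`, not the matrix construction or the decision rule.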

Cite

Harish, B. S., Guru, D. S., & Manjunath, S. (2012). Classification of text documents using B-tree. In Lecture Notes of the Institute for Computer Sciences, Social-Informatics and Telecommunications Engineering, LNICST (Vol. 85, pp. 627–636). https://doi.org/10.1007/978-3-642-27308-7_66
