Context-based term frequency assessment for text classification

Rey Long Liu

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2008) 5351 LNAI 1004-1009

DOI: 10.1007/978-3-540-89197-0_98

Abstract

Automatic text classification (TC) is a fundamental component of information processing and management. To classify a document d properly, it is essential to identify the semantics of each term t in d, and those semantics depend heavily on the context (neighboring terms) of t in d. In this paper, we present CTFA (Context-based Term Frequency Assessment), a technique that improves text classifiers by considering term contexts in test documents. The results of term context recognition are used to re-assess term frequencies, so CTFA can readily work with the various kinds of text classifiers that base their TC decisions on term frequencies. Moreover, CTFA is efficient and requires neither a large amount of memory nor domain-specific knowledge. Experimental results show that CTFA can successfully enhance the performance of Rocchio and SVM (Support Vector Machine) classifiers on Reuters and Newsgroups data. © 2008 Springer Berlin Heidelberg.
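The abstract does not spell out how term contexts are scored, so the following is only a minimal sketch of the general idea of re-assessing term frequencies from neighboring terms, not the paper's actual CTFA procedure. The function name `reassess_term_frequencies`, the `category_terms` set, the `window` parameter, and the 0.5 to 1.0 weighting rule are all assumptions introduced for illustration.

```python
from collections import Counter


def reassess_term_frequencies(tokens, category_terms, window=2):
    """Return context-weighted term frequencies for one test document.

    Illustrative assumption (not taken from the paper): each occurrence of a
    term contributes between 0.5 and 1.0 to its frequency, depending on the
    share of neighboring tokens (within `window` positions) that belong to
    `category_terms`, a hypothetical set of terms typical of a category.
    """
    weighted = Counter()
    for i, term in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        # Neighbors are the tokens around position i, excluding the term itself.
        neighbors = tokens[lo:i] + tokens[i + 1:hi]
        if neighbors:
            support = sum(n in category_terms for n in neighbors) / len(neighbors)
        else:
            support = 0.0
        # An occurrence with no supporting context still counts, but only half.
        weighted[term] += 0.5 + 0.5 * support
    return dict(weighted)


if __name__ == "__main__":
    doc = ["bank", "loan", "interest", "river", "bank"]
    finance_terms = {"bank", "loan", "interest"}
    print(reassess_term_frequencies(doc, finance_terms, window=1))
```

In this toy document the raw count of "bank" is 2, but the second occurrence sits next to "river" only and is therefore down-weighted. Because the output is still a term-to-frequency mapping, it could in principle be passed to any frequency-based classifier such as Rocchio or an SVM, which is the compatibility property the abstract emphasizes.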

Citation (APA)

Liu, R. L. (2008). Context-based term frequency assessment for text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5351 LNAI, pp. 1004–1009). https://doi.org/10.1007/978-3-540-89197-0_98
