Abstract
Automatic text classification (TC) is a fundamental component of information processing and management. To classify a document d properly, it is essential to identify the semantics of each term t in d, and these semantics depend heavily on the context (neighboring terms) of t in d. In this paper, we present CTFA (Context-based Term Frequency Assessment), a technique that improves text classifiers by considering term contexts in test documents. The results of term context recognition are used to re-assess term frequencies, so CTFA can easily work with the various kinds of text classifiers that base their TC decisions on term frequencies. Moreover, CTFA is efficient and requires neither a large amount of memory nor domain-specific knowledge. Experimental results show that CTFA can successfully enhance the performance of Rocchio and SVM (Support Vector Machine) classifiers on Reuters and Newsgroups data. © 2008 Springer Berlin Heidelberg.
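As a rough illustration of the general idea of re-assessing term frequencies from term contexts, the following Python sketch weights each occurrence of a term by how many of its neighboring terms are also related to a category. This is not the CTFA formula from the paper, which the abstract does not specify. The function name `context_weighted_tf`, the `category_terms` set, the `window` parameter, and the `1 + support` weighting are all assumptions made for this example.

```python
from collections import defaultdict


def context_weighted_tf(tokens, category_terms, window=2):
    """Hypothetical context-aware term frequency (illustration only).

    Each occurrence of a category-related term counts as 1 plus the
    fraction of its neighbors (within `window` positions) that are also
    category-related. All other terms keep a plain count of 1 per
    occurrence.
    """
    tf = defaultdict(float)
    for i, term in enumerate(tokens):
        # Collect the neighboring terms on both sides of position i.
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        neighbors = tokens[lo:i] + tokens[i + 1:hi]

        weight = 1.0
        if term in category_terms and neighbors:
            related = sum(1 for n in neighbors if n in category_terms)
            weight += related / len(neighbors)
        tf[term] += weight
    return dict(tf)


if __name__ == "__main__":
    doc = ["bank", "loan", "interest", "river", "bank"]
    finance_terms = {"bank", "loan", "interest"}  # assumed category vocabulary
    print(context_weighted_tf(doc, finance_terms, window=1))
```

Because the output is still a term-to-frequency mapping, it could in principle be passed to any classifier that consumes term frequencies, which is the compatibility property the abstract describes.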
Cite
Liu, R. L. (2008). Context-based term frequency assessment for text classification. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 5351 LNAI, pp. 1004–1009). https://doi.org/10.1007/978-3-540-89197-0_98