Integration of manual and automatic text categorization. A categorization workbench for text-based email and spam


Abstract

As a method of structuring the information and knowledge contained in texts, text categorization can be automated to a great extent. Automatic text classification systems implement machine learning algorithms and need training samples. In commercial applications, however, automatic categorization appears to come up against limiting factors. For example, it turns out to be difficult to reduce the sample complexity without the categorization quality, in terms of recall and precision, suffering. Instead of trying to fully replace human work with machines, it could be more effective and ultimately more efficient to let human and machine cooperate. We have therefore developed a categorization workbench to realise synergy between manual and machine categorization. To compare the categorization workbench with common automatic classification systems, the automatic categorizer of the IBM DB2 Information Integrator for Content was chosen for tests. The test results show that, benefiting from the incorporation of the user's domain knowledge, the categorization workbench can improve recall by a factor of two to four with the same number of training samples as the automatic categorizer uses. Further, to reach a comparable categorization quality, the categorization workbench needs only an eighth to a quarter of the training samples that the automatic categorizer needs. © 2004 Springer-Verlag.
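
The abstract measures categorization quality in terms of recall and precision. As a reading aid, the following minimal sketch shows how these two measures are commonly computed for a single category. The function name `precision_recall` and the message ids are hypothetical illustrations; they are not taken from the paper, and the sketch does not reproduce the authors' evaluation setup.

```python
def precision_recall(predicted, relevant):
    """Precision and recall for one category.

    predicted: set of document ids the categorizer assigned to the category
    relevant:  set of document ids that truly belong to the category
    """
    true_positives = len(predicted & relevant)
    # Precision: share of assigned documents that are correct.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of truly relevant documents that were found.
    recall = true_positives / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: 4 messages flagged as spam, 3 of them correctly,
# out of 6 messages that are actually spam.
predicted_spam = {"m1", "m2", "m3", "m7"}
actual_spam = {"m1", "m2", "m3", "m4", "m5", "m6"}

precision, recall = precision_recall(predicted_spam, actual_spam)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this made-up example, raising recall "by a factor of two" would mean finding all six spam messages instead of three, with the number of training samples held constant.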

Cite

Sun, Q., Schommer, C., & Lang, A. (2004). Integration of manual and automatic text categorization. A categorization workbench for text-based email and spam. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 3238 LNAI, pp. 156–167). Springer Verlag. https://doi.org/10.1007/978-3-540-30221-6_13
