Summarization and categorization of text data in high-level data cleaning for information retrieval

M. Saravanan; P. C. Reghu Raj; S. Raman

Journal Article, Open Access

Applied Artificial Intelligence (2003) 17(5-6) 461-474

DOI: 10.1080/713827177

13 Citations

26 Readers

Abstract

In view of the exponential growth of online document corpora, even perfect retrieval will fetch too much material for a user to cope with. One way to reduce this problem is automatic domain-specific summarization tailored to the user's needs, which is a kind of high-level data cleaning. This requires some method of discovering classes of similar items that may be grouped into predetermined domains. We explore whether there exists a synergic relation between systems for classification and those for summarization by way of composing those subsystems. In other words, we examine whether prior summarization will increase the performance of the classifier system and vice versa. In both cases, the answer is affirmative, as we show in this paper. We propose a text-mining framework in which these subsystems are treated as constituents of a knowledge discovery process for text corpora.

Cite

Saravanan, M., Raj, P. C. R., & Raman, S. (2003). Summarization and categorization of text data in high-level data cleaning for information retrieval. Applied Artificial Intelligence, 17(5–6), 461–474. https://doi.org/10.1080/713827177
