Exploring the stability of IDF term weighting

Abstract

TF·IDF has been widely used as a term weighting scheme in today's information retrieval systems. However, computation time and cost have become major concerns for its application. This study investigated the similarities and differences between IDF distributions based on the global collection and on different samples, and tested the stability of the IDF measure across collections. A more efficient algorithm based on random samples generated a good approximation to the IDF computed over the entire collection, but with less computation overhead. This practice may be particularly informative and helpful for analysis of large databases or dynamic environments like the Web. © 2008 Springer-Verlag Berlin Heidelberg.
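
The idea of estimating IDF from a random sample rather than the whole collection can be illustrated with a minimal Python sketch. This is not the authors' algorithm, which the abstract does not specify. It assumes the common definition IDF(t) = log(N / df(t)), simple random sampling of documents without replacement, and mean absolute difference over shared terms as the comparison measure. The function names `idf`, `sample_idf`, and `mean_abs_diff` are hypothetical, and the corpus is synthetic.

```python
import math
import random
from collections import Counter


def idf(docs):
    """Return {term: log(N / df)} for a list of tokenized documents."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))  # count each term at most once per document
    return {term: math.log(n / count) for term, count in df.items()}


def sample_idf(docs, sample_size, seed=0):
    """Estimate IDF from a simple random sample of documents (assumed scheme)."""
    rng = random.Random(seed)
    return idf(rng.sample(docs, sample_size))


def mean_abs_diff(global_idf, approx_idf):
    """Mean absolute IDF difference over terms present in both mappings."""
    shared = global_idf.keys() & approx_idf.keys()
    if not shared:
        raise ValueError("no shared terms to compare")
    return sum(abs(global_idf[t] - approx_idf[t]) for t in shared) / len(shared)


if __name__ == "__main__":
    # Synthetic corpus: 2000 documents of 20 tokens, skewed term frequencies.
    rng = random.Random(42)
    vocab = [f"t{i}" for i in range(500)]
    weights = [1.0 / (i + 1) for i in range(500)]
    docs = [rng.choices(vocab, weights=weights, k=20) for _ in range(2000)]

    full = idf(docs)
    for size in (100, 500, 1000):
        approx = sample_idf(docs, size)
        print(size, round(mean_abs_diff(full, approx), 4))
```

Under these assumptions, the sample-based estimate only scans `sample_size` documents instead of the full collection, which is where the reduced overhead would come from. Terms that never appear in the sample get no estimate at all, so a real system would need a policy for unseen terms.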

Cite

Fu, X., & Chen, M. (2008). Exploring the stability of IDF term weighting. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4993 LNCS, pp. 10–21). https://doi.org/10.1007/978-3-540-68636-1_2
