Detecting content-bearing words by serial clustering - extended abstract

Abstract

Information retrieval systems typically distinguish between content-bearing words and terms on a stop list. But 'content-bearing' is relative to a collection. For optimal retrieval efficiency, it is desirable to have automated methods for custom-building a stop list. This paper defines the notion of serial clustering of words in text and explores the value of such clustering as an indicator of whether a word bears content. The numerical measures we propose may also be of value in assigning weights to terms in requests. Experimental support is obtained from natural text databases in three different languages.
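
The abstract does not state which numerical measures the paper uses, so the following is only an illustrative sketch of the general idea that a word whose occurrences bunch together in the text behaves differently from one scattered evenly. It assumes a generic variance-to-mean ratio of per-unit counts, which is a stand-in and not necessarily the authors' measure; the function name `dispersion_index` and the `unit_size` parameter are hypothetical.

```python
def dispersion_index(tokens, word, unit_size=50):
    """Variance-to-mean ratio of the counts of `word` in consecutive text units.

    Assumption: this is a generic clumping indicator chosen for illustration,
    not the measure defined in the paper. Values well above 1 suggest that
    occurrences cluster; values at or below 1 suggest an even or random spread.
    """
    # Split the token stream into consecutive fixed-size units.
    units = [tokens[i:i + unit_size] for i in range(0, len(tokens), unit_size)]
    counts = [unit.count(word) for unit in units]
    if not counts:
        return 0.0
    mean = sum(counts) / len(counts)
    if mean == 0:
        return 0.0  # the word never occurs
    variance = sum((c - mean) ** 2 for c in counts) / len(counts)
    return variance / mean


# Toy text of four 5-token units: "the" appears once in every unit,
# while "protein" appears four times, all in the first unit.
tokens = (
    ["the", "protein", "protein", "protein", "protein"]
    + ["the", "a", "b", "c", "d"] * 3
)
print(dispersion_index(tokens, "the", unit_size=5))      # evenly spread word
print(dispersion_index(tokens, "protein", unit_size=5))  # clustered word
```

Under this assumed measure, a candidate stop list could be built by ranking words by their score and treating the lowest-scoring ones as non-content-bearing for the collection at hand; the cutoff would have to be chosen empirically.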

Cite

Bookstein, A., Klein, S. T., & Raita, T. (1995). Detecting content-bearing words by serial clustering - extended abstract. In SIGIR Forum (ACM Special Interest Group on Information Retrieval) (pp. 319–327). ACM. https://doi.org/10.1145/215206.215377
