Inverse Document Frequency (IDF): A Measure of Deviations from Poisson

  • Church K
  • Gale W

Abstract

Low frequency words tend to be rich in content, and vice versa. But not all equally frequent words are equally meaningful. We will use inverse document frequency (IDF), a quantity borrowed from Information Retrieval, to distinguish words like somewhat and boycott. Both somewhat and boycott appeared approximately 1000 times in a corpus of 1989 Associated Press articles, but boycott is a better keyword because its IDF is farther from what would be expected by chance (Poisson).
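
As a rough illustration of the comparison the abstract describes, the following Python sketch contrasts a word's observed IDF with the IDF a Poisson model would predict from its total count. It assumes the common definition IDF = -log2(df/D), where df is the number of documents containing the word and D is the number of documents, and a Poisson model in which a word occurring cf times overall appears at least once in a document with probability 1 - exp(-cf/D). The function names, the corpus size, and the document counts are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
import math


def observed_idf(doc_freq: int, num_docs: int) -> float:
    """IDF from the observed document frequency: -log2(df / D)."""
    return -math.log2(doc_freq / num_docs)


def poisson_idf(collection_freq: int, num_docs: int) -> float:
    """IDF expected if occurrences were scattered by a Poisson process.

    With an average rate of cf / D occurrences per document, the chance
    that a document contains the word at least once is 1 - exp(-rate).
    """
    rate = collection_freq / num_docs
    p_at_least_one = 1.0 - math.exp(-rate)
    return -math.log2(p_at_least_one)


def residual_idf(doc_freq: int, collection_freq: int, num_docs: int) -> float:
    """Observed IDF minus Poisson-predicted IDF (larger = more clustered)."""
    return observed_idf(doc_freq, num_docs) - poisson_idf(collection_freq, num_docs)


if __name__ == "__main__":
    NUM_DOCS = 100_000  # hypothetical corpus size, not from the paper

    # Two hypothetical words with the same total count (1000) but
    # different document frequencies: (collection_freq, doc_freq).
    words = {
        "evenly_spread_word": (1000, 950),
        "clustered_word": (1000, 400),
    }

    for name, (cf, df) in words.items():
        print(
            f"{name}: observed IDF = {observed_idf(df, NUM_DOCS):.3f}, "
            f"Poisson IDF = {poisson_idf(cf, NUM_DOCS):.3f}, "
            f"residual = {residual_idf(df, cf, NUM_DOCS):.3f}"
        )
```

Under these assumptions, a word whose occurrences cluster in few documents has a larger gap between observed and predicted IDF than an equally frequent word spread across many documents, which is the kind of deviation the abstract uses to separate good keywords from poor ones.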

Cite

Church, K., & Gale, W. (1999). Inverse Document Frequency (IDF): A Measure of Deviations from Poisson (pp. 283–295). https://doi.org/10.1007/978-94-017-2390-9_18
