Weighting of noun phrases based on local frequency of nouns

Abstract

Tf-idf is a well-known weighting measure for words in texts. It reflects both how frequently a word occurs and how localized it is to particular texts, and it is widely used in information retrieval and text mining. However, many infrequent words share the same tf-idf value, so tf-idf cannot rank them against each other. In this study, the units being weighted are noun phrases. This paper proposes a novel weighting measure for noun phrases that uses the local frequency of the nouns making up each phrase. The proposed measure combines the tf-idf of a noun phrase with the average difference between the phrase's frequency and the frequencies of the nouns within it. The measure was evaluated on two datasets: 19,997 newsgroup texts written in English and 206 Wikipedia pages written in Japanese. The experiments showed that fewer noun phrases share the same value under the proposed measure than under tf-idf.
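The idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact formula, so everything here is an assumption made for illustration: the additive combination, the mixing factor `alpha`, the sign of the difference (noun frequency minus phrase frequency), the raw-count tf-idf variant, and the function names `tf_idf`, `noun_gap`, and `combined_weight`. Each document is represented as a list of noun phrases, and each phrase as a tuple of nouns.

```python
import math
from collections import Counter


def tf_idf(phrase, doc, docs):
    """Plain tf-idf: raw count in doc times log(N / document frequency)."""
    tf = doc.count(phrase)
    df = sum(1 for d in docs if phrase in d)
    return tf * math.log(len(docs) / df) if df else 0.0


def noun_gap(phrase, doc):
    """Average of (noun frequency - phrase frequency) over the phrase's nouns.

    Assumption: the direction of the difference is not stated in the abstract.
    """
    noun_counts = Counter(noun for p in doc for noun in p)
    phrase_count = doc.count(phrase)
    return sum(noun_counts[n] - phrase_count for n in phrase) / len(phrase)


def combined_weight(phrase, doc, docs, alpha=0.1):
    """Hypothetical combination: tf-idf plus a scaled noun-frequency gap."""
    return tf_idf(phrase, doc, docs) + alpha * noun_gap(phrase, doc)


# Toy corpus: two documents, each a list of noun phrases (tuples of nouns).
docs = [
    [("text", "mining"), ("data", "structure"), ("text",), ("text",), ("mining",)],
    [("neural", "network")],
]
doc = docs[0]

for phrase in [("text", "mining"), ("data", "structure")]:
    print(phrase, tf_idf(phrase, doc, docs), combined_weight(phrase, doc, docs))
```

In this toy corpus both two-noun phrases occur once in the first document and in no other, so their tf-idf values tie. The nouns "text" and "mining" also occur on their own in that document, which gives ("text", "mining") a larger gap term and separates the two phrases under the combined weight.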

Cite

Yamada, Y., Himeno, Y., & Nakatoh, T. (2018). Weighting of noun phrases based on local frequency of nouns. In Advances in Intelligent Systems and Computing (Vol. 700, pp. 436–445). Springer Verlag. https://doi.org/10.1007/978-3-319-72550-5_42
