Random-walk term weighting for improved text classification

Abstract

This paper describes a new approach for estimating term weights in a text classification task. The approach uses term co-occurrence as a measure of dependency between word features. A random-walk model is applied to a graph encoding words and co-occurrence dependencies, resulting in scores that quantify how much a particular word feature contributes to a given context. We argue that by modeling feature weights using these scores, as opposed to the traditional frequency-based scores, we can achieve better results in a text classification task. Experiments performed on four standard classification datasets show that the new random-walk-based approach outperforms the traditional term frequency approach to feature weighting.
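The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which random-walk model, window size, or edge weighting is used, so the PageRank-style update, the undirected unweighted edges, and the `window`, `damping`, and `iterations` parameters below are all assumptions chosen for illustration. The function name `random_walk_term_weights` is likewise hypothetical.

```python
from collections import defaultdict


def random_walk_term_weights(tokens, window=2, damping=0.85, iterations=50):
    """Score each distinct token with a PageRank-style random walk over an
    undirected co-occurrence graph.

    All parameters and the PageRank formulation are illustrative assumptions;
    the abstract does not specify them.
    """
    # Link two distinct words whenever they fall inside the same span of
    # `window` consecutive tokens (window=2 links adjacent words only).
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    vocab = sorted(set(tokens))
    n = len(vocab)
    if n == 0:
        return {}

    # Start from a uniform distribution and apply the update a fixed number
    # of times; each word receives an equal share of every neighbor's score.
    scores = {w: 1.0 / n for w in vocab}
    for _ in range(iterations):
        updated = {}
        for w in vocab:
            incoming = sum(scores[v] / len(neighbors[v]) for v in neighbors[w])
            updated[w] = (1 - damping) / n + damping * incoming
        scores = updated
    return scores


if __name__ == "__main__":
    doc = "the cat sat on the mat".split()
    for term, weight in sorted(random_walk_term_weights(doc).items()):
        print(f"{term}\t{weight:.4f}")
```

In a classifier, scores like these would take the place of raw term frequency in each document's feature vector. How the scores are combined with other factors, such as inverse document frequency, is not stated in the abstract and is left open here.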

Cite

Hassan, S., & Banea, C. (2020). Random-walk term weighting for improved text classification. In Proceedings of TextGraphs: The 1st Workshop on Graph-Based Methods for Natural Language Processing (pp. 53–60). Association for Computational Linguistics. https://doi.org/10.1142/9789819818525_0003
