Abstract
This paper introduces a novel graph-based approach to selecting features from multiple textual documents. The proposed solution makes it possible to investigate the importance of a term within a whole corpus of documents by using contemporary graph theory methods, such as community detection algorithms and node centrality measures. Evaluation results show that, compared to well-established existing solutions, the proposed approach increases the accuracy of most of the text classifiers employed and decreases the number of features required to achieve state-of-the-art accuracy. The well-known datasets used for the experiments reported in this paper include 20Newsgroups, LingSpam, Amazon Reviews and Reuters.
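To make the general idea concrete, the following is a minimal sketch of centrality-based term selection. It is not the authors' method: it assumes a simple sliding-window co-occurrence graph built over the whole corpus, uses degree centrality as the only importance measure, and omits community detection entirely. The function names, the whitespace tokenizer and the `window` parameter are illustrative assumptions.

```python
from collections import defaultdict


def build_cooccurrence_graph(documents, window=2):
    """Build one undirected term graph for the whole corpus.

    Assumption: two terms are linked when they appear within `window`
    consecutive tokens of each other in any document.
    """
    graph = defaultdict(set)
    for doc in documents:
        tokens = doc.lower().split()  # naive whitespace tokenizer (assumption)
        for i, term in enumerate(tokens):
            graph[term]  # make sure isolated terms still become nodes
            for other in tokens[i + 1 : i + window]:
                if other != term:
                    graph[term].add(other)
                    graph[other].add(term)
    return graph


def degree_centrality(graph):
    """Normalised degree centrality: degree / (number of nodes - 1)."""
    n = len(graph)
    if n < 2:
        return {term: 0.0 for term in graph}
    return {term: len(neighbours) / (n - 1) for term, neighbours in graph.items()}


def select_features(documents, k, window=2):
    """Return the k terms with the highest centrality (ties broken alphabetically)."""
    scores = degree_centrality(build_cooccurrence_graph(documents, window))
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    return ranked[:k]


docs = [
    "graph methods select features",
    "graph methods rank terms",
    "graph centrality",
]
top_terms = select_features(docs, k=2)
```

In this toy corpus, `methods` is adjacent to three distinct terms and therefore ranks first. A real pipeline would likely add stop-word removal, weighted edges and a richer centrality measure before passing the selected terms to a classifier.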
Giarelis, N., Kanakaris, N., & Karacapilidis, N. (2020). An innovative graph-based approach to advance feature selection from multiple textual documents. In IFIP Advances in Information and Communication Technology (Vol. 583 IFIP, pp. 96–106). Springer. https://doi.org/10.1007/978-3-030-49161-1_9