Discovering patterns using feature selection techniques and correlation


Abstract

Term frequency-inverse document frequency (TF-IDF) is reported to contribute significantly to text categorization, document clustering, and many other text mining tasks. This work examines a collection of applications and enhancements of the TF-IDF based document representation technique. Document representation is essential in the field of text-script mining. The algorithm converts unstructured data into a vector space model in which each document is a point in the vector space. Related documents lie close to one another, while unrelated documents remain far apart. In this paper, four feature selection techniques are implemented to discover patterns in a repository of unstructured data using the correlation similarity measure. An analysis and comparison with other existing techniques is also included. The patterns formed are validated using silhouette values. Experiments are conducted to compare performance. The results indicate that TDMp1 performs poorly compared to the others.
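
The pipeline described in the abstract (TF-IDF document vectors, a correlation similarity measure, and silhouette-based validation) can be illustrated with a minimal sketch. This is not the authors' implementation. The tokenizer, the TF-IDF weighting variant, the toy documents, the cluster labels, and the function names `tfidf_vectors`, `correlation`, and `silhouette` are all assumptions chosen for illustration, and the four feature selection techniques from the paper are not reproduced here.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, TF-IDF vectors) for non-empty, whitespace-tokenised docs.

    Assumed weighting: relative term frequency times log(N / document frequency).
    """
    tokenised = [doc.lower().split() for doc in docs]
    vocab = sorted({t for toks in tokenised for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenised for t in set(toks))
    vectors = []
    for toks in tokenised:
        counts = Counter(toks)
        total = len(toks)
        vectors.append(
            [(counts[t] / total) * math.log(n_docs / df[t]) for t in vocab]
        )
    return vocab, vectors


def correlation(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    du = [x - mu for x in u]
    dv = [y - mv for y in v]
    num = sum(a * b for a, b in zip(du, dv))
    den = math.sqrt(sum(a * a for a in du) * sum(b * b for b in dv))
    return num / den if den else 0.0


def silhouette(vectors, labels):
    """Mean silhouette value using the correlation distance d = 1 - r.

    Assumes at least two distinct cluster labels.
    """
    n = len(vectors)
    dist = [[1 - correlation(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    scores = []
    for i in range(n):
        own = [dist[i][j] for j in range(n) if j != i and labels[j] == labels[i]]
        if not own:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        a = sum(own) / len(own)  # mean distance to the point's own cluster
        b = min(                 # mean distance to the nearest other cluster
            sum(dist[i][j] for j in range(n) if labels[j] == c) / labels.count(c)
            for c in set(labels) if c != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) else 0.0)
    return sum(scores) / n


# Hypothetical toy corpus and cluster assignment, for illustration only.
docs = [
    "cats chase mice",
    "cats chase birds",
    "stocks and bonds rise",
    "stocks and bonds fall",
]
labels = [0, 0, 1, 1]

vocab, vectors = tfidf_vectors(docs)
print("similar pair:   ", correlation(vectors[0], vectors[1]))
print("dissimilar pair:", correlation(vectors[0], vectors[2]))
print("mean silhouette:", silhouette(vectors, labels))
```

In this sketch, documents that share terms have a higher correlation than documents with disjoint vocabularies, and a positive mean silhouette suggests the assumed grouping is coherent under the correlation distance.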

Citation (APA)

Goswami, M., & Purkayastha, B. S. (2020). Discovering patterns using feature selection techniques and correlation. In Lecture Notes on Data Engineering and Communications Technologies (Vol. 46, pp. 824–831). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-38040-3_94
