In knowledge discovery, the reliability of results depends on the effectiveness of the attributes selected for decision making. The curse of dimensionality refers to the phenomenon in which an excessive number of dimensions degrades the analysis. To mitigate the curse of dimensionality in text analysis, we propose an ontology-based semantic measure for intelligent selection and reduction of features. Among the various text mining techniques, ontology-based mining has contributed significantly to the field. Ontology-based semantic measures, which are mathematical models used to find the similarity between concepts in an ontology, have made a significant contribution to feature engineering. The proposed measure is an amalgamation of semantic similarity, relatedness, and distance. It enables an in-depth analysis of the various semantic relationships between concepts of the English language. The performance of the measure was evaluated against benchmark dimension reduction techniques such as principal component analysis (PCA). The results show an improvement, reducing the number of dimensions by up to 35%. The results were further evaluated by training a classifier to validate that the selected features do not produce an underfitted or overfitted model.
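To make the idea of ontology-driven feature reduction concrete, the following is a minimal sketch under stated assumptions, not the measure proposed in the paper. The toy is-a hierarchy `PARENT`, the functions `path_distance`, `similarity`, and `reduce_features`, the formula 1 / (1 + distance), and the threshold of 0.3 are all hypothetical and chosen only for illustration. The sketch covers distance and a distance-derived similarity; it does not model relatedness.

```python
# Hypothetical toy "is-a" hierarchy (child -> parent); not taken from the paper.
PARENT = {
    "good": "positive",
    "great": "positive",
    "bad": "negative",
    "awful": "negative",
    "positive": "sentiment",
    "negative": "sentiment",
}


def ancestors(term):
    """Return the chain from term up to its root, with term first."""
    chain = [term]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def path_distance(a, b):
    """Count is-a edges between a and b through their lowest common ancestor."""
    chain_a = ancestors(a)
    depth_in_b = {t: i for i, t in enumerate(ancestors(b))}
    for i, t in enumerate(chain_a):
        if t in depth_in_b:
            return i + depth_in_b[t]
    return None  # the two terms share no ancestor


def similarity(a, b):
    """Assumed similarity: 1 / (1 + distance), or 0.0 when unrelated."""
    d = path_distance(a, b)
    return 0.0 if d is None else 1.0 / (1.0 + d)


def reduce_features(terms, threshold=0.3):
    """Greedily keep a term only if no already kept term is this similar."""
    kept = []
    for t in terms:
        if all(similarity(t, k) < threshold for k in kept):
            kept.append(t)
    return kept


if __name__ == "__main__":
    print(reduce_features(["good", "great", "bad", "awful"]))
```

In this toy setting, terms that sit close together in the hierarchy collapse into a single representative feature, which is the general intuition behind using an ontology to shrink a text feature space. A real pipeline would substitute a full lexical ontology and the combined measure described above.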
Siddiqui, S., Rehman, M. A., Muhammad Doudpota, S., & Waqas, A. (2019). Ontology Driven Feature Engineering for Opinion Mining. IEEE Access, 7, 67392–67401. https://doi.org/10.1109/ACCESS.2019.2918584