Comparative analysis of different data representations for the task of chemical compound extraction

Abstract

Chemical compound extraction refers to the task of recognizing mentions of chemical entities such as oxygen, nitrogen, and others. The majority of studies that have addressed this task used machine-learning techniques. The key challenge in using machine-learning techniques lies in employing a robust set of features. The literature shows that numerous types of features have been used for chemical compound extraction. The dimensionality of these features is determined by the data representation. Some researchers have used an N-gram representation for biomedical named entity recognition, where the most significant terms are represented as features. Others have used a detailed-attribute representation, in which the features are generalized. As a result, identifying the best combination of features to yield high-accuracy classification becomes challenging. This paper applies the Wrapper Subset Selection approach using two data representations: N-gram and detailed-attribute. Because each data representation suits a particular classification algorithm, two classifiers were used: Naïve Bayes (for detailed-attribute) and Support Vector Machine (for N-gram). The results show that feature selection using the detailed-attribute representation outperformed feature selection using the N-gram representation, achieving an F-measure of 0.722. In addition to the higher classification accuracy, the features selected using the detailed-attribute representation are more meaningful and can be applied to other datasets.
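To make the idea of wrapper subset selection concrete, the following is a minimal sketch, not the authors' implementation. It assumes a greedy forward search, a small hand-written Naïve Bayes over boolean attributes, and leave-one-out accuracy as the score (the paper reports F-measure). The attribute names, the toy tokens, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def detailed_attributes(token):
    """Hypothetical generalized attributes; the paper's real attribute set is not listed here."""
    return {
        "has_digit": any(c.isdigit() for c in token),
        "has_hyphen": "-" in token,
        "chem_suffix": token.lower().endswith(("ide", "ine", "ol", "gen")),
        "is_capitalized": token[:1].isupper(),
    }


def nb_predict(train_rows, train_labels, row, features):
    """Naive Bayes over boolean features with add-one smoothing."""
    label_counts = Counter(train_labels)
    best_label, best_score = None, -math.inf
    for label, count in sorted(label_counts.items()):
        score = math.log(count / len(train_labels))  # log prior
        for f in features:
            matches = sum(
                1 for r, y in zip(train_rows, train_labels)
                if y == label and r[f] == row[f]
            )
            score += math.log((matches + 1) / (count + 2))  # smoothed likelihood
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def loo_accuracy(rows, labels, features):
    """Leave-one-out accuracy of the classifier restricted to `features`."""
    correct = 0
    for i in range(len(rows)):
        train_rows = rows[:i] + rows[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        if nb_predict(train_rows, train_labels, rows[i], features) == labels[i]:
            correct += 1
    return correct / len(rows)


def wrapper_forward_selection(rows, labels, candidates):
    """Greedily add the feature that most improves the wrapped classifier's score."""
    selected = []
    best = loo_accuracy(rows, labels, selected)  # baseline: class prior only
    while True:
        scored = [
            (loo_accuracy(rows, labels, selected + [f]), f)
            for f in candidates if f not in selected
        ]
        if not scored:
            break
        score, feature = max(scored, key=lambda pair: pair[0])
        if score <= best:  # stop when no candidate improves the score
            break
        selected.append(feature)
        best = score
    return selected, best


# Toy, hand-labelled tokens (illustrative only, not from the paper's dataset).
examples = [
    ("oxygen", "CHEM"), ("nitrogen", "CHEM"), ("2-propanol", "CHEM"),
    ("chloride", "CHEM"), ("The", "O"), ("patient", "O"),
    ("received", "O"), ("Results", "O"),
]
rows = [detailed_attributes(token) for token, _ in examples]
labels = [label for _, label in examples]

selected, best = wrapper_forward_selection(rows, labels, list(rows[0]))
print("selected features:", selected)
print("leave-one-out accuracy:", best)
```

The defining property of a wrapper method is visible in `wrapper_forward_selection`: each candidate subset is scored by training and evaluating the classifier itself, rather than by a classifier-independent statistic. Swapping in a different classifier or a different search strategy changes only `nb_predict` or the loop, respectively.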

Cite

Alshaikhdeeb, B., & Ahmad, K. (2018). Comparative analysis of different data representations for the task of chemical compound extraction. International Journal on Advanced Science, Engineering and Information Technology, 8(5), 2189–2195. https://doi.org/10.18517/ijaseit.8.5.6432
