Text mining is an interdisciplinary field that draws on information retrieval, data mining, machine learning, statistics and computational linguistics. Text mining analysis is more complicated than data mining because it deals with unstructured and fuzzy data. In addition, support for generating datasets from text documents is still not available. Therefore, in this study we propose a model and a tool called Dataset Generator Based on Malay Stemmer Algorithm (DGMS), and evaluate it on news articles from the Malaysian National News Agency (Bernama). The results show that the DGMS tool can be used to extract features and generate the desired dataset.
CITATION STYLE
Abdullah, Z., Mohamad, S. Z., Zulkifli, N. S., Herawan, T., & Hamdan, A. R. (2019). DGMS: Dataset Generator Based on Malay Stemmer Algorithm. In Lecture Notes in Electrical Engineering (Vol. 520, pp. 51–60). Springer Verlag. https://doi.org/10.1007/978-981-13-1799-6_6