Converting Text to Numerical Representation using Modified Bayesian Vectorization Technique for Multi-Class Classification

  • Sueno H

Abstract

The first step toward making text documents machine-readable is vectorization, which allows machines to process textual content by transforming it into meaningful numerical representations. This study proposes a modified Bayesian vectorization technique that employs Laplace smoothing to reduce the dimensionality of the features and improve classification accuracy. A dataset of news articles was used to build the model, which was evaluated on precision, recall, F1-score, and accuracy. To validate the effectiveness of the enhancement, the model was compared with the Term Frequency-Inverse Document Frequency (TF-IDF) method. The proposed enhancement performed markedly better, achieving 98% classification accuracy compared with 81% for the TF-IDF vectorization technique.
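The abstract does not spell out the formulation, so the following is only a minimal sketch of one way a Bayesian vectorization with Laplace smoothing could work, not the author's method. It assumes each document is mapped to one log-likelihood score per class, which would shrink the feature vector from vocabulary size to the number of classes. The function names (`fit_word_likelihoods`, `vectorize`), the smoothing parameter `alpha`, and the toy data are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def fit_word_likelihoods(docs, labels, alpha=1.0):
    """Estimate Laplace-smoothed P(word | class) from tokenized documents."""
    counts = defaultdict(Counter)  # class label -> word counts
    for tokens, label in zip(docs, labels):
        counts[label].update(tokens)

    classes = sorted(counts)
    vocab = sorted({w for c in classes for w in counts[c]})

    probs = {}
    for c in classes:
        total = sum(counts[c].values())
        # Laplace smoothing: add alpha to every count so that words
        # unseen in a class still get a non-zero probability.
        denom = total + alpha * len(vocab)
        probs[c] = {w: (counts[c][w] + alpha) / denom for w in vocab}
    return classes, vocab, probs


def vectorize(tokens, classes, probs):
    """Map a document to one summed log-likelihood score per class."""
    vec = []
    for c in classes:
        # Assumption: words outside the training vocabulary are skipped.
        vec.append(sum(math.log(probs[c][w]) for w in tokens if w in probs[c]))
    return vec


# Toy data, invented for illustration only.
docs = [
    ["team", "wins", "match"],
    ["market", "stocks", "fall"],
    ["team", "loses", "match"],
]
labels = ["sports", "business", "sports"]

classes, vocab, probs = fit_word_likelihoods(docs, labels)
vec = vectorize(["team", "match"], classes, probs)
print(classes, vec)
```

Under this assumption the resulting vector has one entry per class regardless of vocabulary size, in contrast to a TF-IDF vector, which has one entry per vocabulary term. The vectors could then be passed to any downstream classifier.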

Cite

Sueno, H. T. (2020). Converting Text to Numerical Representation using Modified Bayesian Vectorization Technique for Multi-Class Classification. International Journal of Advanced Trends in Computer Science and Engineering, 9(4), 5618–5623. https://doi.org/10.30534/ijatcse/2020/211942020
