An evaluation of machine learning and latent semantic analysis in text sentiment classification

Justyna Miazga; Tomasz Hachaj

Journal article (open access)

Technical Transactions (2020) 1-11

DOI: 10.37705/techtrans/e2020030

Abstract

In this paper, we compare the following machine learning methods as classifiers for sentiment analysis: k-nearest neighbours (kNN), artificial neural network (ANN), support vector machine (SVM), and random forest. We used a dataset of 5,000 movie reviews, of which 2,500 were labelled positive and 2,500 negative. We chose 5,189 words that influence sentence sentiment. The dataset was prepared using a term-document matrix (TDM) and classical multidimensional scaling (MDS). This is the first time that TDM and MDS have been used to choose the characteristics of text in sentiment analysis. We also examined different parameters of each classifier, such as the kernel type for SVM and the neighbour count for kNN. All calculations were performed in the R language, in the program RStudio v 3.5.2. Our work can be reproduced because all of our data sets and source code are public.
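
The pipeline outlined in the abstract (term-document matrix, then classical MDS, then a classifier) can be illustrated with a minimal sketch. This is not the authors' code: the paper's calculations were done in R, while the sketch below uses Python with NumPy. The four toy reviews, the four-word vocabulary, the function names, and the choice of leave-one-out evaluation with 1-nearest-neighbour are all assumptions made purely for illustration.

```python
import numpy as np

def term_document_matrix(docs, vocabulary):
    """Rows are terms, columns are documents; entries are raw term counts."""
    index = {term: i for i, term in enumerate(vocabulary)}
    tdm = np.zeros((len(vocabulary), len(docs)))
    for j, doc in enumerate(docs):
        for token in doc.lower().split():
            if token in index:
                tdm[index[token], j] += 1
    return tdm

def classical_mds(points, n_components):
    """Embed the rows of `points` with classical (Torgerson) MDS."""
    # Squared Euclidean distances between all pairs of rows.
    sq_norms = (points ** 2).sum(axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2 * points @ points.T
    # Double centering turns squared distances into a Gram matrix.
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ d2 @ centering
    # Keep the eigenvectors with the largest eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_components]
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)

def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k nearest training points."""
    distances = np.linalg.norm(train_x - query, axis=1)
    nearest = np.argsort(distances)[:k]
    return int(np.bincount(train_y[nearest]).argmax())

def leave_one_out_accuracy(features, labels, k):
    """Classify each point using all the others as training data."""
    hits = 0
    for i in range(len(labels)):
        mask = np.arange(len(labels)) != i
        predicted = knn_predict(features[mask], labels[mask], features[i], k)
        hits += int(predicted == labels[i])
    return hits / len(labels)

# Hypothetical toy data: 1 = positive review, 0 = negative review.
docs = [
    "great great film",
    "good great acting",
    "bad boring plot",
    "boring bad bad film",
]
labels = np.array([1, 1, 0, 0])
vocabulary = ["great", "good", "bad", "boring"]

tdm = term_document_matrix(docs, vocabulary)
# Documents are the columns of the TDM, so transpose before embedding.
embedding = classical_mds(tdm.T, n_components=2)
accuracy = leave_one_out_accuracy(embedding, labels, k=1)
print(embedding.shape, accuracy)
```

Classical MDS has no built-in way to place unseen documents, so the sketch embeds all documents together and evaluates by holding one out at a time. A real experiment would need a larger vocabulary and a proper train/test protocol.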

Citation (APA)

Miazga, J., & Hachaj, T. (2020). An evaluation of machine learning and latent semantic analysis in text sentiment classification. Technical Transactions, 1–11. https://doi.org/10.37705/techtrans/e2020030
