Abstract
As machine learning and the Internet continue to advance, sentiment analysis, which identifies positive and negative emotions in text, has become increasingly widespread. Its applications include improving human-computer interaction, monitoring mental health, and supporting business analysis, and a large body of work has been devoted to sentiment analysis and prediction. This article contributes to that work using a dataset of 50,000 movie reviews, a natural language processing (NLP) dataset included in the spaCy library. Two feature extraction methods, Count Vectorizer and TF-IDF, are combined with three machine learning (ML) algorithms, Logistic Regression (LR), Decision Tree (DT), and Multilayer Perceptron (MLP), to predict sentiment on the IMDB review dataset using the Python programming language. The experimental results show that model accuracy differs markedly depending on the feature extraction method, and that TF-IDF feature extraction combined with the logistic regression model achieves the best accuracy (88.53%). This indicates that, among the three models tested, LR performs best for sentiment analysis.
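The comparison described in the abstract can be illustrated with a minimal sketch, assuming scikit-learn is used for both the feature extraction and the classifiers. This is not the author's code: the toy reviews, the `evaluate_all` helper, and all hyperparameters are hypothetical stand-ins, and the tiny corpus will not reproduce the reported 88.53% accuracy.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy stand-in for the 50,000 IMDB reviews (1 = positive, 0 = negative).
train_texts = [
    "a wonderful film with great acting",
    "brilliant story and a moving ending",
    "I loved every minute of this movie",
    "terrible plot and awful acting",
    "boring, slow and a waste of time",
    "I hated this dull movie",
]
train_labels = [1, 1, 1, 0, 0, 0]
test_texts = ["great movie with a wonderful story", "awful and boring film"]
test_labels = [1, 0]

# Factories so that every pipeline gets fresh, unfitted components.
feature_extractors = {
    "count": CountVectorizer,
    "tfidf": TfidfVectorizer,
}
# Hyperparameters below are assumptions chosen for a quick, repeatable run.
models = {
    "LR": lambda: LogisticRegression(max_iter=1000),
    "DT": lambda: DecisionTreeClassifier(random_state=0),
    "MLP": lambda: MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
}


def evaluate_all():
    """Fit each (feature extractor, model) pair and return test accuracies."""
    scores = {}
    for feat_name, make_vectorizer in feature_extractors.items():
        for model_name, make_model in models.items():
            pipeline = make_pipeline(make_vectorizer(), make_model())
            pipeline.fit(train_texts, train_labels)
            scores[(feat_name, model_name)] = pipeline.score(test_texts, test_labels)
    return scores


results = evaluate_all()
for (feat_name, model_name), accuracy in sorted(results.items()):
    print(f"{feat_name:>5} + {model_name:<3} accuracy = {accuracy:.2f}")
```

Wrapping each vectorizer and classifier in a single pipeline keeps the vocabulary and IDF weights fitted on the training split only, so the six accuracy figures are comparable in the way the abstract's comparison requires.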
Cite
Shen, Y. (2024). An Effective Sentimental Analysis Model Based on spaCy. Highlights in Science, Engineering and Technology, 85, 1065–1072. https://doi.org/10.54097/91axqs95