Adapting Outperformer from Topic Modeling Methods for Topic Extraction and Analysis: The Case of Afaan Oromo, Amharic, and Tigrigna Facebook Text Comments

4Citations
Citations of this article
18Readers
Mendeley users who have this article in their library.

Abstract

Facebook users generate a vast amount of data, including posts, comments, and replies, in various formats such as short text, long text, structured, unstructured, and semi structured. Consequently, obtaining import information from social media data becomes a significant challenge for low-resource languages such as Afaan Oromo, Amharic, and Tigrigna. Topic modeling algorithms are designed to identify and categorize topics within a set of documents based on their semantic similarity which helps obtain insight from documents. This study proposes latent Dirichlet allocation, matrix factorization, probabilistic latent semantic analysis, and BERTopic to extract topics from Facebook text comments in Afaan Oromo, Amharic, and Tigrigna. The study utilized text comments from the Facebook pages of various individuals, including activists, politicians, athletes, media companies, and government offices. BERTopic was found to be the most effective for identifying major topics and providing valuable insights into user conversations and social media trends with coherence scores of 82.74%, 87.85%, and 81.79% respectively.

Cite

CITATION STYLE

APA

Defersha, N. B., Tune, K. K., & Abate, S. T. (2024). Adapting Outperformer from Topic Modeling Methods for Topic Extraction and Analysis: The Case of Afaan Oromo, Amharic, and Tigrigna Facebook Text Comments. International Journal of Advanced Computer Science and Applications, 15(3), 912–919. https://doi.org/10.14569/IJACSA.2024.0150391

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free