Abstract
Facebook users generate a vast amount of data, including posts, comments, and replies, in formats ranging from short and long text to structured, semi-structured, and unstructured content. Consequently, extracting important information from social media data is a significant challenge, particularly for low-resource languages such as Afaan Oromo, Amharic, and Tigrigna. Topic modeling algorithms identify and categorize topics within a set of documents based on their semantic similarity, which helps readers gain insight from those documents. This study proposes using latent Dirichlet allocation, matrix factorization, probabilistic latent semantic analysis, and BERTopic to extract topics from Facebook text comments in Afaan Oromo, Amharic, and Tigrigna. The study used text comments from the Facebook pages of various individuals and organizations, including activists, politicians, athletes, media companies, and government offices. BERTopic was found to be the most effective method for identifying major topics and providing insight into user conversations and social media trends, with coherence scores of 82.74%, 87.85%, and 81.79% for Afaan Oromo, Amharic, and Tigrigna, respectively.
Defersha, N. B., Tune, K. K., & Abate, S. T. (2024). Adapting Outperformer from Topic Modeling Methods for Topic Extraction and Analysis: The Case of Afaan Oromo, Amharic, and Tigrigna Facebook Text Comments. International Journal of Advanced Computer Science and Applications, 15(3), 912–919. https://doi.org/10.14569/IJACSA.2024.0150391