A corpus-based real-time text classification and tagging approach for social data

Abstract

With the rapid accumulation of user-generated content on social media, the reuse and integration of social data have gained increasing attention. As a result, it has become almost obsolete for software applications to collect, store, and work with their own data on local servers. Although the Application Programming Interfaces (APIs) provided by the leading social networking sites have made data acquisition and integration possible, making meaningful use of such unstructured, non-uniform, and incoherent data collections requires dedicated procedures for data summarization, understanding, and visualization. One aspect that needs particular attention is the categorization and concept tagging of data (text snippets in the form of social media posts), so that the most relevant and suitable data can be filtered for a particular audience and purpose. To this end, we propose a corpus-based approach for searching social data and then categorizing and tagging it with relevant concepts in real time. The proposed approach addresses semantic and morphological similarities, as well as the domain-specific vocabularies of query strings and tagged concepts. We demonstrate the feasibility and application of the approach in a web-based tool that searches Facebook posts and presents the search results together with a concept map for further navigation, filtering, and refinement. The tool was evaluated by performing multiple search queries and analyzing the resulting concept maps and annotated texts in terms of their precision. The approach was found to be effective in achieving its stated goal of classifying text snippets in real time.
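
To make the idea of corpus-based concept tagging more concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes a tiny hand-written concept vocabulary (`CONCEPT_VOCAB`) and a crude suffix-stripping stemmer (`crude_stem`), both hypothetical and chosen only for illustration. It covers morphological matching only; the semantic similarity and domain-specific vocabularies mentioned in the abstract would need additional resources that are not shown here.

```python
import re
from collections import Counter

# Hypothetical mini-corpus mapping concepts to vocabulary terms.
# Illustrative only; not taken from the paper.
CONCEPT_VOCAB = {
    "health": {"vaccine", "hospital", "doctor", "clinic"},
    "sports": {"match", "goal", "team", "player"},
}


def crude_stem(token: str) -> str:
    """Strip one common English suffix, then a trailing 'e'.

    A stand-in for a real stemmer, so that e.g. 'vaccines' and
    'vaccine' reduce to the same form.
    """
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            token = token[: -len(suffix)]
            break
    if token.endswith("e") and len(token) > 3:
        token = token[:-1]
    return token


def tag_snippet(text: str, vocab=CONCEPT_VOCAB) -> list[str]:
    """Return the concepts whose vocabulary matches the snippet,
    ordered by number of matching tokens (most matches first)."""
    tokens = [crude_stem(t) for t in re.findall(r"[a-z]+", text.lower())]
    counts = Counter()
    for concept, terms in vocab.items():
        stems = {crude_stem(term) for term in terms}
        hits = sum(1 for tok in tokens if tok in stems)
        if hits:
            counts[concept] = hits
    return [concept for concept, _ in counts.most_common()]


if __name__ == "__main__":
    print(tag_snippet("The doctors ordered new vaccines for the clinic"))
    print(tag_snippet("Our team scored two goals; players visited the hospital"))
```

Because the vocabulary terms and the snippet tokens pass through the same stemmer, inflected forms such as "doctors" or "goals" still match their concept. A real-time tool would presumably precompute the stemmed vocabulary once instead of rebuilding it for every snippet.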

Cite

Memon, A. B., Sootahar, D. K., Luhana, K. K., & Meyer, K. (2024). A corpus-based real-time text classification and tagging approach for social data. Frontiers in Computer Science, 6. https://doi.org/10.3389/fcomp.2024.1294985
