Comparing writing style feature-based classification methods for estimating user reputations in social media

Jong Hwan Suh

Journal Article (Open Access)

SpringerPlus (2016) 5(1)

DOI: 10.1186/s40064-016-1841-1

Abstract

In recent years, the anonymous nature of the Internet has made it difficult to detect manipulated user reputations in social media and to ensure the quality of users and their posts. To address this, this study designs and examines an automatic approach that uses writing style features to estimate user reputations in social media. Under varying ways of defining Good and Bad classes of user reputations based on the collected data, it evaluates the classification performance of state-of-the-art methods: four types of writing style features (lexical, syntactic, structural, and content-specific) and eight classification techniques, namely four base learners (C4.5, Neural Network (NN), Support Vector Machine (SVM), and Naïve Bayes (NB)) and four Random Subspace (RS) ensemble methods built on those base learners. With South Korea’s Web forum Daum Agora as the test bed, the experimental results show that the full feature set, which includes content-specific features, combined with RS-SVM (RS applied to SVM) gives the best classification accuracy when the test bed poster reputations are segmented strictly into Good and Bad classes by the portfolio approach. Pairwise t tests on accuracy confirm two expectations drawn from the literature review: first, the feature set that adds content-specific features outperforms the others; second, ensemble learning methods are more viable than base learners. Moreover, among the four ways of defining the classes of user reputations (like, dislike, sum, and portfolio), the portfolio approach gives the highest accuracy.
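The Random Subspace idea named in the abstract can be illustrated with a minimal sketch: each ensemble member is trained on a random subset of the features, and predictions are combined by majority vote. This is not the study's implementation. The base learner here is a simple nearest-centroid classifier standing in for C4.5, NN, SVM, or NB, and the function names, the toy feature vectors, and the parameters `n_learners` and `subspace_ratio` are all assumptions made for illustration.

```python
import random
from collections import Counter

def fit_centroids(X, y, feats):
    """Stand-in base learner (assumption): per-class mean over the chosen features."""
    sums, counts = {}, Counter(y)
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(feats))
        for i, f in enumerate(feats):
            acc[i] += row[f]
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict_centroid(centroids, feats, row):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist(c):
        return sum((row[f] - c[i]) ** 2 for i, f in enumerate(feats))
    return min(centroids, key=lambda label: dist(centroids[label]))

def fit_random_subspace(X, y, n_learners=10, subspace_ratio=0.5, seed=0):
    """Train each ensemble member on its own random subset of feature indices."""
    rng = random.Random(seed)
    n_features = len(X[0])
    k = max(1, int(n_features * subspace_ratio))
    ensemble = []
    for _ in range(n_learners):
        feats = sorted(rng.sample(range(n_features), k))
        ensemble.append((feats, fit_centroids(X, y, feats)))
    return ensemble

def predict_random_subspace(ensemble, row):
    """Combine the members' predictions by majority vote."""
    votes = Counter(predict_centroid(c, f, row) for f, c in ensemble)
    return votes.most_common(1)[0][0]

# Toy data (made up): four hypothetical writing-style feature values per post.
X = [[0.9, 0.8, 0.7, 0.9], [0.8, 0.9, 0.9, 0.7],
     [0.1, 0.2, 0.3, 0.1], [0.2, 0.1, 0.1, 0.3]]
y = ["Good", "Good", "Bad", "Bad"]

model = fit_random_subspace(X, y, n_learners=11)
prediction = predict_random_subspace(model, [0.85, 0.9, 0.8, 0.8])
print(prediction)
```

An odd number of learners avoids tied votes in the two-class case; swapping in a different base learner only requires replacing the two centroid functions.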

Suh, J. H. (2016). Comparing writing style feature-based classification methods for estimating user reputations in social media. SpringerPlus, 5(1). https://doi.org/10.1186/s40064-016-1841-1
