Abstract
This study provides preliminary insights into the linguistic features that contribute to Internet censorship in mainland China. We collected a corpus of 344 censored and uncensored microblog posts published on Sina Weibo and built a Naive Bayes classifier based on linguistic, topic-independent features. The classifier achieves 79.34% accuracy in predicting whether a blog post would be censored on Sina Weibo.
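As a rough illustration of the kind of classifier the abstract describes, the sketch below fits a Gaussian Naive Bayes model to numeric feature vectors. The abstract does not say which Naive Bayes variant or which features were used, so the Gaussian form, the two hypothetical features, and the toy data are all assumptions made for illustration; none of them come from the study.

```python
import math
from collections import defaultdict


def train_gaussian_nb(X, y, var_floor=1e-6):
    """Fit a log prior plus per-feature mean and variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)

    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor keeps the variance strictly positive for constant features.
        variances = [
            sum((v - m) ** 2 for v in col) / n + var_floor
            for col, m in zip(columns, means)
        ]
        model[label] = (math.log(n / len(X)), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one feature vector."""
    best_label, best_score = None, -math.inf
    for label, (log_prior, means, variances) in model.items():
        score = log_prior
        for v, m, var in zip(row, means, variances):
            # Log of the Gaussian density, summed over independent features.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy data (invented): two hypothetical topic-independent features per post,
# e.g. a sentiment score and a readability score, both scaled to [0, 1].
X = [[0.9, 0.2], [0.8, 0.3], [0.7, 0.1], [0.2, 0.8], [0.1, 0.9], [0.3, 0.7]]
y = ["censored"] * 3 + ["uncensored"] * 3

model = train_gaussian_nb(X, y)
```

Calling `predict(model, [0.85, 0.2])` scores a new feature vector against both classes and returns the more likely label. With a corpus as small as the one reported here, accuracy would normally be estimated with cross-validation rather than a single train/test split.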
Cite
Ng, K. Y., Feldman, A., & Leberknight, C. (2018). Detecting censorable content on Sina Weibo: A pilot study. In ACM International Conference Proceeding Series. Association for Computing Machinery. https://doi.org/10.1145/3200947.3201037