On Scalable and Robust Truth Discovery in Big Data Social Media Sensing Applications

Abstract

Identifying trustworthy information in the presence of noisy data contributed by numerous unvetted sources on online social media (e.g., Twitter, Facebook, and Instagram) has been a crucial task in the era of big data. This task, referred to as truth discovery, aims to identify the reliability of the sources and the truthfulness of the claims they make without knowing either a priori. In this work, we identify three important challenges that have not been well addressed in the current truth discovery literature. The first is misinformation spread, where a significant number of sources contribute to false claims, making the identification of truthful claims difficult. For example, on Twitter, rumors, scams, and influence bots are common cases in which sources collude, either intentionally or unintentionally, to spread misinformation and obscure the truth. The second challenge is data sparsity, or the long-tail phenomenon, where a majority of sources contribute only a small number of claims, providing insufficient evidence to determine those sources' trustworthiness. For example, in the Twitter datasets that we collected during real-world events, more than 90 percent of sources contributed to only a single claim. Third, many current solutions do not scale to large social sensing events because of the centralized nature of their truth discovery algorithms. In this paper, we develop a Scalable and Robust Truth Discovery (SRTD) scheme to address these three challenges. In particular, the SRTD scheme jointly quantifies both the reliability of sources and the credibility of claims using a principled approach. We further develop a distributed framework to implement the proposed truth discovery scheme using Work Queue in an HTCondor system. The evaluation results on three real-world datasets show that the SRTD scheme significantly outperforms state-of-the-art truth discovery methods in terms of both effectiveness and efficiency.
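
To make the idea of jointly estimating source reliability and claim credibility concrete, the following is a minimal sketch of a generic iterative truth discovery loop. It is not the SRTD scheme, whose details the abstract does not give, and it does not address misinformation spread, data sparsity, or scalability. The function name `iterative_truth_discovery`, the report format, the update rules, and the initial reliability of 0.5 are all assumptions chosen for illustration.

```python
from collections import defaultdict


def iterative_truth_discovery(reports, iterations=20):
    """Generic illustrative loop (hypothetical, NOT the paper's SRTD scheme).

    reports: iterable of (source, claim, vote) tuples, where vote is
    +1 if the source asserts the claim and -1 if it denies the claim.
    Returns (reliability, credibility) dictionaries.
    """
    reports = list(reports)
    sources = {s for s, _, _ in reports}
    claims = {c for _, c, _ in reports}

    # Assumption: every source starts with the same neutral reliability.
    reliability = {s: 0.5 for s in sources}
    credibility = {c: 0.0 for c in claims}

    for _ in range(iterations):
        # Step 1: claim credibility is the reliability-weighted mean vote,
        # which lies in [-1, 1].
        score = defaultdict(float)
        weight = defaultdict(float)
        for s, c, v in reports:
            score[c] += reliability[s] * v
            weight[c] += reliability[s]
        credibility = {
            c: (score[c] / weight[c] if weight[c] > 0 else 0.0) for c in claims
        }

        # Step 2: source reliability is the average agreement between the
        # source's votes and the current credibility, mapped into [0, 1].
        agree = defaultdict(float)
        count = defaultdict(int)
        for s, c, v in reports:
            agree[s] += (1 + v * credibility[c]) / 2
            count[s] += 1
        reliability = {s: agree[s] / count[s] for s in sources}

    return reliability, credibility


# Toy data: three sources support claim "x", two of them also support "y",
# and one source denies both claims.
toy_reports = [
    ("a", "x", 1), ("b", "x", 1), ("c", "x", 1), ("d", "x", -1),
    ("a", "y", 1), ("b", "y", 1), ("d", "y", -1),
]
toy_reliability, toy_credibility = iterative_truth_discovery(toy_reports)
```

On this toy input, the majority-backed claims end up with positive credibility and the dissenting source ends up with lower reliability than the others. A plain loop like this follows the majority, so it would be misled when many sources collude on a false claim, which is the first challenge the abstract raises.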

Zhang, D., Wang, D., Vance, N., Zhang, Y., & Mike, S. (2019). On Scalable and Robust Truth Discovery in Big Data Social Media Sensing Applications. IEEE Transactions on Big Data, 5(2), 195–208. https://doi.org/10.1109/TBDATA.2018.2824812