Towards unbiased sampling of online social networks

Abstract

Unbiased sampling of online social networks (OSNs) makes it possible to estimate the statistical properties of large-scale OSNs accurately. However, the most widely used sampling methods, Breadth-First Search (BFS) and Greedy, are known to be biased towards high-degree nodes, yielding inaccurate statistical results. To give a general requirement for unbiased sampling, we model the crawling process as a Markov chain and deduce a necessary and sufficient condition, which enables us to design various efficient unbiased sampling methods. To the best of our knowledge, we are among the first to give such a condition. Metropolis-Hastings Random Walk (MHRW) is one example that satisfies the condition. However, walkers in MHRW may stay at some low-degree nodes for a long time, resulting in considerable self-loops at these nodes, which adversely affect crawling efficiency. Based on the condition, we propose a new unbiased sampling method, called USRS, that reduces the probability of self-loops. We use a dataset from Renren, the largest OSN in China, to evaluate the performance of USRS. The results demonstrate that USRS generates unbiased samples with low self-loop probabilities and achieves higher crawling efficiency. © 2011 IEEE.
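
To make the self-loop issue in the abstract concrete, the following is a minimal sketch of the textbook Metropolis-Hastings random walk with a uniform target distribution. It is not the authors' code and it does not implement USRS, whose transition rule the abstract does not describe. The graph `toy_graph`, the function name `mhrw`, the step count, and the seed are all assumptions chosen for illustration.

```python
import random
from collections import Counter


def mhrw(graph, start, steps, rng):
    """Textbook Metropolis-Hastings random walk with a uniform target.

    graph: dict mapping each node to a list of its neighbours
           (assumed undirected and connected).
    Returns the list of visited nodes and the number of rejected
    proposals, i.e. steps where the walker stayed put (self-loops).
    """
    current = start
    samples = []
    self_loops = 0
    for _ in range(steps):
        # Propose a neighbour uniformly at random.
        candidate = rng.choice(graph[current])
        # Accept with probability min(1, deg(current) / deg(candidate)).
        ratio = len(graph[current]) / len(graph[candidate])
        if rng.random() < ratio:
            current = candidate
        else:
            self_loops += 1  # rejected: the walker stays at `current`
        samples.append(current)
    return samples, self_loops


# Hypothetical toy graph: one hub plus four low-degree nodes.
toy_graph = {
    "hub": ["a", "b", "c", "d"],
    "a": ["hub", "b"],
    "b": ["hub", "a"],
    "c": ["hub"],
    "d": ["hub"],
}

rng = random.Random(0)  # fixed seed so repeated runs agree
samples, self_loops = mhrw(toy_graph, "hub", 50_000, rng)

counts = Counter(samples)
frequencies = {node: counts[node] / len(samples) for node in toy_graph}
self_loop_fraction = self_loops / len(samples)

for node in sorted(frequencies):
    print(node, round(frequencies[node], 3))
print("self-loop fraction:", round(self_loop_fraction, 3))
```

On this toy graph the visit frequencies should all come out close to 1/5 even though the hub has four times the degree of the leaves, which is the unbiasedness property. The price is visible in the self-loop fraction: working through the acceptance probabilities by hand gives roughly 0.4 at stationarity, with most rejections occurring at the degree-1 nodes `c` and `d`. This is the inefficiency that the abstract says USRS is designed to reduce.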

Cite

Wang, D., Li, Z., & Xie, G. (2011). Towards unbiased sampling of online social networks. In IEEE International Conference on Communications. https://doi.org/10.1109/icc.2011.5963203
