HADI: Fast Diameter Estimation and Mining in Massive Graphs with Hadoop

  • Kang U
  • Tsourakakis C
  • Appel A

Abstract

How can we quickly find the diameter of a petabyte-sized graph? Large graphs are ubiquitous: social networks (Facebook, LinkedIn, etc.), the World Wide Web, biological networks, computer networks, and many more. The size of graphs of interest has been increasing rapidly in recent years, and with it the need for algorithms that can handle tera- and petabyte graphs. A promising direction for coping with such sizes is the emerging map/reduce architecture and its open-source implementation, HADOOP. Estimating the diameter of a graph, as well as the radius of each node, is a valuable operation that can help us spot outliers and anomalies. We propose HADI (HAdoop based DIameter estimator), a carefully designed algorithm to compute the diameters of petabyte-scale graphs. We run the algorithm on the largest public web graph ever analyzed, with billions of nodes and edges. Additional contributions include the following: (a) we propose several performance optimizations, (b) we achieve excellent scale-up, and (c) we report interesting observations, including outliers and related patterns, on this real graph (116Gb) as well as several other, smaller real graphs. One of the observations is that the Albert et al. conjecture about the diameter of …

Networked systems are ubiquitous. The analysis of networks such as the World Wide Web and social, computer, and biological networks has attracted much attention recently. Some of the typical measures to compute are …

Authors

  • U Kang

  • Charalampos Tsourakakis

  • Ana Paula Appel
