A unified network coordinate system for bandwidth and latency

by Venugopalan Ramasubramanian, Dahlia Malkhi, Fabian Kuhn, I Abraham, M Balakrishnan, A Gupta, A Akella
research.microsoft.com (2008)

Available from research.microsoft.com

A Unified Network “Coordinate” System for Bandwidth and Latency
Venugopalan Ramasubramanian‡ Dahlia Malkhi‡ Fabian Kuhn†
Ittai Abraham‡ Mahesh Balakrishnan‡ Archit Gupta∗ and Aditya Akella∗
∗University of Wisconsin-Madison, Madison, WI 53706
†ETH, Zurich, Switzerland
‡Microsoft Research Silicon Valley, Mountain View, CA 94043
Abstract
Network coordinate systems, such as GNP and Vivaldi,
provide virtual positions for networked hosts, which en-
able the hosts to connect to nearby peers, find the closest
server, or organize themselves in a topologically-aware
manner. Current network coordinate systems, however,
only use latency to compute the positions, leaving out an
important network metric—namely bandwidth.
In this paper, we present a unified approach that pro-
vides virtual positions based on both bandwidth and la-
tency. The key intuition is that network latency and
bandwidth are approximate tree metrics, that is, a set of
distances that can be embedded in a tree. We first ar-
gue based on intuition and analysis of three real-world
datasets why bandwidth and latency can be represented
as tree metrics. Then, we present Sequoia, an accurate
and light-weight system that provides virtual network po-
sitions by embedding bandwidth or latency on trees; the
network positions computed by Sequoia are as easy to
use as a set of coordinates. Finally, we present an evalua-
tion based on the three datasets showing that: 1) Sequoia
represents latency as accurately as Vivaldi in addition to
being the first “coordinate” system for bandwidth; 2) it
enables selection of the closest and the most-provisioned
(highest bandwidth) server with low error and overhead;
and 3) it computes topologically-aware trees, which can
be used to organize a networked system efficiently.
1 Introduction
Latency-aware network applications are pervasive these
days: Web-based services and content distribution net-
works (CDNs) often redirect client requests to the closest
server while peer-to-peer systems and distributed hash
tables (DHTs) prefer to select neighbors based on net-
work proximity. Naturally, several systems have been
designed and built to provide latency-centric functional-
ities. Systems such as IDMaps [8], Meridian [25], and
Oasis [9] provide the capability for discovering closest-
servers efficiently. Other systems such as GNP [16], Vi-
valdi [6], and PIC [5] provide a convenient set of coor-
dinates for each host that can then be used to estimate
latency and select proximal peers.
However, a different network property, namely
bandwidth, has emerged as a crucial performance
factor. With the growing popularity of online media
streaming, podcasts, and movie/video downloads,
clients of web-based multi-media services and hosts of
peer-to-peer CDNs have a new need to select servers
based on bandwidth in addition to latency. Unfortunately,
current systems that compute network coordinates based on
latency fail for bandwidth, while the suitability of
other systems for bandwidth-based server selection has
not yet been explored.
This paper, for the first time, presents an intuitive
model for network bandwidth. The proposed model is
based on the observation that under certain typical cir-
cumstances, bandwidth is a tree metric; that is, the set
of bandwidth measures can be exactly embedded as dis-
tances on a tree. For instance, bandwidth is a tree met-
ric when it primarily depends on the last-mile, access
links. Claims from prior measurement studies supporting
the prevalence of this case, together with an analysis of a
real-world bandwidth dataset presented in this paper,
indicate that Internet bandwidth is indeed an approximate
tree metric.
Fortuitously, Internet latency also turns out to be an
approximate tree metric. While this revelation may not
be surprising since the Internet is quite hierarchical, our
analysis of two real-world latency datasets confirms that
end-to-end latencies closely follow the hierarchy.
The outcome of the above observations is an elegant
and unified model to represent both latency and band-
width. This paper explores the model of embedding net-
work latency and bandwidth onto trees. More concretely,
it presents the notion of prediction trees, in which end hosts
sit at the leaf level and are connected via a network of virtual
inner nodes, with carefully assigned link weights modeling latency
or bandwidth. Note that systems that reconstruct the
internal topology of the Internet already exist [15, 27].
However, unlike these systems that try to locate individ-
ual gateways and routers through expensive and intrusive
measurement probes, our approach strives to build a “vir-
tual” model (where internal nodes represent fake routers)
through light-weight, end-to-end mechanisms.
Prediction trees provide key intrinsic advantages:
First, predicting latency or bandwidth between two
hosts in a tree requires only a simple computation (just like
coordinate-based network positioning systems) when the
paths to a common ancestor are known. Thus, the
paths to a common root host can serve as “coordi-
nates” for each host. Second, finding the closest or
best-provisioned server can be accomplished efficiently
through a simple search along the path from the root to a
leaf. Finally, a reasonably-balanced tree is a highly scal-
able structure where operations (such as search or join)
can be accomplished with low overhead.
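
As a minimal sketch of the first advantage, the following Python snippet predicts the distance between two hosts from their root paths alone. The path representation (a list of node identifiers with cumulative distances from the root), the function name `predict_distance`, and the example tree are assumptions made for illustration; they are not taken from the Sequoia implementation. The sketch covers additive, latency-like distances only and does not address how bandwidth values would be mapped onto tree distances.

```python
from typing import List, Tuple

# Assumed "coordinate" format: the host's path from the root, as a list of
# (node_id, distance_from_root) pairs, root first and the host itself last.
Path = List[Tuple[str, float]]


def predict_distance(path_u: Path, path_v: Path) -> float:
    """Tree distance between two hosts, computed from their root paths only."""
    if not path_u or not path_v or path_u[0][0] != path_v[0][0]:
        raise ValueError("paths must be non-empty and share the same root")
    # Walk down from the root while both paths visit the same node; the last
    # shared node is the least common ancestor of the two hosts.
    lca_depth = 0.0
    for (node_u, depth_u), (node_v, _) in zip(path_u, path_v):
        if node_u != node_v:
            break
        lca_depth = depth_u
    # Each host's distance to the common ancestor is its own depth minus the
    # ancestor's depth; the prediction is the sum of the two.
    return path_u[-1][1] + path_v[-1][1] - 2.0 * lca_depth


# Hypothetical tree: root "r", virtual inner node "a";
# hosts h1 and h2 hang below "a", host h3 hangs directly below "r".
h1 = [("r", 0.0), ("a", 10.0), ("h1", 12.0)]
h2 = [("r", 0.0), ("a", 10.0), ("h2", 15.0)]
h3 = [("r", 0.0), ("h3", 20.0)]

print(predict_distance(h1, h2))  # hosts meet at "a"
print(predict_distance(h1, h3))  # hosts meet only at the root
```

No measurement is needed at prediction time: two hosts only have to exchange their paths, much as they would exchange coordinates in a coordinate-based system.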
This paper presents a system called Sequoia that
provides a variety of network-centric functionalities
based on the above-described notion of prediction
trees. Sequoia maintains a collection of virtual trees
between the participating hosts and provides effi-
cient latency/bandwidth prediction, server selection, and
topology-aware clustering through easily-decentralized,
light-weight mechanisms. It is resilient to violation of
the triangle inequality condition in network measures,
tolerates non-availability of some measurements, and
shows good scalability with increasing number of hosts.
We envision that Sequoia would serve the needs of
several networked systems and applications. First, path
quality prediction is useful for neighbor selection in Dis-
tributed Hash Tables and peer-to-peer file sharing ser-
vices (e.g., BitTorrent). Second, web services can use
Sequoia to efficiently redirect clients to the closest server
(like Meridian does) or to a well-provisioned server. Fi-
nally, hierarchical networked systems, such as overlay
multicast systems [4, 28, 22] and network monitoring
systems [19, 26] can leverage the intrinsic topology-
aware hierarchy built by Sequoia to organize themselves.
Overall, this paper makes the following major contri-
butions: First, it presents an elegant approach for repre-
senting network measures as tree metrics, providing intu-
itive and analytical reasons to argue that at least two im-
portant measures—latency and bandwidth—are approx-
imate tree metrics. Second, it outlines the design and
implementation of the Sequoia system, which constructs
prediction trees for latency and bandwidth in a cost-
efficient yet accurate manner. Finally, it demonstrates
the new abilities that Sequoia provides for bandwidth-
based server selection and topology-aware hierarchical
clustering while highlighting Sequoia’s ability to match
the state-of-the-art in latency-based network positioning,
all through an extensive evaluation driven by real-world
datasets.
The rest of the paper has the following organization:
Section 2 provides some background about tree metrics
and makes a case for embedding network measures on
trees. Section 3 then describes the Sequoia system in de-
tail while Section 4 evaluates Sequoia. Finally, we dis-
cuss suitable applications and related work in Sections 5
and 6 and conclude in Section 7.
2 Background and Intuition
The key intuition behind this work is that Internet path
measures such as bandwidth and latency are approxi-
mate tree metrics. In this section, we provide some back-
ground about tree metrics and present intuitive and ana-
lytical arguments to back up this intuition.
2.1 Tree Metrics
Consider a set D of pair-wise measurements of some net-
work path property, say latency or bandwidth, between
a set V of networked hosts. This set of measures D is
a tree metric if there exists a tree T with non-negative
weights such that V ⊆ T and d_V(u, v) = d_T(u, v) for
all u, v ∈ V, where d(u, v) represents the pair-wise path
property. In other words, a set of measures is a tree met-
ric if it can be derived from distances on a tree, that is,
embedded on a tree. Note that in the above definition,
the tree T may have additional nodes not present in the
set V.
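
To make the definition concrete, the sketch below derives a set of pair-wise measures from a small weighted tree. The tree, its weights, and the names `tree_distances`, `edges`, and `hosts` are hypothetical and chosen only for illustration. The nodes `p` and `q` play the role of additional tree nodes that are not members of V.

```python
def tree_distances(edges, hosts):
    """Pair-wise path lengths between hosts in a weighted tree.

    edges: iterable of (node_a, node_b, non_negative_weight)
    hosts: the subset V of tree nodes whose distances we want
    """
    adj = {}
    for a, b, w in edges:
        adj.setdefault(a, []).append((b, w))
        adj.setdefault(b, []).append((a, w))

    def distances_from(src):
        # A tree has exactly one path between any two nodes, so a plain
        # traversal yields the path length to every reachable node.
        dist = {src: 0.0}
        stack = [src]
        while stack:
            node = stack.pop()
            for nxt, w in adj.get(node, []):
                if nxt not in dist:
                    dist[nxt] = dist[node] + w
                    stack.append(nxt)
        return dist

    result = {}
    for u in hosts:
        d = distances_from(u)
        for v in hosts:
            if u < v:
                result[(u, v)] = d[v]
    return result


# Hypothetical tree: hosts w, x attach to inner node p; hosts y, z attach to
# inner node q; p and q are joined by an edge of weight 3.
edges = [("w", "p", 1.0), ("x", "p", 2.0), ("p", "q", 3.0),
         ("y", "q", 4.0), ("z", "q", 5.0)]
hosts = ["w", "x", "y", "z"]

for pair, dist in sorted(tree_distances(edges, hosts).items()):
    print(pair, dist)
```

Any set of measures produced this way is a tree metric by construction; the harder, practical question is the reverse one, namely whether a tree exists for a given set of measurements.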
There is a convenient condition called the Four-Points
Condition [3] (4PC) to verify whether a set of measures
is a tree metric. The four-points condition states
that for any four hosts w, x, y, and z ordered such that
d(w, x) + d(y, z) ≤ d(w, y) + d(x, z) ≤ d(w, z) + d(x, y),
the equality d(w, y) + d(x, z) = d(w, z) + d(x, y) holds. That
is, of the three sums obtained by splitting the four hosts into
two pairs, the largest sum is equal to the second largest sum. A set
of measures is a tree metric if and only if every set of
four hosts satisfies the 4PC [3]. Figure 1 illustrates the
four-points condition graphically.
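
The condition translates directly into a brute-force check over all sets of four hosts. The following Python sketch is illustrative only: the function names, the tolerance parameter `tol`, and both example distance tables are assumptions, not material from the paper. The gap it reports (largest sum minus second largest sum) is zero for an exact tree metric, and its size gives one rough way to see how far a set of measures is from being one.

```python
from itertools import combinations


def four_point_gap(d, w, x, y, z):
    """Largest pair-sum minus second largest; zero means the 4PC holds."""
    sums = sorted([
        d(w, x) + d(y, z),
        d(w, y) + d(x, z),
        d(w, z) + d(x, y),
    ])
    return sums[2] - sums[1]


def is_tree_metric(hosts, d, tol=1e-9):
    """True if every set of four hosts satisfies the 4PC within tol."""
    return all(four_point_gap(d, *quad) <= tol
               for quad in combinations(hosts, 4))


def make_lookup(table):
    """Wrap a symmetric distance table keyed by unordered host pairs."""
    def d(u, v):
        return 0.0 if u == v else table[frozenset((u, v))]
    return d


hosts = ["w", "x", "y", "z"]

# Hypothetical measures that can be embedded on a tree.
tree_like = {
    frozenset(("w", "x")): 3.0, frozenset(("w", "y")): 8.0,
    frozenset(("w", "z")): 9.0, frozenset(("x", "y")): 9.0,
    frozenset(("x", "z")): 10.0, frozenset(("y", "z")): 9.0,
}

# Same measures with one distance inflated, which breaks the condition.
skewed = dict(tree_like)
skewed[frozenset(("w", "z"))] = 12.0

print(is_tree_metric(hosts, make_lookup(tree_like)))
print(is_tree_metric(hosts, make_lookup(skewed)))
print(four_point_gap(make_lookup(skewed), "w", "x", "y", "z"))
```

The check enumerates every quadruple, so its cost grows with the fourth power of the number of hosts; for large datasets, sampling quadruples would be a natural substitute.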
Another form of tree metric that is useful for modeling
network path properties is the ultra metric. An ultra met-
ric embeds network hosts into a hierarchically-separated
tree, where any pair of hosts with the same least common
ancestor also has the same distance. By least common
ancestor, we mean the closest common ancestor in the
tree. Note that, once again, the tree might include addi-
tional hosts not present in the set of measures. An ultra
metric is a stricter form of tree metric in that every ultra
metric is a tree metric, while the converse is not always
true.
Similar to tree metrics, there is a convenient condi-
tion to verify whether a set of measures is an ultra met-
ric. This condition, called the Three-Points Condition
(3PC) [3], states that for any three hosts x, y, and z or-
dered such that d(y, z) ≤ d(x, z) ≤ d(x, y), d(x, y) =
