Probabilistic Estimation of Network Size and Diameter
- ISBN: 9781424446780
- DOI: 10.1109/LADC.2009.19
Abstract
Determining the size of a network and its diameter are important functions in distributed systems, as there are a number of algorithms which rely on such parameters, or at least on estimates of those values. The Extrema Propagation technique allows the estimation of the size of a network in a fast, distributed and fault-tolerant manner. The technique was previously studied in a simulation setting where rounds advance synchronously and where there is no message loss. This work presents two main contributions. The first is the study of the Extrema Propagation technique under asynchronous rounds and integrated in the Network Friendly Epidemic Multicast (NeEM) framework. The second is the evaluation of a diameter estimation technique associated with Extrema Propagation. This study also presents a small enhancement to Extrema Propagation in terms of communication cost and points out some other possible enhancements. Results show that there is a clear trade-off between time and communication that must be considered when configuring the protocol: a faster convergence time implies a higher communication cost. Results also show that it is possible to reduce the total communication cost by more than 18% using a simple approach. The diameter estimation technique is shown to have a relative error of less than 10% even when using a small sample of nodes.
Jorge C. S. Cardoso
E.Artes / CITAR
Universidade Católica Portuguesa (UCP)
Porto, Portugal
Email: jorgecardoso@ieee.org
Carlos Baquero and Paulo Sérgio Almeida
DI/CCTC
Universidade do Minho
Braga, Portugal
Email: {cbm, psa}@di.uminho.pt
Keywords: Aggregation; Network Size Estimation; Network
Diameter Estimation; Probabilistic Estimation
I. INTRODUCTION
Determining the size of a network is an important function
in a distributed system. A number of algorithms rely on an
estimate of the network size, or would at least benefit from
one. For example, distributed hash tables can use an estimate
of the network size to adjust the size of the routing table
that each node keeps, and gossip-based protocols [1], [2] can
use it to better adjust the gossiping fanout parameter.
Although important, determining the size of a network in
a distributed manner is not trivial. Algorithms that do this
should be fast, to cope with high churn; fault-tolerant, to
cope with link and node failures; and use a small number of
messages, so as not to impose a high overhead on network
bandwidth.
Another important network parameter is its diameter: the
maximum shortest-path length between any pair of nodes.
Knowing the diameter, or at least having an estimate of its
value, is important for configuring, for example, the
time-to-live field in many protocols.
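To make the definition concrete, the following minimal Python sketch computes the exact diameter of a small connected graph by running one breadth-first search per node. It is a centralized illustration only, not the distributed estimation technique studied in this paper; the adjacency-dictionary representation, the function names and the example ring are assumptions made for this sketch.

```python
from collections import deque


def eccentricity(graph, source):
    """Largest hop distance from source to any reachable node, found by BFS."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return max(dist.values())


def diameter(graph):
    """Exact diameter of a connected, undirected graph (adjacency dict)."""
    return max(eccentricity(graph, node) for node in graph)


# Hypothetical example: a ring of 5 nodes, each linked to its two neighbours.
ring = {i: [(i - 1) % 5, (i + 1) % 5] for i in range(5)}
ring_diameter = diameter(ring)
```

A time-to-live at least as large as this value lets a message reach every node along a shortest path, which is why an estimate of the diameter is useful when the topology is not known in advance.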
Previous work [3], [4] introduced the Extrema Propagation
technique, which allows the estimation of the size of a
network. This technique is fast, because it produces estimates
in a number of steps close to the theoretical minimum;
completely distributed, because every node determines the
estimate by itself; does not require global identifiers; and
tolerates message loss. It can also be adapted to produce an
estimate of the network diameter.
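This introduction does not spell out how the technique works. As a rough illustration, the Python sketch below follows the commonly published description of Extrema Propagation: each node draws a vector of K exponentially distributed random numbers, nodes repeatedly exchange vectors with their neighbours and keep the pointwise minimum, and each node then estimates the size as (K - 1) divided by the sum of its vector. The synchronous, loss-free round loop, the function name `estimate_size` and all parameter values are assumptions chosen for the example, not the configuration evaluated in this paper.

```python
import random


def estimate_size(graph, k=100, rounds=10, seed=0):
    """Simulate synchronous, loss-free rounds; return each node's size estimate."""
    rng = random.Random(seed)
    # Each node draws k independent samples from an exponential
    # distribution with rate 1.
    vectors = {n: [rng.expovariate(1.0) for _ in range(k)] for n in graph}
    for _ in range(rounds):
        updated = {}
        for node in graph:
            vec = vectors[node]
            # Keep the pointwise minimum of the node's own vector and the
            # vectors its neighbours held at the start of the round.
            for neighbour in graph[node]:
                vec = [min(a, b) for a, b in zip(vec, vectors[neighbour])]
            updated[node] = vec
        vectors = updated
    # Once every node holds the global minima, each entry is exponential
    # with rate n, so (k - 1) / sum(vector) estimates n.
    return {n: (k - 1) / sum(vec) for n, vec in vectors.items()}


# Hypothetical example: a ring of 50 nodes, whose diameter is 25 hops.
ring = {i: [(i - 1) % 50, (i + 1) % 50] for i in range(50)}
estimates = estimate_size(ring, k=400, rounds=25)
```

Because the minimum is idempotent, receiving the same vector twice or missing one message does not corrupt the state, and no node needs an identifier. The number of rounds needed for all nodes to agree in this loss-free sketch equals the diameter, which is the sense in which the step count is close to the theoretical minimum.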
However, the Extrema Propagation technique was originally
studied in a simulation setting where rounds advance
synchronously and where there is no message loss. In a real
scenario, networks and nodes are often not synchronous, some
nodes may fail, links have different latencies, and messages
may be lost in transit. The purpose of this work is to:
• study the Extrema Propagation technique under
asynchronous rounds;
• extend the technique in order to have it produce an
estimate of the network diameter;
• evaluate an optimization to the original message size of
the Extrema Propagation technique.
In order to study the technique in a more realistic setting,
we adapted and integrated it in the Network Friendly Epidemic
Multicast [5] (NeEM) framework (footnote 1), since NeEM could
also benefit from the estimates produced by Extrema Propagation.
This paper is organized as follows: Section II describes
some algorithms that perform data aggregation across a
network and compares them to the algorithm used in this paper.
Section III describes the NeEM software that was modified and
used in this work. Section IV gives an overview of the Extrema
Propagation technique for estimating the size of a network and
how it can be adapted to estimate the network's diameter.
Section V describes the general experimental procedure and the
various experiments performed. Section VI presents the results
from the experiments and, finally, Section VII concludes.
II. RELATED WORK
There are numerous algorithms for estimating the size of a
network or, more generally, for performing data aggregation
across a network.
Some algorithms more directly related to this study are
described next.
Footnote 1: NeEM is a software framework for group communication
based on gossip protocols.