
Revealing botnet membership using DNSBL counter-intelligence

by Anirudh Ramachandran, Nick Feamster, David Dagon
Proceedings of the 2nd conference on Steps to Reducing Unwanted Traffic on the Internet, Volume 2 (2006)

Abstract

Botnets - networks of (typically compromised) machines - are often used for nefarious activities (e.g., spam, click fraud, denial-of-service attacks, etc.). Identifying members of botnets could help stem these attacks, but passively detecting botnet membership (i.e., without disrupting the operation of the botnet) proves to be difficult. This paper studies the effectiveness of monitoring lookups to a DNS-based blackhole list (DNSBL) to expose botnet membership. We perform counter-intelligence based on the insight that botmasters themselves perform DNSBL lookups to determine whether their spamming bots are blacklisted. Using heuristics to identify which DNSBL lookups are perpetrated by a botmaster performing such reconnaissance, we are able to compile a list of likely bots. This paper studies the prevalence of DNSBL reconnaissance observed at a mirror of a well-known blacklist for a 45-day period, identifies the means by which botmasters are performing reconnaissance, and suggests the possibility of using counter-intelligence to discover likely bots. We find that bots are performing reconnaissance on behalf of other bots. Based on this finding, we suggest counter-intelligence techniques that may be useful for early bot detection.



Revealing Botnet Membership Using DNSBL Counter-Intelligence
Anirudh Ramachandran, Nick Feamster and David Dagon
College of Computing, Georgia Institute of Technology
{avr, feamster, dagon}@cc.gatech.edu
1. Introduction
Internet malice has evolved from pranks conceived and executed by amateur hackers to a global business involving significant monetary gains for the perpetrators [19]. Examples include: (1) unsolicited commercial email (“spam”), which threatens to render email useless by immensely decreasing the signal-to-noise ratio of traffic [17]; (2) denial-of-service attacks, which have become common [12]; and (3) click fraud, whereby a group of attackers sends bogus “clicks” for online advertisements that mimic legitimate request patterns, swindling advertisers out of large sums of money [4].
Botnets are a root cause of these problems [8], since they allow attackers to distribute tasks over thousands of hosts spread across the Internet. A botnet is a network of compromised hosts (“bots”) connected to the Internet under the control of a single entity (“botmaster”, “controller”, or command and control) [5]. The large cumulative bandwidth and relatively untraceable nature of spam from bots make botnets an attractive choice for large-scale spamming. Previous work provides further background on botnets [5, 6].
If network operators and system administrators could reliably determine whether a host is a member of a botnet, they could take appropriate steps towards mitigating the attacks that botnets perpetrate. Although previous work has described an active detection technique using DNS hijacking and social engineering [6], there are few efficient methods to passively detect and identify bots (i.e., without disrupting the operation of the botnet). Indeed, detecting botnets proves to be very challenging: a victim of a botnet attack can typically only observe the attack from a single network, from which point the attack traffic may closely resemble the traffic of legitimate users. Regrettably, the state of the art in botnet identification is based on user complaints, localized honeypots and intrusion detection systems, or the complex correlation of data collected through darknets [13].
We propose a set of techniques to identify botnets using passive analysis of DNS-based blackhole list (DNSBL) lookup traffic. Many Internet Service Providers (ISPs) and enterprise networks use DNSBLs to track IP addresses that originate spam, so that future emails sent from these IP addresses can be rejected. For the same reason, botmasters are known to sell “clean” bots (i.e., not listed in any DNSBL) at a premium. This paper addresses the possibility of performing counter-intelligence to help us discover identities of bots, based on the insight that botmasters themselves must perform “reconnaissance” lookups to determine their bots’ blacklist status.
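
For readers unfamiliar with the lookup mechanism, the following minimal sketch shows how a DNSBL query name is conventionally formed for an IPv4 address: the octets are reversed and prepended to the list's DNS zone, and the resulting name is resolved like any other hostname. The zone `dnsbl.example.org` and both function names are hypothetical and do not come from the paper. The second helper illustrates how an operator of a DNSBL mirror could recover the queried address from a logged query name.

```python
import ipaddress


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Return the DNS name a client resolves to test an IPv4 address against a DNSBL.

    `zone` is a placeholder for the list's DNS zone (hypothetical here).
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone.strip(".")


def queried_ip_from_name(qname: str, zone: str) -> str:
    """Invert dnsbl_query_name: recover the IPv4 address from a logged query name."""
    suffix = "." + zone.strip(".")
    qname = qname.rstrip(".")  # tolerate a fully qualified name with a trailing dot
    if not qname.endswith(suffix):
        raise ValueError(f"{qname!r} is not under zone {zone!r}")
    labels = qname[: -len(suffix)].split(".")
    # IPv4Address validates that exactly four in-range octets remain.
    return str(ipaddress.IPv4Address(".".join(reversed(labels))))


name = dnsbl_query_name("192.0.2.99", "dnsbl.example.org")
print(name)  # 99.2.0.192.dnsbl.example.org
print(queried_ip_from_name(name, "dnsbl.example.org"))  # 192.0.2.99
```

Because every check is an ordinary DNS query, the list operator sees both who asked (the source of the query) and which address was asked about, which is the raw material for the counter-intelligence described in this paper.
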
The contributions of this paper include:
1. Passive heuristics for counter-intelligence. We develop heuristics to distinguish DNSBL reconnaissance queries for a botnet from legitimate DNSBL traffic (either offline or in real-time), to identify likely bots. These heuristics are based on an enumeration of possible lookup techniques that botmasters are likely to use to perform reconnaissance, which we detail in Section 2. Unlike previous detection schemes, our techniques are covert and do not disrupt the botnet’s activity.
2. Study of DNSBL reconnaissance techniques. We study the prevalence of DNSBL reconnaissance by analyzing logs from a mirror of a well-known blackhole list for a 45-day period from November 17, 2005 to December 31, 2005. Section 4 discusses the prevalence of the different types of reconnaissance techniques that we observed. Much to our surprise, we find that bots are performing reconnaissance on behalf of other (possibly newly infected) bots. Although some bots perform a large number of reconnaissance queries, it appears that much of the reconnaissance activity is spread across many bots, each of which issues few queries, thus making detection more difficult.
3. Identification of new bots. We analyze DNSBL queries that are likely being performed by botmasters to identify “clean” bots. Such reconnaissance usually precedes the use of bots in an attack, suggesting the possibility that this DNSBL counter-intelligence can be used to bolster responses. Section 3 demonstrates the possibility of such early warning. To validate our detection scheme, we correlate the IP addresses of these likely bots with data collected at a botnet sinkhole over the same time period (the sinkholing technique is explained in previous work [6]; this dataset has been used as “ground truth” for botnet membership in previous studies [6, 17]).
4. DNSBL-based countermeasures. Our heuristics could be used to detect reconnaissance in real-time. This ability potentially allows for active countermeasures, such as returning misleading responses to reconnaissance lookups, as shown in Figure 1. We revisit this topic in Section 5.
Figure 1: DNSBL-based Spam Mitigation Architecture. [Diagram labels: attacker(s) performing DNSBL reconnaissance; DNS-based blacklist; legitimate DNSBL lookups from victim’s mail server; spamming bots; record of queries; (potentially misleading) DNSBL responses; spam recipient; C&C commands.]
2. Model of Reconnaissance Techniques
This section describes our model for DNSBL reconnaissance techniques (i.e., the techniques that botmasters may be using to determine whether bots have been blacklisted). Our goal in developing these models and heuristics is to distinguish DNSBL queries issued by botmasters from those performed by legitimate mail servers.¹
2.1 Properties of Reconnaissance Queries
Our detection heuristics are based on the construction of a DNSBL query graph, where an edge in the graph from node A to node B indicates that node A has issued a query to a DNSBL to determine whether node B is listed. After constructing this graph, we develop detection heuristics based on the expected spatial and temporal characteristics of legitimate lookups versus reconnaissance-based lookups. These characteristics hold primarily when members of the botnet are not performing queries on behalf of each other; when they are, detecting reconnaissance becomes more difficult, as we explain in Section 2.2.3. As we describe below, our detection heuristics exploit both spatial and temporal properties of the DNSBL query graph.
Property 1 (Spatial relationships) A legitimate mail server will perform queries and be the object of queries. In contrast, hosts performing reconnaissance-based lookups will only perform queries; they will not be queried by other hosts.²
In other words, legitimate mail servers are likely to be queried by other mail servers that are receiving mail from that server. On the other hand, a host that is not itself being looked up by any other mail servers is, in all likelihood, not a mail server. We can use this observation to identify hosts that are likely performing reconnaissance: lookups from hosts that have a high out-degree in the DNSBL query graph (i.e., hosts that are performing many lookups) but have a low in-degree are likely unrelated to the delivery of legitimate mail. To quantify this effect, we define the lookup ratio, $\lambda$, of some node $n$ as follows:
$$\lambda_n = \frac{d_{n,\mathrm{out}}}{d_{n,\mathrm{in}}}$$
where $d_{n,\mathrm{out}}$ is the number of distinct IP addresses that node $n$ queries, and $d_{n,\mathrm{in}}$ is the number of distinct IP addresses that issue a query for node $n$.³ This metric is most effective when hosts performing reconnaissance are disjoint from hosts that are actually used to spam, which appears to be the case today. However, as reconnaissance techniques become increasingly sophisticated (as we describe in Section 2.2.3), this metric may become less useful. Still, we find that this metric proves to be quite useful in detecting many instances of DNSBL-based reconnaissance.
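
The lookup ratio can be computed directly from a log of (querier, queried) address pairs. The sketch below is illustrative only and is not the authors' implementation: the function name `lookup_ratios`, the input format, and the choice to report a node that is never queried (in-degree zero) as infinity are all assumptions made here.

```python
from collections import defaultdict


def lookup_ratios(queries):
    """Compute the lookup ratio d_out / d_in for every node that issues queries.

    `queries` is an iterable of (querier_ip, queried_ip) pairs, i.e. the
    directed edges of the DNSBL query graph. Degrees count distinct
    addresses, so repeated lookups of the same address count once.
    """
    out_neighbors = defaultdict(set)  # querier -> distinct addresses it looked up
    in_neighbors = defaultdict(set)   # queried -> distinct hosts that looked it up
    for querier, queried in queries:
        out_neighbors[querier].add(queried)
        in_neighbors[queried].add(querier)

    ratios = {}
    for node, targets in out_neighbors.items():
        d_out = len(targets)
        d_in = len(in_neighbors.get(node, ()))
        # Assumption: a host that is never queried gets an infinite ratio.
        ratios[node] = d_out / d_in if d_in else float("inf")
    return ratios


log = [
    ("192.0.2.1", "192.0.2.2"),    # two mail servers exchanging mail
    ("192.0.2.2", "192.0.2.1"),
    ("198.51.100.9", "192.0.2.1"),  # a host that only issues lookups
    ("198.51.100.9", "192.0.2.2"),
    ("198.51.100.9", "203.0.113.5"),
    ("198.51.100.9", "203.0.113.5"),  # duplicate edge, counted once
]
for node, ratio in sorted(lookup_ratios(log).items(), key=lambda kv: -kv[1]):
    print(node, ratio)
```

Hosts with a high ratio would then be candidates for closer inspection. No threshold is given in this section, so any cutoff is left to the analyst.
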
The temporal arrival pattern of queries at the DNSBL by hosts performing reconnaissance may differ from the temporal characteristics of queries performed by legitimate hosts. We expect this to be the case because, whereas legitimate DNSBL lookups are driven by the arrival of actual email, reconnaissance queries will not reflect any realistic arrival patterns of actual email.
Property 2 (Temporal relationships) A legitimate mail server’s DNSBL lookups reflect actual arrival patterns of real email messages: legitimate lookups are typically driven automatically when emails arrive at the mail server and will thus arrive at a rate that mirrors the arrival rates of emails. Reconnaissance-based lookups, on the other hand, will not mirror the arrival patterns of legitimate email.
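
Property 2 could be checked in several ways. As one hedged illustration (not a heuristic taken from the paper), the sketch below bins a host's query timestamps by hour of day and compares the resulting profile with a reference profile from a known mail server, using total variation distance. The function names, the use of UTC hours, and the distance measure are assumptions chosen for simplicity.

```python
from collections import Counter
from datetime import datetime, timezone


def hourly_profile(timestamps):
    """Return the fraction of queries falling in each UTC hour of day (24 floats).

    `timestamps` is an iterable of Unix timestamps in seconds.
    """
    counts = Counter(
        datetime.fromtimestamp(t, tz=timezone.utc).hour for t in timestamps
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no timestamps given")
    return [counts.get(hour, 0) / total for hour in range(24)]


def profile_distance(p, q):
    """Total variation distance between two hourly profiles (0 = identical, 1 = disjoint)."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


# Hypothetical data: a mail server busy during working hours on three days,
# and a host that issues a burst of lookups within a single night-time hour.
reference = [day * 86400 + hour * 3600 for day in range(3) for hour in range(9, 18)]
suspect = [3 * 3600 + i * 10 for i in range(50)]

distance = profile_distance(hourly_profile(reference), hourly_profile(suspect))
print(f"distance from reference profile: {distance:.2f}")
```

A large distance only indicates that a host's lookups do not follow the reference pattern; on its own it does not establish that the host is performing reconnaissance.
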
We may be able to exploit the fact that email traffic tends to be diurnal [9] to tease apart DNSBL lookups that are
