Detecting spammers with SNARE: Spatio-temporal network-level automatic reputation engine
Shuang Hao, Nadeem Ahmed Syed, Nick Feamster, Alexander G. Gray, Sven Krasser ∗
College of Computing, Georgia Tech ∗McAfee, Inc.
{shao, nadeem, feamster, agray}@cc.gatech.edu, sven_krasser@mcafee.com
Abstract
Users and network administrators need ways to filter
email messages based primarily on the reputation of
the sender. Unfortunately, conventional mechanisms for
sender reputation—notably, IP blacklists—are cumber-
some to maintain and evadable. This paper investigates
ways to infer the reputation of an email sender based
solely on network-level features, without looking at the
contents of a message. First, we study first-order prop-
erties of network-level features that may help distinguish
spammers from legitimate senders. We examine features
that can be ascertained without ever looking at a packet’s
contents, such as the distance in IP space to other email
senders or the geographic distance between sender and
receiver. We derive features that are lightweight, since
they do not require seeing a large amount of email from
a single IP address and can be gleaned without looking
at an email’s contents—many such features are appar-
ent from even a single packet. Second, we incorporate
these features into a classification algorithm and evalu-
ate the classifier’s ability to automatically classify email
senders as spammers or legitimate senders. We build
an automated reputation engine, SNARE, based on these
features using labeled data from a deployed commercial
spam-filtering system. We demonstrate that SNARE can
achieve comparable accuracy to existing static IP black-
lists: about a 70% detection rate for less than a 0.3% false
positive rate. Third, we show how SNARE can be inte-
grated into existing blacklists, essentially as a first-pass
filter.
1 Introduction
Spam filtering systems use two mechanisms to filter
spam: content filters, which classify messages based on
the contents of a message; and sender reputation, which
maintains information about the IP address of a sender
as an input to filtering. Content filters (e.g., [22, 23])
can block certain types of unwanted email messages, but
they can be brittle and evadable, and they require ana-
lyzing the contents of email messages, which can be ex-
pensive. Hence, spam filters also rely on sender repu-
tation to filter messages; the idea is that a mail server
may be able to reject a message purely based on the rep-
utation of the sender, rather than the message contents.
DNS-based blacklists (DNSBLs) such as Spamhaus [7]
maintain lists of IP addresses that are known to send
spam. Unfortunately, these blacklists can be both incomplete
and slow to respond to new spammers [32].
This unresponsiveness will only become more serious
as both botnets and BGP route hijacking make it easier
for spammers to dynamically obtain new, unlisted IP ad-
dresses [33, 34]. Indeed, network administrators are still
searching for spam-filtering mechanisms that are both
lightweight (i.e., they do not require detailed message or
content analysis) and automated (i.e., they do not require
manual update, inspection, or verification).
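For context on the blacklist lookups discussed above: a DNSBL is conventionally queried over DNS by reversing the octets of the sender's IPv4 address and appending the blacklist's zone. The sketch below only builds that query name and performs no network lookup. The function name `dnsbl_query_name` is hypothetical, `zen.spamhaus.org` serves only as an example zone, and the address is taken from a documentation range.

```python
import ipaddress


def dnsbl_query_name(sender_ip: str, zone: str) -> str:
    """Build the DNS name conventionally queried to check an IPv4 sender against a DNSBL.

    Hypothetical helper for illustration; it does not contact any server.
    """
    addr = ipaddress.IPv4Address(sender_ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


if __name__ == "__main__":
    # 192.0.2.99 is a documentation address, used here purely as an example.
    print(dnsbl_query_name("192.0.2.99", "zen.spamhaus.org"))
```

A mail server would then resolve this name. By convention, an address record in the answer means the sender is listed, and a nonexistent-domain answer means it is not. The lookup therefore depends entirely on the list already containing the sender's IP address, which is the limitation that motivates behavior-based classification.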
Towards this goal, this paper presents SNARE (Spatio-
temporal Network-level Automatic Reputation Engine),
a sender reputation engine that can accurately and au-
tomatically classify email senders based on lightweight,
network-level features that can be determined early in
a sender’s history—sometimes even upon seeing only a
single packet. SNARE relies on the intuition that about
95% of all email is spam, and, of this, 75–95% can be
attributed to botnets, which often exhibit unusual send-
ing patterns that differ from those of legitimate email
senders. SNARE classifies senders based on how they are
sending messages (i.e., traffic patterns), rather than who
the senders are (i.e., their IP addresses). In other words,
SNARE rests on the assumption that there are lightweight
network-level features that can differentiate spammers
from legitimate senders; this paper finds such features
and uses them to build a system for automatically deter-
mining an email sender’s reputation.
SNARE bears some similarity to other approaches that
classify senders based on network-level behavior [12, 21, ...],
but these approaches rely on analyzing
message contents, gathering information across a large
number of recipients, or both. In contrast, SNARE is
based on lightweight network-level features, which could
allow it to scale better and also to operate at higher traffic
rates. In addition, SNARE is more accurate than previous
reputation systems that use network-level behavioral
features to classify senders: for example, SNARE’s false
positive rate is an order of magnitude less than that in
our previous work [34] for a similar detection rate. It is
the first reputation system that is both as accurate as ex-
isting static IP blacklists and automated to keep up with
changing sender behavior.
Despite the advantages of automatically inferring
sender reputation based on “network-level” features, a
major hurdle remains: We must identify which features
effectively and efficiently distinguish spammers from le-
gitimate senders. Given the massive space of possible
features, finding a collection of features that classifies
senders with both low false positive and low false neg-
ative rates is challenging. This paper identifies thirteen
such network-level features that require varying levels of
information about senders’ history.
Different features impose different levels of overhead.
Thus, we begin by evaluating features that can be com-
puted purely locally at the receiver, with no information
from other receivers, no previous sending history, and
no inspection of the message itself. We found that several
features in this category are surprisingly effective
for classifying senders, including the AS of the
sender, the geographic distance between the IP address of
the sender and that of the receiver, the density of email
senders in the surrounding IP address space, and the time
of day the message was sent. We also looked at var-
ious aggregate statistics across messages and receivers
(e.g., the mean and standard deviations of messages sent
from a single IP address) and found that, while these
features require slightly more computation and message
overhead, they do help distinguish spammers from legit-
imate senders as well. After identifying these features,
we analyze the relative importance of these features and
incorporate them into an automated reputation engine,
based on the RuleFit [19] ensemble learning algorithm.
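To make the locally computable features above more concrete, the following is a minimal sketch of how three of them might be derived. It is an illustration under stated assumptions, not the computation used in this paper: geographic distance is approximated with the haversine formula on coordinates assumed to come from some IP geolocation source, "density" is approximated as the mean numeric distance to the k nearest other sender addresses, and all function names and the choice k = 3 are hypothetical.

```python
import ipaddress
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed constant for this sketch)


def geodesic_distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points in degrees (haversine)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def ip_space_distance(ip_a, ip_b):
    """Numeric distance between two IPv4 addresses treated as 32-bit integers."""
    return abs(int(ipaddress.IPv4Address(ip_a)) - int(ipaddress.IPv4Address(ip_b)))


def mean_distance_to_nearest_senders(sender_ip, other_sender_ips, k=3):
    """Assumed density proxy: mean IP-space distance to the k nearest other senders."""
    distances = sorted(ip_space_distance(sender_ip, ip) for ip in other_sender_ips)
    nearest = distances[:k]
    return sum(nearest) / len(nearest) if nearest else float("inf")


def hour_of_day(unix_ts, utc_offset_hours=0):
    """Hour (0-23) at which a message was sent, in the given fixed UTC offset."""
    tz = timezone(timedelta(hours=utc_offset_hours))
    return datetime.fromtimestamp(unix_ts, tz=tz).hour
```

A small mean distance to neighboring senders would indicate a densely populated region of IP space. None of these values requires inspecting message contents or consulting other receivers.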
In addition to presenting the first automated classifier
based on network-level features, this paper makes several
additional contributions. First, we present a detailed
study of various network-level characteristics of
both spammers and legitimate senders, a detailed study
of how well each feature distinguishes spammers from
legitimate senders, and explanations of why these fea-
tures are likely to exhibit differences between spammers
and legitimate senders. Second, we use state-of-the-art
ensemble learning techniques to build a classifier using
these features. Our results show that SNARE’s perfor-
mance is at least as good as static DNS-based blacklists,
achieving a 70% detection rate for about a 0.2% false
positive rate. Using features extracted from a single mes-
sage and aggregates of these features provides slight im-
provements, and adding an AS “whitelist” of the ASes
that host the most commonly misclassified senders re-
duces the false positive rate to 0.14%. This accuracy
is roughly equivalent to that of existing static IP blacklists
such as Spamhaus [7]; the advantage, however, is that
SNARE is automated, and it characterizes a sender based
on its sending behavior, rather than its IP address, which
may change due to dynamic addressing, newly compro-
mised hosts, or route hijacks. Although SNARE’s per-
formance is still not perfect, we believe that the benefits
are clear: Unlike other email sender reputation systems,
SNARE is both automated and lightweight enough to op-
erate solely on network-level information. Third, we pro-
vide a deployment scenario for SNARE. Even if others do
not deploy SNARE’s algorithms exactly as we have de-
scribed, we believe that the collection of network-level
features themselves may provide useful inputs to other
commercial and open-source spam filtering appliances.
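The detection rate and false positive rate quoted above, and the effect of an AS whitelist, can be illustrated with a short sketch. It assumes the classifier emits a numeric score per sender and that a sender is flagged when the score meets a threshold, unless its AS is whitelisted. The function names, scores, threshold, and AS numbers (drawn from the private-use range) are invented for illustration and are not results from this paper.

```python
def classify(scores, sender_asns, threshold, as_whitelist=frozenset()):
    """Flag a sender as a spammer if its score meets the threshold and its AS is not whitelisted."""
    return [
        score >= threshold and asn not in as_whitelist
        for score, asn in zip(scores, sender_asns)
    ]


def detection_and_false_positive_rate(labels, flagged):
    """labels[i] is True for a known spammer; flagged[i] is the classifier's decision.

    Detection rate = flagged spammers / all spammers.
    False positive rate = flagged legitimate senders / all legitimate senders.
    """
    spam_total = sum(1 for y in labels if y)
    ham_total = len(labels) - spam_total
    true_pos = sum(1 for y, f in zip(labels, flagged) if y and f)
    false_pos = sum(1 for y, f in zip(labels, flagged) if not y and f)
    detection = true_pos / spam_total if spam_total else 0.0
    fpr = false_pos / ham_total if ham_total else 0.0
    return detection, fpr


if __name__ == "__main__":
    # Toy data: three spammers followed by four legitimate senders.
    labels = [True, True, True, False, False, False, False]
    scores = [0.9, 0.8, 0.2, 0.7, 0.1, 0.3, 0.05]
    asns = [64512, 64513, 64512, 64514, 64514, 64515, 64516]

    plain = classify(scores, asns, threshold=0.5)
    print(detection_and_false_positive_rate(labels, plain))

    # Whitelisting the AS that hosts the misclassified legitimate sender.
    whitelisted = classify(scores, asns, threshold=0.5, as_whitelist={64514})
    print(detection_and_false_positive_rate(labels, whitelisted))
```

In this toy data the whitelist removes the one false positive without changing the detection rate. In general a whitelist can also suppress true detections if spammers send from a whitelisted AS.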
The rest of this paper is organized as follows. Sec-
tion 2 presents background on existing sender reputation
systems and a possible deployment scenario for SNARE,
and introduces the ensemble learning algorithm. Sec-
tion 3 describes the network-level behavioral properties
of email senders and measures first-order statistics re-
lated to these features concerning both spammers and
legitimate senders. Section 4 evaluates SNARE’s perfor-
mance using different feature subsets, ranging from those
that can be determined from a single packet to those that
require some amount of history. We investigate the po-
tential to incorporate the classifier into a spam-filtering
system in Section 5. Section 6 discusses evasion and
other limitations, Section 7 describes related work, and
Section 8 concludes.
2 Background
In this section, we provide background on existing sender
reputation mechanisms, present motivation for improved
sender reputation mechanisms (we survey other related
work in Section 7), and describe a classification algo-
rithm called RuleFit to build the reputation engine. We
also describe McAfee’s TrustedSource system, which is
both the source of the data used for our analysis and a
possible deployment scenario for SNARE.
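As a rough orientation for readers unfamiliar with RuleFit, the general shape of the model it fits can be written as an additive expansion over base learners. The notation below is an assumption made for illustration (the symbols F, a_m, f_m, and M do not appear in this section); the base learners are typically simple rules derived from decision trees, and the coefficients are fit with a sparsity-inducing penalty so that only the more useful rules keep nonzero weight.

```latex
% Assumed notation: x is a sender's feature vector, f_m are base learners
% (typically rules evaluating to 0 or 1), and a_0, a_m are fitted coefficients.
F(\mathbf{x}) = a_0 + \sum_{m=1}^{M} a_m \, f_m(\mathbf{x})
```

Under this reading, a sender would be labeled a spammer when F(x) exceeds a chosen threshold, and moving that threshold trades detection rate against false positive rate.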
2.1 Email Sender Reputation Systems
Today’s spam filters look up IP addresses in DNS-
based blacklists (DNSBLs) to determine whether an
IP address is a known source of spam at the time