BotSniffer : Detecting Botnet Command and Control Channels in Network Traffic
Technology (2008)
- ISSN: 03601315
- DOI: 10.1.1.110.8092
Available from citeseerx.ist.psu.edu
Abstract
The paper presents BotSniffer, an intrusion detection system (IDS) that monitors the command channel of bots to detect them inside a network. Detection is based on the temporal and message correlation of command communication.
BotSniffer: Detecting Botnet Command and Control Channels in Network Traffic

Guofei Gu, Junjie Zhang, and Wenke Lee
School of Computer Science, College of Computing
Georgia Institute of Technology
Atlanta, GA 30332
{guofei, jjzhang, wenke}@cc.gatech.edu

Abstract

Botnets are now recognized as one of the most serious security threats. In contrast to previous malware, botnets have the characteristic of a command and control (C&C) channel. Botnets also often use existing common protocols, e.g., IRC, HTTP, and in protocol-conforming manners. This makes the detection of botnet C&C a challenging problem. In this paper, we propose an approach that uses network-based anomaly detection to identify botnet C&C channels in a local area network without any prior knowledge of signatures or C&C server addresses. This detection approach can identify both the C&C servers and infected hosts in the network. Our approach is based on the observation that, because of the pre-programmed activities related to C&C, bots within the same botnet will likely demonstrate spatial-temporal correlation and similarity. For example, they engage in coordinated communication, propagation, and attack and fraudulent activities. Our prototype system, BotSniffer, can capture this spatial-temporal correlation in network traffic and utilize statistical algorithms to detect botnets with theoretical bounds on the false positive and false negative rates. We evaluated BotSniffer using many real-world network traces. The results show that BotSniffer can detect real-world botnets with high accuracy and has a very low false positive rate.

1 Introduction

Botnets (or, networks of zombies) are recognized as one of the most serious security threats today. Botnets are different from other forms of malware such as worms in that they use command and control (C&C) channels. It is important to study this botnet characteristic so as to develop effective countermeasures.
First, a botnet C&C channel is relatively stable and unlikely to change among bots and their variants. Second, it is the essential mechanism that allows a "botmaster" (who controls the botnet) to direct the actions of bots in a botnet. As such, the C&C channel can be considered the weakest link of a botnet. That is, if we can take down an active C&C or simply interrupt the communication to the C&C, the botmaster will not be able to control his botnet. Moreover, the detection of the C&C channel will reveal both the C&C servers and the bots in a monitored network. Therefore, understanding and detecting the C&Cs has great value in the battle against botnets.

Many existing botnet C&Cs are based on IRC (Internet Relay Chat) protocol, which provides a centralized command and control mechanism. The botmaster can interact with the bots (e.g., issuing commands and receiving responses) in real-time by using IRC PRIVMSG messages. This simple IRC-based C&C mechanism has proven to be highly successful and has been adopted by many botnets. There are also a few botnets that use the HTTP protocol for C&C. HTTP-based C&C is still centralized, but the botmaster does not directly interact with the bots using chat-like mechanisms. Instead, the bots periodically contact the C&C server(s) to obtain their commands. Because of its proven effectiveness and efficiency, we expect that centralized C&C (e.g., using IRC or HTTP) will still be widely used by botnets in the near future.

In this paper, we study the problem of detecting centralized botnet C&C channels using network anomaly detection techniques. In particular, we focus on the two commonly used botnet C&C mechanisms, namely, IRC and HTTP based C&C channels. Our goal is to develop a detection approach that does not require prior knowledge of a botnet, e.g., signatures of C&C patterns including the name or IP address of a C&C server.
We leave the problem of detecting P2P botnets (e.g., Nugache [19] and Peacomm [14]) as future work.

Botnet C&C traffic is difficult to detect because: (1) it follows normal protocol usage and is similar to normal traffic, (2) the traffic volume is low, (3) there may be very few bots in the monitored network, and (4) it may contain encrypted communication. However, we observe that the bots of a botnet demonstrate spatial-temporal correlation and similarities due to the nature of their pre-programmed response activities to control commands. This invariant helps us identify C&C within network traffic. For instance, at a similar time, the bots within a botnet will execute the same command (e.g., obtain system information, scan the network), and report to the C&C server with the progress/result of the task (and these reports are likely to be similar in structure and content). Normal network activities are unlikely to demonstrate such a synchronized or correlated behavior. Using a sequential hypothesis testing algorithm, when we observe multiple instances of correlated and similar behaviors, we can conclude that a botnet is detected.

Our research makes several contributions. First, we study two typical styles of control used in centralized botnet C&C. The first is the "push" style, where commands are pushed or sent to bots. IRC-based C&C is an example of the push style. The second is the "pull" style, where commands are pulled or downloaded by bots. HTTP-based C&C is an example of the pull style. Observing the spatial-temporal correlation and similarity nature of these botnet C&Cs, we provide a set of heuristics that distinguish C&C traffic from normal traffic.

Second, we propose anomaly-based detection algorithms to identify both IRC and HTTP based C&Cs in a port-independent manner.
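To make the sequential hypothesis testing idea mentioned in this section concrete, the following is a minimal sketch of Wald's sequential probability ratio test over binary observations, where each observation records whether one round of client responses looked like a correlated crowd. It is an illustration under stated assumptions, not the algorithm as specified by the authors: the function name `sprt_decision` and the values of `theta0`, `theta1`, `alpha`, and `beta` are hypothetical and are not taken from this excerpt.

```python
import math


def sprt_decision(observations, theta0=0.2, theta1=0.8, alpha=0.005, beta=0.01):
    """Wald's sequential probability ratio test on binary observations.

    Each observation is 1 if a round looked like a correlated response
    crowd and 0 otherwise. All parameter values are assumptions:
      theta1 = P(observation == 1 | botnet hypothesis H1)
      theta0 = P(observation == 1 | benign hypothesis H0)
      alpha  = target false positive rate
      beta   = target false negative rate
    Returns (decision, number_of_observations_used).
    """
    upper = math.log((1 - beta) / alpha)   # crossing this accepts H1
    lower = math.log(beta / (1 - alpha))   # crossing this accepts H0
    llr = 0.0                              # accumulated log-likelihood ratio
    for n, y in enumerate(observations, start=1):
        if y:
            llr += math.log(theta1 / theta0)
        else:
            llr += math.log((1 - theta1) / (1 - theta0))
        if llr >= upper:
            return "botnet", n
        if llr <= lower:
            return "benign", n
    return "pending", len(observations)


if __name__ == "__main__":
    print(sprt_decision([1, 1, 1, 1]))
    print(sprt_decision([1, 0, 1, 0]))
```

With these assumed parameters a decision needs only a handful of consistent rounds, which is the property that lets a sequential test bound both error rates without a large number of C&C packets.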
The advantages of our algorithms include: (1) they do not require prior knowledge of C&C servers or content signatures, (2) they are able to detect encrypted C&C, (3) they do not require a large number of bots to be present in the monitored network, and may even be able to detect a botnet with just a single member in the monitored network in some cases, (4) they have bounded false positive and false negative rates, and do not require a large number of C&C communication packets.

Third, we develop a system, BotSniffer, which is based on our proposed anomaly detection algorithms and is implemented as several plug-ins for the open-source Snort [24]. We have evaluated BotSniffer using real-world network traces. The results show that it has high accuracy in detecting botnet C&Cs with a very low false positive rate.

The rest of the paper is organized as follows. In Section 2 we provide a background on botnet C&C and the motivation of our botnet detection approach. In Section 3, we describe the architecture of BotSniffer and describe in detail its detection algorithms. In Section 4, we report our evaluation of BotSniffer on various datasets. In Section 5, we discuss possible evasions to BotSniffer, the corresponding solutions, and future work. We review the related work in Section 6 and conclude in Section 7.

2 Background and Motivation

In this section, we first use case studies to provide a background on botnet C&C mechanisms. We then discuss the intuitions behind our detection algorithms.

2.1 Case Study of Botnet C&C

As shown in Figure 1(a), centralized C&C architecture can be categorized into "push" or "pull" style, depending on how a botmaster's commands reach the bots. In a push style C&C, the bots are connected to the C&C server, e.g., IRC server, and wait for commands from botmaster. The botmaster issues a command in the channel, and all the bots connected to the channel can receive it in real-time.
That is, in a push style C&C the botmaster has real-time control over the botnet. IRC-based C&C is the representative example of push style. Many existing botnets use IRC, including the most common bot families such as Phatbot, Spybot, Sdbot, Rbot/Rxbot, GTBot [5]. A botmaster sets up one or more IRC servers as C&C hosts. After a bot is newly infected, it will connect to the C&C server, join a certain IRC channel and wait for commands from the botmaster. Commands will be sent in IRC PRIVMSG messages (like a regular chat message) or a TOPIC message. The bots receive commands, interpret what the botmaster wants them to do, execute them, and then reply with the results. Figure 1(b) shows a sample command and control session. The botmaster first authenticates himself using a username/password. Once the password is accepted, he can issue commands to obtain some information from the bot. For example, ".bot.about" gets some basic bot information such as version, ".sysinfo" obtains the system information of the bot-infected machine, and ".scan.start" instructs the bots to begin scanning for other vulnerable machines. The bots respond to the commands in pre-programmed fashions. The botmaster has a rich command library to use [5], which enables the botmaster to fully control and utilize the infected machines.

In a pull style C&C, the botmaster simply sets the command in a file at a C&C server (e.g., an HTTP server). The bots frequently connect back to read the command file. This style of command and control is relatively loose in that the botmaster typically does not have real-time control over the bots because there is a delay between the time when he "issues" a command and the time when a bot gets the command. There are several botnets using the HTTP protocol for C&C, for example, Bobax [25], which is designed mainly to send spam.
The bots of this botnet periodically connect to the C&C server with a URL such as http://hostname/reg?u=[8-digit-hex-id]&v=114, and receive the command in an HTTP response.
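The beacon URL format quoted above can be made concrete with a short parsing sketch. This is only an illustration of the request shape; the approach described in this paper is explicitly meant not to depend on such signatures. The function name `parse_beacon` and the returned field names are hypothetical, and the check assumes exactly the `/reg?u=...&v=...` layout shown in the text.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Exactly eight hexadecimal digits, as in "[8-digit-hex-id]".
HEX_ID = re.compile(r"[0-9a-fA-F]{8}")


def parse_beacon(url):
    """Return the fields of a Bobax-style check-in URL, or None if the
    URL does not have the /reg?u=<8 hex digits>&v=<version> shape."""
    parts = urlsplit(url)
    if parts.path != "/reg":
        return None
    query = parse_qs(parts.query)
    bot_ids = query.get("u", [])
    versions = query.get("v", [])
    if len(bot_ids) != 1 or len(versions) != 1:
        return None
    if not HEX_ID.fullmatch(bot_ids[0]):
        return None
    return {
        "host": parts.hostname,
        "bot_id": bot_ids[0].lower(),
        "version": versions[0],
    }


if __name__ == "__main__":
    print(parse_beacon("http://hostname/reg?u=1a2b3c4d&v=114"))
```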
The command is one of six types, e.g., prj (send spam), scn (scan others), upd (update binary). Botnets can have fairly frequent C&C traffic. For example, in a CERT report [16], researchers report a Web-based bot that queries for the command file every 5 seconds and then executes the commands.

Figure 1. Botnet command and control. (a) Two styles of botnet C&C. (b) An IRC-based C&C communication example.

2.2 Botnet C&C: Spatial-Temporal Correlation and Similarity

There are several invariants in botnet C&C regardless of the push or pull style.

First, bots need to connect to C&C servers in order to obtain commands. They may either keep a long connection or frequently connect back. In either case, we can consider that there is a (virtually) long-lived session of C&C channel (see footnote 1).

Second, bots need to perform certain tasks and respond to the received commands. We can define two types of responses observable in network traffic, namely, message response and activity response. A typical example of message response is IRC-based PRIVMSG reply as shown in Figure 1(b). When a bot receives a command, it will execute and reply in the same IRC channel with the execution result (or status/progress). The activity responses are the network activities the bots exhibit when they perform the malicious tasks (e.g., scanning, spamming, binary update) as directed by the botmaster's commands. According to [31], about 53% of botnet commands observed in thousands of real-world IRC-based botnets are scan related (for spreading or DDoS purpose), about 14.4% are binary download related (for malware updating purpose). Also, many HTTP-based botnets are mainly used to send spam [25]. Thus, we will observe these malicious activity responses with a high probability [8].

Footnote 1: We consider a session live if the TCP connection is live, or if, within a certain time window, there is at least one connection to the server.
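The session-liveness rule from footnote 1 can be sketched in a few lines. This is a minimal sketch under assumptions: the function name `session_is_live`, its parameters, and the default window length are hypothetical, since the text only says "a certain time window".

```python
def session_is_live(tcp_open, connection_times, now, window_seconds=60.0):
    """Footnote-1 rule: a C&C session counts as live if the TCP
    connection is still open, or if the client made at least one
    connection to the server within the last `window_seconds`.

    connection_times: timestamps (seconds) of past connections to the server.
    window_seconds:   assumed value; the excerpt does not specify it.
    """
    if tcp_open:
        return True
    return any(0 <= now - t <= window_seconds for t in connection_times)


if __name__ == "__main__":
    # A pull-style bot polling every 5 seconds keeps the session live
    # even though each individual HTTP connection is short.
    polls = [0.0, 5.0, 10.0, 15.0]
    print(session_is_live(False, polls, now=18.0, window_seconds=10.0))
```

Treating push (long-lived connection) and pull (frequent reconnects) uniformly in this way is what allows both styles to be analyzed as one virtually long-lived session.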
If there are multiple bots in the channel to respond to commands, most of them are likely to respond in a similar fashion. For example, the bots send similar message or activity traffic at a similar time window, e.g., sending spam as in [23]. Thus, we can observe a response crowd of botnet members responding to a command, as shown in Figure 2. Such crowd-like behaviors are consistent with all botnet C&C commands and throughout the life-cycle of a botnet. On the other hand, for a normal network service (e.g., an IRC chatting channel), it is unlikely that many clients consistently respond similarly and at a similar time. That is, the bots have much stronger (and more consistent) synchronization and correlation in their responses than normal (human) users do.

Based on the above observation, our botnet C&C detection approach is aimed at recognizing the spatial-temporal correlation and similarities in bot responses. When monitoring network traffic, as the detection system observes multiple crowd-like behaviors, it can declare that the machines in the crowd are bots of a botnet when the accumulated degree of synchronization/correlation (and hence the likelihood of bot traffic) is above a given threshold.

3 BotSniffer: Architecture and Algorithms

Figure 3 shows the architecture of BotSniffer. There are two main components, i.e., the monitor engine and the correlation engine. The monitor engine is deployed at the perimeter of a monitored network. It examines network traffic, generates connection record of suspicious C&C protocols, and detects activity response behavior (e.g., scan-



