Backtracking Intrusions

by Samuel T King, Peter M Chen
Electrical Engineering (2003)

Abstract

Analyzing intrusions today is an arduous, largely manual task because system administrators lack the information and tools needed to understand easily the sequence of steps that occurred in an attack. The goal of BackTracker is to identify automatically potential sequences of steps that occurred in an intrusion. Starting with a single detection point (e.g., a suspicious file), BackTracker identifies files and processes that could have affected that detection point and displays chains of events in a dependency graph. We use BackTracker to analyze several real attacks against computers that we set up as honeypots. In each case, BackTracker is able to highlight effectively the entry point used to gain access to the system and the sequence of steps from that entry point to the point at which we noticed the intrusion. The logging required to support BackTracker added 9% overhead in running time and generated 1.2 GB per day of log data for an operating-system intensive workload.

Categories and Subject Descriptors
D.4.6 [Operating Systems]: Security and Protection–
information flow controls, invasive software (e.g., viruses,
worms, Trojan horses); K.6.4 [Management of Computing and
Information Systems]: System Management–management
audit; K.6.5 [Management of Computing and Information
Systems]: Security and Protection–invasive software,
unauthorized access (e.g., hacking, phreaking).
General Terms
Management, Security.
Keywords
Computer forensics, intrusion analysis, information flow.
1. INTRODUCTION
The frequency of computer intrusions has been increasing rapidly
for several years [4]. It seems likely that, for the foreseeable future,
even the most diligent system administrators will continue to cope
routinely with computer break-ins. After discovering an intrusion,
a diligent system administrator should do several things to recover
from the intrusion. First, the administrator should understand how
the intruder gained access to the system. Second, the administrator
should identify the damage inflicted on the system (e.g., modified
files, leaked secrets, installed backdoors). Third, the administrator
should fix the vulnerability that allowed the intrusion and try to
undo the damage wrought by the intruder. This paper addresses the
methods and tools an administrator uses to understand how an
intruder gained access to the system.
Before an administrator can start to understand an intrusion, she
must first detect that an intrusion has occurred [2]. There are
numerous ways to detect a compromise. A tool such as TripWire
[20] can detect a modified system file; a network or host firewall
can notice a process conducting a port scan or launching a
denial-of-service attack; a sandboxing tool can notice a program
making disallowed or unusual patterns of system calls [18, 16] or
executing foreign code [22]. We use the term detection point to
refer to the state on the local computer system that alerts the
administrator to the intrusion. For example, a detection point could
be a deleted, modified, or additional file, or it could be a process
that is behaving in an unusual or suspicious manner.
Once an administrator is aware that a computer is compromised,
the next step is to investigate how the compromise took place [1].
Administrators typically use two main sources of information to
find clues about an intrusion: system/network logs and disk state
[15]. An administrator might find log entries that show unexpected
output from vulnerable applications, deleted or forgotten attack
toolkits on disk, or file modification dates that hint at the
sequence of events during the intrusion. Many tools exist that
make this job easier. For example, Snort can log network traffic;
Ethereal can present application-level views of that network traffic;
and The Coroner’s Toolkit can recover deleted files [14] or
summarize the times at which files were last modified, accessed, or
created [13] (similar tools are Guidance Software’s EnCase,
Access Data’s Forensic Toolkit, the Internal Revenue Service’s ILook,
and ASR Data’s SMART).
Unfortunately, current sources of information suffer from one or
more limitations. Host logs typically show only partial,
application-specific information about what happened, such as HTTP
connections or login attempts, and they often show little about what
occurred on the system after the initial compromise. Network logs
may contain encrypted data, and the administrator may not be able
to recover the decryption key. The attacker may also use an
obfuscated custom command set to communicate with a backdoor, and
the administrator may not be able to recover the backdoor program
to help understand the commands. Disk images may contain useful
information about the final state, but they do not provide a
complete history of what transpired during the attack. A general
limitation of most tools and sources of information is that they
intermingle the actions of the intruder (or the state caused by those
actions) with the actions/state of legitimate users. Even in cases
where the logs and disk state contain enough information to
understand an attack, identifying the sequence of events from the
initial compromise to the detection point is still largely a manual
process.

Department of Electrical Engineering and Computer Science
University of Michigan
Ann Arbor, MI 48109-2122
Samuel T. King
kingst@umich.edu
Peter M. Chen
pmchen@umich.edu
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, or
republish, to post on servers or to redistribute to lists, requires prior
specific permission and/or a fee.
SOSP’03, October 19-22, 2003, Bolton Landing, New York, USA.
Copyright 2003 ACM 1-58113-757-5/03/0010...$5.00.

This paper describes a tool called BackTracker that attempts to
address the shortcomings in current tools and sources of
information and thereby help an administrator more easily understand
what took place during an attack. Working backward from a
detection point, BackTracker identifies chains of events that could have
led to the modification that was detected. An administrator can
then focus her detective work on those chains of events, leading to
a quicker and easier identification of the vulnerability. In order to
identify these chains of events, BackTracker logs the system calls
that most directly induce dependencies between operating system
objects (e.g., creating a process, reading and writing files).
BackTracker’s goal is to provide helpful information for most attacks; it
does not provide complete information for every possible attack.
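
The backward search described above can be illustrated with a small sketch. It is not BackTracker's implementation: the `Event` record, the `backtrack` function, and the single newest-to-oldest pass over the log are assumptions made for illustration, and the sample log is only loosely modeled on the ptrace attack discussed in Section 2 (the timestamps and the `cleanup` and `crond` entries are invented).

```python
from collections import namedtuple

# Hypothetical log record: `source` affected `sink` at logical time `time`.
Event = namedtuple("Event", ["source", "sink", "time"])


def backtrack(events, detection_point, detection_time):
    """Return logged events that could have affected the detection point.

    Assumes timestamps are distinct, so the log has a total order.
    """
    # For each object known to matter: the latest time at which an
    # influence on it could still have reached the detection point.
    threshold = {detection_point: detection_time}
    relevant = []
    # Read the log from newest to oldest.
    for ev in sorted(events, key=lambda e: e.time, reverse=True):
        limit = threshold.get(ev.sink)
        if limit is None or ev.time > limit:
            continue  # sink is unrelated so far, or the event came too late
        relevant.append(ev)
        # The source matters up to the moment it affected the sink.
        threshold[ev.source] = max(threshold.get(ev.source, ev.time), ev.time)
    return relevant


# Illustrative events only (see the assumptions stated above).
log = [
    Event("crond", "/var/log/cron", 0),        # unrelated activity
    Event("httpd", "bash", 1),                 # web server spawns a shell
    Event("bash", "wget", 2),                  # shell starts a download
    Event("wget", "/tmp/pt.tar.gz", 3),        # download writes the archive
    Event("bash", "tar", 4),                   # shell starts tar
    Event("/tmp/pt.tar.gz", "tar", 5),         # tar reads the archive
    Event("tar", "/tmp/xploit/ptrace", 6),     # tar writes the executable
    Event("cleanup", "/tmp/pt.tar.gz", 7),     # write after tar read the file
    Event("bash", "ptrace", 8),                # shell starts the process
    Event("/tmp/xploit/ptrace", "ptrace", 9),  # process runs the executable
]

chain = backtrack(log, detection_point="ptrace", detection_time=9)
for ev in reversed(chain):
    print(f"{ev.time}: {ev.source} -> {ev.sink}")
```

In this sketch the `cleanup` write is left out even though it touches a file on the attack path: it happened after `tar` had already read the archive, so it could not have influenced the detection point. The `crond` entry is left out because nothing on the path ever depended on its output.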
We have implemented BackTracker for Linux in two components:
an on-line component that logs events and an off-line component
that graphs events related to the attack. BackTracker currently
tracks many (but not all) relevant OS events. We found that these
events can be logged and analyzed with moderate time and space
overhead and that the output generated by BackTracker was
helpful in understanding several real attacks against computers we set
up as honeypots.
2. DESIGN OF BACKTRACKER
BackTracker’s goal is to reconstruct a timeline of events that
occur in an attack. Figure 1 illustrates this with BackTracker’s
results for an intrusion on our honeypot machine that occurred on
March 12, 2003. The graph shows that the attacker caused the
Apache web server (httpd) to create a command shell (bash),
downloaded and unpacked an executable (/tmp/xploit/ptrace), then
ran the executable using a different group identity (we believe the
executable was seeking to exploit a race condition in the Linux
ptrace code to gain root access). We detected the intrusion by
seeing the ptrace process in the process listing.
There are many levels at which events and objects can be observed.
Application-level logs such as Apache’s log of HTTP requests are
semantically rich. However, they provide no information about the
attacker’s own programs, and they can be disabled by an attacker
who gains privileged access. Network-level logs provide more
information for remote attacks, but they can be rendered useless by
encryption or obfuscation. Logging low-level events such as
machine instructions can provide complete information about the
computer’s execution [12], but these can be difficult for
administrators to understand quickly.
BackTracker works by observing OS-level objects (e.g., files,
filenames, processes) and events (e.g., system calls). This level is a
compromise between the application level (semantically rich but
easily disabled) and the machine level (difficult to disable but
semantically poor). Unlike application-level logging, OS-level
logging cannot separate objects within an application (e.g., user-level
threads), but rather considers the application as a whole. While
OS-level semantics can be disrupted by attacking the kernel,
gaining kernel-mode control can be made considerably more difficult
than gaining privileged user-mode control [19]. Unlike
network-level logging, OS-level events can be interpreted even if the
attacker encrypts or obfuscates his network communication.
This section’s description of BackTracker is divided into three
parts (increasing in degree of aggregation): objects, events that
cause dependencies between objects, and dependency graphs. The
description and implementation of BackTracker are given for
Unix-like operating systems.
2.1 Objects
Three types of OS-level objects are relevant to BackTracker’s
analysis: processes, files, and filenames.
A process is identified uniquely by a process ID and a version
number. BackTracker keeps track of a process from the time it is
created by a fork or clone system call to the point where it exits.
The one process that is not created by fork or clone is the first
process (swapper); BackTracker starts keeping track of swapper when
it makes its first system call.
A file object includes any data or metadata that is specific to that
file, such as its contents, owner, or modification time. A file is
identified uniquely by a device, an inode number, and a version
number. Because files are identified by inode number rather than
by name, BackTracker tracks a file across rename operations.
BackTracker treats pipes and named pipes as normal files. Objects
associated with System V IPC (messages, shared memory, sema-
Figure 1: Filtered dependency graph for ptrace attack.
Processes are shown as boxes (labeled by program names called
by execve during that process’s lifetime); files are shown as ovals;
sockets are shown as diamonds. BackTracker can also show
process IDs, file inode numbers, and socket ports. The detection
point is shaded.
[Graph image not reproduced. Node labels: swapper; init; rc; S85httpd; nice,initlog; httpd; sh,bash; socket; wget; /tmp/pt.tar.gz; pipe; tar; gzip; /tmp/xploit/ptrace; ptrace,newgrp; ptrace.]
