TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones
Available from www2.seattle.intel-research.net
To appear at the 9th USENIX Symposium on Operating Systems Design and Implementation (OSDI’10)
William Enck
The Pennsylvania State University
Peter Gilbert
Duke University
Byung-Gon Chun
Intel Labs
Landon P. Cox
Duke University
Jaeyeon Jung
Intel Labs
Patrick McDaniel
The Pennsylvania State University
Anmol N. Sheth
Intel Labs
Abstract
Today’s smartphone operating systems frequently fail
to provide users with adequate control over and visibility
into how third-party applications use their private data.
We address these shortcomings with TaintDroid, an ef-
ficient, system-wide dynamic taint tracking and analy-
sis system capable of simultaneously tracking multiple
sources of sensitive data. TaintDroid provides realtime
analysis by leveraging Android’s virtualized execution
environment. TaintDroid incurs only 14% performance
overhead on a CPU-bound micro-benchmark and im-
poses negligible overhead on interactive third-party ap-
plications. Using TaintDroid to monitor the behavior of
30 popular third-party Android applications, we found
68 instances of potential misuse of users’ private infor-
mation across 20 applications. Monitoring sensitive data
with TaintDroid provides informed use of third-party ap-
plications for phone users and valuable input for smart-
phone security service firms seeking to identify misbe-
having applications.
1 Introduction
A key feature of modern smartphone platforms is a
centralized service for downloading third-party applica-
tions. The convenience to users and developers of such
“app stores” has made mobile devices more fun and use-
ful, and has led to an explosion of development. Apple’s
App Store alone served nearly 3 billion applications af-
ter only 18 months [4]. Many of these applications com-
bine data from remote cloud services with information
from local sensors such as a GPS receiver, camera, mi-
crophone, and accelerometer. Applications often have le-
gitimate reasons for accessing this privacy sensitive data,
but users would also like assurances that their data is used
properly. Incidents of developers relaying private infor-
mation back to the cloud [35, 12] and the privacy risks
posed by seemingly innocent sensors like accelerome-
ters [19] illustrate the danger.
Resolving the tension between the fun and utility of
running third-party mobile applications and the privacy
risks they pose is a critical challenge for smartphone plat-
forms. Mobile-phone operating systems currently pro-
vide only coarse-grained controls for regulating whether
an application can access private information, but pro-
vide little insight into how private information is actu-
ally used. For example, if a user allows an application
to access her location information, she has no way of
knowing if the application will send her location to a
location-based service, to advertisers, to the application
developer, or to any other entity. This lack of transparency
forces users to blindly trust that applications will properly
handle their private data.
This paper describes TaintDroid, an extension to the
Android mobile-phone platform that tracks the flow of
privacy sensitive data through third-party applications.
TaintDroid assumes that downloaded, third-party applications
are not trusted, and monitors, in realtime, how
these applications access and manipulate users’ personal
data. Our primary goals are to detect when sensitive data
leaves the system via untrusted applications and to facil-
itate analysis of applications by phone users or external
security services [33, 55].
Analysis of applications’ behavior requires sufficient
contextual information about what data leaves a device
and where it is sent. Thus, TaintDroid automatically
labels (taints) data from privacy-sensitive sources and
transitively applies labels as sensitive data propagates
through program variables, files, and interprocess mes-
sages. When tainted data are transmitted over the net-
work, or otherwise leave the system, TaintDroid logs the
data’s labels, the application responsible for transmitting
the data, and the data’s destination. Such realtime feed-
back gives users and security services greater insight into
what mobile applications are doing, and can potentially
identify misbehaving applications.
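As a rough illustration of the labeling and logging behavior described above, the following Python sketch attaches a set of labels to a value, unions the labels when values are combined, and records a log entry when tainted data reaches a network sink. The names `Tainted`, `combine`, `send`, and `LEAK_LOG`, as well as the application name and destinations, are hypothetical and are not part of TaintDroid.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Tainted:
    """A value paired with the set of taint labels attached to it."""
    value: str
    labels: frozenset = frozenset()

def combine(a: Tainted, b: Tainted) -> Tainted:
    """Derived data carries the union of its inputs' labels."""
    return Tainted(a.value + b.value, a.labels | b.labels)

LEAK_LOG = []  # hypothetical in-memory log of tainted transmissions

def send(app: str, destination: str, data: Tainted) -> None:
    """Stand-in for a network sink: log the transmission if the data is tainted."""
    if data.labels:
        LEAK_LOG.append({
            "app": app,
            "destination": destination,
            "labels": sorted(data.labels),
        })

# A location reading is labeled at its source ...
location = Tainted("40.79,-77.86", frozenset({"LOCATION"}))
# ... and the label follows the data into a derived request string.
request = combine(Tainted("GET /ads?loc="), location)
send("example.app", "ads.example.com", request)
# Untainted traffic from the same application produces no log entry.
send("example.app", "cdn.example.com", Tainted("GET /logo.png"))
```

Only the first transmission is logged, together with its labels, the sending application, and the destination, which are the three pieces of context listed above.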
To be practical, the performance overhead of the Taint-
Droid runtime must be minimal. Unlike existing so-
lutions that rely on heavy-weight whole-system emula-
tion [7, 57], we leveraged Android’s virtualized archi-
tecture to integrate four granularities of taint propaga-
tion: variable-level, method-level, message-level, and
file-level. Though the individual techniques are not
new, our contributions lie in the integration of these
techniques and in identifying an appropriate trade-off
between performance and accuracy for resource con-
strained smartphones. Experiments with our prototype
for Android show that tracking incurs a runtime over-
head of less than 14% for a CPU-bound microbench-
mark. More importantly, interactive third-party applica-
tions can be monitored with negligible perceived latency.
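The trade-off between performance and accuracy can be illustrated with a coarse message-level scheme. The sketch below assumes that message-level tracking stores a single tag per message, taken as the union of the tags of the values packed into it. This interpretation, and the names `pack_message` and `unpack_message`, are assumptions made for illustration only.

```python
def pack_message(fields):
    """Pack (value, labels) pairs into a message that keeps one tag overall."""
    message_tag = set()
    values = []
    for value, labels in fields:
        values.append(value)
        message_tag |= set(labels)  # per-field labels are merged and then lost
    return {"values": values, "tag": frozenset(message_tag)}

def unpack_message(message):
    """Every value read from the message receives the whole message tag."""
    return [(value, message["tag"]) for value in message["values"]]

msg = pack_message([
    ("40.79,-77.86", {"LOCATION"}),  # sensitive field
    ("en-US", set()),                # untainted field
])
received = unpack_message(msg)
```

Here the locale string `en-US` comes back labeled `LOCATION` even though it was never derived from the location. A coarser granularity accepts this kind of over-approximation in exchange for less storage and bookkeeping.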
We evaluated the accuracy of TaintDroid using 30 ran-
domly selected, popular Android applications that use lo-
cation, camera, or microphone data. TaintDroid correctly
flagged 105 instances in which these applications transmitted
tainted data; of the 105, we determined that 37 were clearly
legitimate, leaving 68 instances of potential misuse. TaintDroid also revealed that 15
of the 30 applications reported users’ locations to remote
advertising servers. Seven applications collected the de-
vice ID and, in some cases, the phone number and the
SIM card serial number. In all, two-thirds of the applica-
tions in our study used sensitive data suspiciously. Our
findings demonstrate that TaintDroid can help expose po-
tential misbehavior by third-party applications.
As with similar information-flow tracking systems [7,
57], a fundamental limitation of TaintDroid is that it can
be circumvented through leaks via implicit flows. The
use of implicit flows to avoid taint detection is, in and of
itself, an indicator of malicious intent, and may well be
detectable through other techniques such as automated
static code analysis [14, 46] as we discuss in Section 8.
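To see why implicit flows escape this kind of tracking, consider a tracker that propagates labels only through direct operations on data. In the hypothetical sketch below, the secret is never copied. Instead, each of its characters is compared against candidate digits inside a branch, and an untainted constant is appended when the comparison succeeds. The `Tainted` class and the `leak_via_control_flow` function are illustrative names only.

```python
import string

class Tainted(str):
    """A string that carries taint labels; explicit concatenation propagates them."""
    def __new__(cls, value, labels=frozenset()):
        obj = super().__new__(cls, value)
        obj.labels = frozenset(labels)
        return obj

    def __add__(self, other):
        other_labels = getattr(other, "labels", frozenset())
        return Tainted(str.__add__(self, other), self.labels | other_labels)

def leak_via_control_flow(secret):
    """Rebuild the secret using only branches on it, never a data copy."""
    copy = Tainted("")  # starts untainted
    for ch in secret:
        for candidate in string.digits:
            if ch == candidate:          # control dependence on tainted data
                copy = copy + candidate  # candidate is an untainted constant
    return copy

device_id = Tainted("3541", {"DEVICE_ID"})
explicit = Tainted("id=") + device_id        # explicit flow: label is propagated
implicit = leak_via_control_flow(device_id)  # implicit flow: label is lost
```

The reconstructed string has the same contents as the device ID but carries no label, so a sink that checks only labels would not flag it.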
The rest of this paper is organized as follows: Sec-
tion 2 provides a high-level overview of TaintDroid, Sec-
tion 3 describes background information on the Android
platform, Section 4 describes our TaintDroid design,
Section 5 describes the taint sources tracked by Taint-
Droid, Section 6 presents results from our Android ap-
plication study, Section 7 characterizes the performance
of our prototype implementation, Section 8 discusses the
limitations of our approach, Section 9 describes related
work, and Section 10 summarizes our conclusions.
2 Approach Overview
We seek to design a framework that allows users to
monitor how third-party smartphone applications handle
their private data in realtime. Many smartphone applications
are closed-source; therefore, static source code
analysis is infeasible. Even if source code is available,
runtime events and configuration often dictate informa-
tion use; realtime monitoring accounts for these environ-
ment specific dependencies.
Monitoring network disclosure of privacy sensitive in-
formation on smartphones presents several challenges:
• Smartphones are resource constrained. The resource
limitations of smartphones preclude the use
of heavyweight information tracking systems such
as Panorama [57].
• Third-party applications are entrusted with several
types of privacy sensitive information. The mon-
itoring system must distinguish multiple informa-
tion types, which requires additional computation
and storage.
• Context-based privacy sensitive information is dy-
namic and can be difficult to identify even when
sent in the clear. For example, geographic locations
are pairs of floating point numbers that frequently
change and are hard to predict.
• Applications can share information. Limiting the
monitoring system to a single application does not
account for flows via files and IPC between applica-
tions, including core system applications designed
to disseminate privacy sensitive information.
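One common way to distinguish several information types at low cost is to give each type one bit in an integer tag, so that combining tags is a single bitwise OR. The sketch below assumes this encoding. The `Taint` flag names are hypothetical and chosen only to mirror information types mentioned in this paper.

```python
from enum import IntFlag

class Taint(IntFlag):
    """One bit per information type (names are illustrative)."""
    CLEAR = 0
    LOCATION = 1 << 0
    CAMERA = 1 << 1
    MICROPHONE = 1 << 2
    DEVICE_ID = 1 << 3

def merge(a: Taint, b: Taint) -> Taint:
    """Data derived from two inputs carries both inputs' types."""
    return a | b

def types_in(tag: Taint):
    """List the individual information types present in a tag."""
    return [t.name for t in Taint if t.value and t & tag]

tag = merge(Taint.LOCATION, Taint.DEVICE_ID)
print(types_in(tag))
```

Under this assumption, the additional storage is one integer per tracked value, and the additional computation is one OR per operation that combines values.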
We use dynamic taint analysis [57, 44, 8, 61, 39] (also
called “taint tracking”) to monitor privacy sensitive in-
formation on smartphones. Sensitive information is first
identified at a taint source, where a taint marking indi-
cating the information type is assigned. Dynamic taint
analysis tracks how labeled data impacts other data in a
way that might leak the original sensitive information.
This tracking is often performed at the instruction level.
Finally, the impacted data is identified before it leaves
the system at a taint sink (usually the network interface).
Existing taint tracking approaches have several lim-
itations. First and foremost, approaches that rely on
instruction-level dynamic taint analysis using whole system
emulation [57, 7, 26] incur high performance penalties.
Instruction-level instrumentation incurs a 2-20 times
slowdown [57, 7] in addition to the slowdown introduced
by emulation, which is not suitable for realtime analysis.
Second, developing accurate taint propagation logic has
proven challenging for the x86 instruction set [40, 48].
Implementations of instruction-level tracking can experi-
ence taint explosion if the stack pointer becomes falsely
tainted [49] and taint loss if complicated instructions
such as CMPXCHG and REP MOV are not instrumented
properly [61]. While most smartphones use the ARM
instruction set, similar false positives and false negatives
could arise.
Figure 1 presents our approach to taint tracking on
smartphones. We leverage architectural features of vir-
tual machine-based smartphones (e.g., Android, Black-
Berry, and J2ME-based phones) to enable efficient,