Information Flow Control for Standard OS Abstractions

Maxwell Krohn, Alexander Yip, Micah Brodsky, Natan Cliffer, M. Frans Kaashoek, Eddie Kohler†, Robert Morris
MIT CSAIL, †UCLA
http://flume.csail.mit.edu/

ABSTRACT

Decentralized Information Flow Control (DIFC) [24] is an approach to security that allows application writers to control how data flows between the pieces of an application and the outside world. As applied to privacy, DIFC allows untrusted software to compute with private data while trusted security code controls the release of that data. As applied to integrity, DIFC allows trusted code to protect untrusted software from unexpected malicious inputs. In either case, only bugs in the trusted code, which tends to be small and isolated, can lead to security violations.

We present Flume, a new DIFC model and system that applies at the granularity of operating system processes and standard OS abstractions (e.g., pipes and file descriptors). Flume eases DIFC's use in existing applications and allows safe interaction between conventional and DIFC-aware processes. Flume runs as a user-level reference monitor on Linux. A process confined by Flume cannot perform most system calls directly; instead, an interposition layer replaces system calls with IPC to the reference monitor, which enforces data flow policies and performs safe operations on the process's behalf. We ported a complex Web application (MoinMoin wiki) to Flume, changing only 2% of the original code. The Flume version is roughly 30–40% slower due to overheads in our current implementation but supports additional security policies impossible without DIFC.
Categories and Subject Descriptors: D.4.6 [Operating Systems]: Security and Protection: Information flow controls, Access controls; D.4.7 [Operating Systems]: Organization and Design; C.5.5 [Computer System Implementation]: Servers

General Terms: Security, Design, Performance

Keywords: decentralized information flow control, DIFC, endpoints, reference monitor, system call interposition, Web services

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. SOSP'07, October 14–17, 2007, Stevenson, Washington, USA. Copyright 2007 ACM 978-1-59593-591-5/07/0010 ... $5.00.

1 INTRODUCTION

As modern applications grow in size, complexity and dependence on third-party software, they become more susceptible to security flaws. Decentralized information flow control (DIFC) [24], a variant of classic information flow control [1, 2, 6], can improve the security of complex applications, even in the presence of potential exploits. Existing DIFC systems operate as programming language abstractions [24] or are integrated into communication primitives in new operating systems [8, 38]. These approaches have advantages, such as fine-grained control of information flow and high performance, but require a shift in how applications are developed. Flume instead provides process-level DIFC as a minimal extension to the communication primitives in existing operating systems, making DIFC work with the languages, tools, and operating system abstractions already familiar to programmers.
The Flume system provides DIFC at the granularity of processes, and integrates DIFC controls with standard communication abstractions such as pipes, sockets, and file descriptors, via a user-level reference monitor. Its interface helps programmers secure existing applications and write new ones with existing tools and libraries. Flume enforces the DIFC policy as the application runs.

A typical Flume application consists of processes of two types. Untrusted processes do most of the computation. They are constrained by, but possibly unaware of, DIFC controls. Trusted processes, in contrast, are aware of DIFC and set up the privacy and integrity controls that constrain untrusted processes. Trusted processes also have the privilege to selectively violate classical information flow control, for instance by declassifying private data (perhaps to export it from the system) or by endorsing data as high-integrity. This privilege is distributed among the trusted processes according to application policy, making it decentralized (the "D" in DIFC). Though bugs in the trusted code can lead to compromise, bugs elsewhere in the application cannot, and trusted code can stay relatively isolated and concise even as the application expands.

A central challenge for Flume is to accommodate processes that use existing communication interfaces such as sockets and pipes but also need to specify how and when they use their privileges. It would be awkward to, for example, modify each call to read or write to take arguments indicating whether privilege should be applied. Worse, the conventional process interface is rife with channels that "leak" information, like network sockets. A system could simply mark these channels off-limits, restricting the process interface to those system calls with obvious and controllable information flow, but this approach would make many libraries unusable.
Flume instead seeks to restrict access to these uncontrolled channels only when necessary.

Our solution is an endpoint abstraction. Flume represents each resource a process uses to communicate (including pipes, sockets, files, and network connections) as an endpoint. A process can specify what subset of its privileges should be exercised when communicating through each endpoint. Uncontrolled channels are modeled as endpoints that exit the DIFC system; Flume ensures that no process can have both an uncontrolled channel and access to private data it cannot declassify.

We built Flume in user space (with a few small kernel patches) for implementation convenience and portability: the implementation runs on Linux and OpenBSD. Unlike prior systems that provide DIFC as part of a new kernel design (e.g., Asbestos [8] and HiStar [38]), Flume takes advantage of large existing efforts to maintain and improve the kernel support for hardware, NFS, RAID, SMP, etc. The disadvantage is that Flume's trusted computing base is many times larger than those of dedicated DIFC kernels, leaving
it vulnerable to security flaws in the underlying operating system. Also, Flume's user-space implementation incurs some performance penalties and may expose covert channels that deeper kernel integration would close.

To evaluate Flume's programmability, we ported a complex and popular application, MoinMoin wiki [22], to the Flume system. MoinMoin is a feature-rich Web document sharing system (91,000 lines of Python code), with support for access control lists, indexing, Web-based editing, versioning, syntax highlighting for source code, downloadable "skins", etc. We captured Moin's access control policies with DIFC-based equivalents, thereby moving the security logic out of the main application and into a small, isolated security module about a thousand lines long. Only bugs in the security module, as opposed to the large tangle of Moin code and its plug-ins, can compromise end-to-end security. We also implemented a Moin security policy that could not exist without Flume: end-to-end integrity protection. Moin can pull third-party plug-ins into its address space, but with end-to-end integrity protection, users can enforce that selected plug-ins never touch (and potentially corrupt) their sensitive data, either on input or output.

FlumeWiki achieves these security goals with only a thousand lines of modification to the original MoinMoin system (in addition to the new security module). Though prior DIFC work has succeeded in sandboxing legacy applications [38] or rewriting them anew [8, 15], the "drop-in" replacement of an existing access control policy with a DIFC-based one is a new result. In at least three cases, FlumeWiki closes security holes in the original Moin. Experiments with FlumeWiki on Linux show that the new system is 43% slower than the original on read workloads and 34% slower on write workloads. Slow-downs are due primarily to Flume's user-space implementation.
We expect that for many Web sites the prototype's performance is adequate.

This paper's contributions include:

• New DIFC rules that fit standard operating system abstractions well and that are simpler than those of Asbestos and HiStar. Flume's DIFC rules are close to rules for classic "centralized" information flow control [1, 2, 6], with small extensions for decentralization and communication abstractions found in widely-used operating systems.
• The first design and implementation of process-level DIFC for stock operating systems (OpenBSD and Linux).
• Refinements to Flume DIFC required to build real systems, such as machine cluster support, and DIFC primitives that scale to large numbers of users.
• A full-featured DIFC Web site (FlumeWiki) with novel end-to-end integrity guarantees, composed largely of existing code.

All security claims made for Flume rely on two important assumptions. First, machines running Flume must not have security bugs that result in super-user privileges. Second, though the Flume design makes a concerted effort to close all known covert storage channels (and we indicate where it falls short), Flume assumes processes running on the machine do not leak data via covert timing channels. For example, an exploit might transmit sensitive information by carefully modulating its CPU use in a way observable by other processes. Covert timing channels are present in all existing DIFC systems; they can be reduced but not eliminated, particularly for systems connected to the network [38].

The rest of this paper proceeds as follows: Section 2 describes related work. Section 3 considers an abstract definition for DIFC, while Section 4 presents endpoints and the instantiation of DIFC on Unix. Sections 5 and 6 describe the design of Flume and its file system, and Section 7 describes FlumeWiki. Section 8 provides a performance evaluation, and Section 9 concludes.
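
As a rough illustration of the endpoint rule stated earlier in this section (no process may hold both an uncontrolled channel and private data it cannot declassify), the check can be sketched in Python. This is a minimal sketch under stated assumptions, not Flume's API: the class, field, and function names are hypothetical, and labels are modeled as plain sets of tags. The precise rules are the subject of Sections 3 and 4.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Process:
    """Hypothetical model of a confined process (not Flume's real data structure)."""
    name: str
    secrecy: frozenset      # tags for private data the process may have read
    declassify: frozenset   # tags the process is privileged to declassify


def may_hold_uncontrolled_channel(proc: Process) -> bool:
    """Allow an uncontrolled channel (e.g., a network socket) only if the
    process could declassify every piece of private data it has access to."""
    return proc.secrecy <= proc.declassify


# Illustrative processes; the names and tags are made up for this sketch.
untrusted = Process("wiki-renderer", frozenset({"alice_private"}), frozenset())
trusted = Process("declassifier", frozenset({"alice_private"}),
                  frozenset({"alice_private"}))
clean = Process("static-server", frozenset(), frozenset())

for p in (untrusted, trusted, clean):
    print(p.name, may_hold_uncontrolled_channel(p))
```

In this simplified model, the untrusted process that has read Alice's private data is refused an uncontrolled channel, while the trusted declassifier and the process holding no private data are allowed one.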
2 RELATED WORK

There has been much work on improving security on stock operating systems, including buffer overrun protection (e.g., [5, 18]), system call interposition (e.g., [11, 12, 16, 27, 32]), isolation techniques (e.g., [13, 17]), virtual machines (e.g., [14, 33, 35]), and recovering from compromises (e.g., [7]). Flume uses some of these techniques for its implementation (e.g., LSMs [36] and systrace [27]), but Flume is more closely related to mandatory access control and specifically decentralized information flow control.

Mandatory access control (MAC) [28] refers to a system security plan in which security policies are mandatory and not enforced at the discretion of the application writers. In many such systems, software components may be allowed to read private data but are forbidden from revealing it. Traditional MAC systems intend that an administrator set a single system-wide policy. When servers run multiple third-party applications, however, administrators cannot understand every application's detailed security logic. DIFC promises to support such situations better than most MAC mechanisms, because it partially delegates the setting of policy to the individual applications.

SELinux [20] and TrustedBSD [34] are recent examples of stock operating systems modified to support many MAC policies. They include interfaces for a security officer to dynamically insert security policies into the kernel, which then limit the behavior of kernel abstractions like inodes and tasks [30]. Flume, like SELinux, uses the Linux security module (LSM) framework in its implementation [36]. However, SELinux and TrustedBSD do not allow untrusted applications to define and update security policies (as in DIFC). If SELinux and TrustedBSD were to provide such an API, they would need to address the challenges considered in this paper.

TightLip [37] implements a specialized form of IFC that prevents privacy leaks in legacy applications.
TightLip users tag their private data and TightLip prevents that private data from leaving the system via untrusted processes. Unlike TightLip, Flume and other DIFC systems (e.g., Asbestos and HiStar) support multiple security classes, which enable safe commingling of private data and security policies other than privacy protection. IX [21] and LOMAC [10] add information flow control to Unix, but again with support for only centralized policy decisions. Flume faces some of the same Unix-related problems as these systems, such as shared file descriptors that become storage channels.

Myers and Liskov introduced a decentralized information flow control model [23], thereby relaxing the restriction in previous information flow control systems that only a security officer could declassify. JFlow and its successor Jif are Java-based programming languages that enforce DIFC within a program, providing finer-grained control than Flume [24]. One benefit is that Jif can limit declassification privileges to specific function(s) within a process, rather than (as with Flume) to the entire process. On the other hand, Jif requires applications (such as Web services [4]) to be rewritten, while Flume provides better support for applying DIFC to existing software. Flume's DIFC rules (Section 3) are inspired by Jif's but split readership and ownership into separate per-process labels, and allow ownership to be transferred as capabilities. Flume's endpoints (Section 4) provide the glue between software written for an existing API and DIFC.

Asbestos [3, 8] and HiStar [38] incorporate DIFC into new operating systems, applying labels at the granularity of unreliable messages between processes (Asbestos) or threads, gates, and segments (HiStar). Flume's labels are influenced by Asbestos's and