Xen and the art of virtualization

by Paul Barham, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris, Alex Ho, Rolf Neugebauer, Ian Pratt, Andrew Warfield
Memory (2003)

Available from portal.acm.org

Xen and the Art of Virtualization
Paul Barham∗, Boris Dragovic, Keir Fraser, Steven Hand, Tim Harris,
Alex Ho, Rolf Neugebauer†, Ian Pratt, Andrew Warfield
University of Cambridge Computer Laboratory
15 JJ Thomson Avenue, Cambridge, UK, CB3 0FD
{firstname.lastname}@cl.cam.ac.uk
ABSTRACT
Numerous systems have been designed which use virtualization to
subdivide the ample resources of a modern computer. Some require
specialized hardware, or cannot support commodity operating
systems. Some target 100% binary compatibility at the expense of
performance. Others sacrifice security or functionality for speed.
Few offer resource isolation or performance guarantees; most
provide only best-effort provisioning, risking denial of service.
This paper presents Xen, an x86 virtual machine monitor which
allows multiple commodity operating systems to share conventional
hardware in a safe and resource-managed fashion, but without
sacrificing either performance or functionality. This is achieved by
providing an idealized virtual machine abstraction to which
operating systems such as Linux, BSD and Windows XP can be ported
with minimal effort.
Our design is targeted at hosting up to 100 virtual machine
instances simultaneously on a modern server. The virtualization
approach taken by Xen is extremely efficient: we allow operating
systems such as Linux and Windows XP to be hosted simultaneously
for a negligible performance overhead, at most a few percent
compared with the unvirtualized case. We considerably outperform
competing commercial and freely available solutions in a range of
microbenchmarks and system-wide tests.
Categories and Subject Descriptors
D.4.1 [Operating Systems]: Process Management; D.4.2 [Operating
Systems]: Storage Management; D.4.8 [Operating Systems]:
Performance
General Terms
Design, Measurement, Performance
Keywords
Virtual Machine Monitors, Hypervisors, Paravirtualization
∗Microsoft Research Cambridge, UK
†Intel Research Cambridge, UK
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full citation on the first page. To copy otherwise, to
republish, to post on servers or to redistribute to lists, requires prior specific
permission and/or a fee.
SOSP’03, October 19–22, 2003, Bolton Landing, New York, USA.
Copyright 2003 ACM 1-58113-757-5/03/0010 ...$5.00.
1. INTRODUCTION
Modern computers are sufficiently powerful to use virtualization
to present the illusion of many smaller virtual machines (VMs),
each running a separate operating system instance. This has led to
a resurgence of interest in VM technology. In this paper we present
Xen, a high-performance resource-managed virtual machine monitor
(VMM) which enables applications such as server consolidation
[42, 8], co-located hosting facilities [14], distributed web
services [43], secure computing platforms [12, 16] and application
mobility [26, 37].
Successful partitioning of a machine to support the concurrent
execution of multiple operating systems poses several challenges.
Firstly, virtual machines must be isolated from one another: it is not
acceptable for the execution of one to adversely affect the
performance of another. This is particularly true when virtual machines
are owned by mutually untrusting users. Secondly, it is necessary
to support a variety of different operating systems to accommodate
the heterogeneity of popular applications. Thirdly, the performance
overhead introduced by virtualization should be small.
Xen hosts commodity operating systems, albeit with some source
modifications. The prototype described and evaluated in this paper
can support multiple concurrent instances of our XenoLinux guest
operating system; each instance exports an application binary
interface identical to a non-virtualized Linux 2.4. Our port of Windows
XP to Xen is not yet complete but is capable of running simple
user-space processes. Work is also progressing in porting NetBSD.
Xen enables users to dynamically instantiate an operating
system to execute whatever they desire. In the XenoServer project [15,
35] we are deploying Xen on standard server hardware at
economically strategic locations within ISPs or at Internet exchanges. We
perform admission control when starting new virtual machines and
expect each VM to pay in some fashion for the resources it requires.
We discuss our ideas and approach in this direction elsewhere [21];
this paper focuses on the VMM.
There are a number of ways to build a system to host multiple
applications and servers on a shared machine. Perhaps the simplest
is to deploy one or more hosts running a standard operating
system such as Linux or Windows, and then to allow users to install
files and start processes, with protection between applications being
provided by conventional OS techniques. Experience shows that
system administration can quickly become a time-consuming task
due to complex configuration interactions between supposedly
disjoint applications.
More importantly, such systems do not adequately support
performance isolation; the scheduling priority, memory demand,
network traffic and disk accesses of one process impact the
performance of others. This may be acceptable when there is adequate
provisioning and a closed user group (such as in the case of
computational grids, or the experimental PlanetLab platform [33]), but
not when resources are oversubscribed, or users uncooperative.
One way to address this problem is to retrofit support for
performance isolation to the operating system. This has been
demonstrated to a greater or lesser degree with resource containers [3],
Linux/RK [32], QLinux [40] and SILK [4]. One difficulty with
such approaches is ensuring that all resource usage is accounted to
the correct process: consider, for example, the complex
interactions between applications due to buffer cache or page replacement
algorithms. This is effectively the problem of “QoS crosstalk” [41]
within the operating system. Performing multiplexing at a low level
can mitigate this problem, as demonstrated by the Exokernel [23]
and Nemesis [27] operating systems. Unintentional or undesired
interactions between tasks are minimized.
We use this same basic approach to build Xen, which multiplexes
physical resources at the granularity of an entire operating system
and is able to provide performance isolation between them. In
contrast to process-level multiplexing this also allows a range of guest
operating systems to gracefully coexist rather than mandating a
specific application binary interface. There is a price to pay for this
flexibility: running a full OS is more heavyweight than running
a process, both in terms of initialization (e.g. booting or resuming
versus fork and exec), and in terms of resource consumption.
For our target of up to 100 hosted OS instances, we believe this
price is worth paying; it allows individual users to run unmodified
binaries, or collections of binaries, in a resource-controlled fashion
(for instance an Apache server along with a PostgreSQL backend).
Furthermore it provides an extremely high level of flexibility since
the user can dynamically create the precise execution environment
their software requires. Unfortunate configuration interactions
between various services and applications are avoided (for example,
each Windows instance maintains its own registry).
The remainder of this paper is structured as follows: in Section 2
we explain our approach towards virtualization and outline how
Xen works. Section 3 describes key aspects of our design and
implementation. Section 4 uses industry-standard benchmarks to
evaluate the performance of XenoLinux running above Xen in
comparison with stand-alone Linux, VMware Workstation and User-mode
Linux (UML). Section 5 reviews related work, and finally Section 6
discusses future work and concludes.
2. XEN: APPROACH & OVERVIEW
In a traditional VMM the virtual hardware exposed is functionally
identical to the underlying machine [38]. Although full
virtualization has the obvious benefit of allowing unmodified operating
systems to be hosted, it also has a number of drawbacks. This is
particularly true for the prevalent IA-32, or x86, architecture.
Support for full virtualization was never part of the x86
architectural design. Certain supervisor instructions must be handled by
the VMM for correct virtualization, but executing these with
insufficient privilege fails silently rather than causing a convenient
trap [36]. Efficiently virtualizing the x86 MMU is also difficult.
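The difference between a silent failure and a trap can be illustrated with a small toy model. This is only a sketch under stated assumptions, not code from Xen or any real VMM: the names ToyCPU, Trap, write_interrupt_flag and guest_disable_interrupts are hypothetical, and the interrupt-enable flag is used merely as a familiar example of state that a deprivileged guest kernel may try to change.

```python
from dataclasses import dataclass

IF_BIT = 1 << 9  # position of the interrupt-enable flag in the x86 EFLAGS register


class Trap(Exception):
    """Raised when the toy CPU hands control to the VMM (hypothetical)."""


@dataclass
class ToyCPU:
    """Hypothetical CPU model; not a description of real hardware behaviour."""
    privileged: bool                  # True when running at the most privileged level
    flags: int = IF_BIT               # interrupts start enabled
    traps_on_violation: bool = False  # False models the silent-failure behaviour

    def write_interrupt_flag(self, enable: bool) -> None:
        """Model an instruction that tries to change the interrupt flag."""
        if not self.privileged:
            if self.traps_on_violation:
                raise Trap("interrupt-flag write attempted without privilege")
            return  # silent failure: the flag is unchanged and nobody is notified
        self.flags = (self.flags | IF_BIT) if enable else (self.flags & ~IF_BIT)


def guest_disable_interrupts(cpu: ToyCPU) -> str:
    """A guest kernel tries to disable interrupts; report what happened."""
    try:
        cpu.write_interrupt_flag(False)
    except Trap:
        return "trapped"  # a VMM could now emulate the effect for this guest
    return "ignored" if cpu.flags & IF_BIT else "applied"
```

With the defaults, guest_disable_interrupts(ToyCPU(privileged=False)) returns "ignored": the guest believes interrupts are off, yet nothing changed and the VMM never had a chance to intervene. Setting traps_on_violation=True gives "trapped", which is the behaviour a classical trap-and-emulate VMM relies on.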
These problems can be solved, but only at the cost of increased
complexity and reduced performance. VMware’s ESX Server [10]
dynamically rewrites portions of the hosted machine code to insert
traps wherever VMM intervention might be required. This
translation is applied to the entire guest OS kernel (with associated
translation, execution, and caching costs) since all non-trapping
privileged instructions must be caught and handled. ESX Server
implements shadow versions of system structures such as page tables and
maintains consistency with the virtual tables by trapping every
update attempt; this approach has a high cost for update-intensive
operations such as creating a new application process.
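The cost of keeping a shadow page table consistent can be sketched with a toy model. The code below is an assumption-laden illustration and does not describe the actual ESX Server implementation: ShadowPagingVMM, guest_write_pte and create_process are hypothetical names, page tables are flattened to dictionaries, and each trapped update is simply counted.

```python
from typing import Dict, List


class ShadowPagingVMM:
    """Toy VMM that keeps a shadow page table in sync by trapping guest writes."""

    def __init__(self, guest_to_machine: Dict[int, int]) -> None:
        self.guest_to_machine = guest_to_machine  # guest-physical frame -> machine frame
        self.guest_table: Dict[int, int] = {}     # guest's view: virtual page -> guest frame
        self.shadow_table: Dict[int, int] = {}    # hardware's view: virtual page -> machine frame
        self.traps = 0                            # number of trapped page-table updates

    def guest_write_pte(self, vpage: int, guest_frame: int) -> None:
        """Every guest page-table write traps so that the shadow copy can follow."""
        self.traps += 1
        if guest_frame not in self.guest_to_machine:
            raise ValueError(f"guest frame {guest_frame} is not owned by this guest")
        self.guest_table[vpage] = guest_frame
        self.shadow_table[vpage] = self.guest_to_machine[guest_frame]


def create_process(vmm: ShadowPagingVMM, first_vpage: int, frames: List[int]) -> None:
    """Populate a fresh address space one page-table entry at a time."""
    for offset, frame in enumerate(frames):
        vmm.guest_write_pte(first_vpage + offset, frame)
```

In this model, creating a process that maps n pages costs n traps into the VMM, which is why update-intensive operations are expensive under this scheme.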
Notwithstanding the intricacies of the x86, there are other
arguments against full virtualization. In particular, there are situations
in which it is desirable for the hosted operating systems to see real
as well as virtual resources: providing both real and virtual time
allows a guest OS to better support time-sensitive tasks, and to
correctly handle TCP timeouts and RTT estimates, while exposing real
machine addresses allows a guest OS to improve performance by
using superpages [30] or page coloring [24].
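Why a guest may want both clocks can be seen in a small sketch. It assumes a simplified model in which virtual time advances only while the guest holds the CPU and real time advances always; GuestClocks and measure_rtt are hypothetical names, and the schedule values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class GuestClocks:
    """Two clocks a VMM could expose to a guest (toy model, milliseconds)."""
    real_ms: int = 0     # wall-clock time, advances regardless of scheduling
    virtual_ms: int = 0  # advances only while this guest holds the CPU

    def advance(self, ms: int, guest_running: bool) -> None:
        self.real_ms += ms
        if guest_running:
            self.virtual_ms += ms


def measure_rtt(schedule):
    """Return (real, virtual) elapsed time between sending a packet and its ACK.

    `schedule` is a list of (duration_ms, guest_running) slices covering the
    interval from the send to the moment the guest processes the ACK.
    """
    clocks = GuestClocks()
    for duration_ms, guest_running in schedule:
        clocks.advance(duration_ms, guest_running)
    return clocks.real_ms, clocks.virtual_ms
```

For the schedule [(10, True), (30, False), (10, True)], measure_rtt returns (50, 20): the exchange took 50 ms of real time, but a guest that sees only virtual time would record 20 ms. A guest with access to both clocks can use real time for network timers and virtual time for accounting its own CPU use.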
We avoid the drawbacks of full virtualization by presenting a
virtual machine abstraction that is similar but not identical to the
underlying hardware, an approach which has been dubbed
paravirtualization [43]. This promises improved performance, although
it does require modifications to the guest operating system. It is
important to note, however, that we do not require changes to the
application binary interface (ABI), and hence no modifications are
required to guest applications.
We distill the discussion so far into a set of design principles:
1. Support for unmodified application binaries is essential, or
users will not transition to Xen. Hence we must virtualize all
architectural features required by existing standard ABIs.
2. Supporting full multi-application operating systems is
important, as this allows complex server configurations to be
virtualized within a single guest OS instance.
3. Paravirtualization is necessary to obtain high performance
and strong resource isolation on uncooperative machine
architectures such as x86.
4. Even on cooperative machine architectures, completely
hiding the effects of resource virtualization from guest OSes
risks both correctness and performance.
Note that our paravirtualized x86 abstraction is quite different
from that proposed by the recent Denali project [44]. Denali is
designed to support thousands of virtual machines running network
services, the vast majority of which are small-scale and
unpopular. In contrast, Xen is intended to scale to approximately 100
virtual machines running industry-standard applications and services.
Given these very different goals, it is instructive to contrast Denali’s
design choices with our own principles.
Firstly, Denali does not target existing ABIs, and so can elide
certain architectural features from their VM interface. For
example, Denali does not fully support x86 segmentation although it is
exported (and widely used¹) in the ABIs of NetBSD, Linux, and
Windows XP.
Secondly, the Denali implementation does not address the
problem of supporting application multiplexing, nor multiple address
spaces, within a single guest OS. Rather, applications are linked
explicitly against an instance of the Ilwaco guest OS in a manner
rather reminiscent of a libOS in the Exokernel [23]. Hence each
virtual machine essentially hosts a single-user single-application
unprotected “operating system”. In Xen, by contrast, a single virtual
machine hosts a real operating system which may itself securely
multiplex thousands of unmodified user-level processes. Although
a prototype virtual MMU has been developed which may help
Denali in this area [44], we are unaware of any published technical
details or evaluation.
Thirdly, in the Denali architecture the VMM performs all paging
to and from disk. This is perhaps related to the lack of
memory-management support at the virtualization layer. Paging within the
¹ For example, segments are frequently used by thread libraries to address
thread-local data.
