Crossbow : From Hardware Virtualized NICs to Virtualized Networks
Most (2009)
- ISBN: 9781605585956
- DOI: 10.1145/1592648.1592658
Available from portal.acm.org
Abstract
This paper describes a new architecture for achieving network virtualization using virtual NICs (VNICs) as the building blocks. The VNICs can be associated with dedicated and independent hardware lanes that consist of dedicated NIC and kernel resources. Hardware lanes support dynamic polling, which enables the fair sharing of bandwidth with no performance penalty. VNICs ensure full separation of traffic for virtual machines within the host. A collection of VNICs on one or more physical machines can be connected to create a Virtual Wire by assigning them a common attribute such as a VLAN tag.
Crossbow: From Hardware Virtualized NICs to Virtualized Networks
Sunay Tripathi
sunay.tripathi@sun.com
Nicolas Droux
nicolas.droux@sun.com
Thirumalai Srinivasan
thirumalai.srinivasan@sun.com
Kais Belgaied
kais.belgaied@sun.com
Solaris Kernel Networking
Sun Microsystems, Inc.
17 Network Circle, Menlo Park, CA 94025, USA
Categories and Subject Descriptors
D.4.4 [Operating Systems]: Network communication; C.2.4
[Computer-Communication Networks]: Network operating systems
General Terms
Design, Performance, Security, Experimentation
Keywords
Virtualization, Networking, Performance, Hypervisor, VMs,
Zones, VLAN, Classification, Crossbow, vWire, VNICs
1. INTRODUCTION
Virtualization has seen tremendous growth in the last few
years. Hypervisor-based virtualization [2] and operating
system-based virtualization [18] [19] are two popular vir-
tualization technologies. More recently, I/O virtualization
(IOV), an exclusively hardware-based device and bus virtu-
alization technology [17], has also emerged.
Now that server virtualization has become mainstream,
the focus has shifted to network virtualization, where the
deployment model requires no interference between the
network traffic of different virtual machines. An ideal scenario
would be to assign a physical NIC to each virtual machine.
However, the number of I/O slots available on a machine is
fairly limited, and the cost per virtual machine would
increase due to the additional required power consumption,
physical connectivity, and administration overhead. Thus,
there is a need to securely share the NIC among the virtual
machines in a fair or policy-based manner.
Copyright 2009 Sun Microsystems, Inc.
VISA’09, August 17, 2009, Barcelona, Spain.
ACM 978-1-60558-595-6/09/08.
Crossbow is the code name for the new OpenSolaris net-
working stack that supports virtualization of a physical NIC
into multiple VNICs. It aggressively uses the NIC hardware
features for performance, security isolation, and virtualiza-
tion. A VNIC is assigned a dedicated hardware lane, which
consists of NIC resources such as receive and transmit rings,
dedicated software resources, and CPUs. These dedicated
resources establish an independent data path, which is free
of contention. If the hardware resources are exhausted, or
if the NIC does not support virtualization, then the stack
falls back to software based NIC virtualization, albeit at an
extra performance cost.
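The fallback described above can be sketched as follows. This is a minimal illustration, not the OpenSolaris implementation: the `Nic` and `Vnic` classes, the `create_vnic` function, and the idea of counting free ring groups are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Nic:
    """Hypothetical NIC model: a fixed pool of hardware ring groups."""
    free_ring_groups: int
    supports_virtualization: bool = True


@dataclass
class Vnic:
    name: str
    mode: str  # "hardware" (dedicated lane) or "software" (shared path)


def create_vnic(nic: Nic, name: str) -> Vnic:
    # Prefer a dedicated hardware lane while ring groups remain.
    if nic.supports_virtualization and nic.free_ring_groups > 0:
        nic.free_ring_groups -= 1
        return Vnic(name, "hardware")
    # Hardware exhausted or unsupported: fall back to software-based
    # virtualization, which works but costs extra CPU per packet.
    return Vnic(name, "software")


nic = Nic(free_ring_groups=2)
modes = [create_vnic(nic, f"vnic{i}").mode for i in range(3)]
```

With two free ring groups, the first two VNICs get hardware lanes and the third falls back to the software path.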
In this paper, we first take a look at the problems with
existing virtualization solutions. We then survey the recent
NIC hardware advancements that enable us to build the
Crossbow architecture. In Section 4, we describe the
Crossbow architecture itself, and in Section 5, we show how it is
used to build a Virtual Wire, a fully virtualized network
connecting virtual machines spread across one or more physical
machines. Finally, we look at other work happening in this
area and describe our future direction.
2. ISSUES IN EXISTING ARCHITECTURES
First consider some of the key requirements for network
virtualization and how existing hypervisor based and fully
hardware based solutions (IOV) attempt to meet those re-
quirements.
2.1 Fair Sharing
The ability to support fair sharing of the underlying phys-
ical NIC among its constituent virtual NICs is a key net-
work virtualization requirement. This is an issue for both
hypervisor-based virtualization and IOV virtualization. In
both cases, a virtual machine can monopolize usage of the
underlying physical NIC resources and bandwidth, if it is
capable of driving the physical link at line rate.
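To make "fair sharing" concrete, here is a small sketch of max-min fairness, one common definition of a fair split. It is an assumption chosen for illustration; the text above does not say which policy a given stack applies. The function name, the VM names, and the use of Gb/s as the unit are hypothetical.

```python
def max_min_fair_share(capacity, demands):
    """Split link capacity so that small demands are met in full and the
    remainder is divided equally among the flows that want more."""
    allocation = {}
    remaining = float(capacity)
    # Serve the smallest demands first so their unused share is
    # redistributed to the larger ones.
    pending = sorted(demands.items(), key=lambda item: item[1])
    for index, (name, demand) in enumerate(pending):
        equal_share = remaining / (len(pending) - index)
        allocation[name] = min(float(demand), equal_share)
        remaining -= allocation[name]
    return allocation


# A 10 Gb/s link shared by three virtual machines; vm_a can drive line rate.
shares = max_min_fair_share(10.0, {"vm_a": 10.0, "vm_b": 2.0, "vm_c": 6.0})
```

Under this policy vm_b receives its full 2 Gb/s and the other two are capped at 4 Gb/s each, so the VM that can drive the link at line rate cannot starve the others.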
Schedulers have focused mostly on sharing the CPU re-
sources, and network packet scheduling has been mostly left
to the NIC drivers, which are unaware of higher level services
or virtual machines. In [14] the authors discuss the problem
of schedulers not achieving the same level of fairness
with I/O-intensive workloads as they do with CPU-intensive
workloads.
2.2 Security
In the IOV model, a virtual machine is granted direct and
uncontrolled access to partitions of resources on the NIC,
thus maximizing the performance. However, this model has
security implications. In particular, firewall rules are not
handled by the IOV Virtual Functions (VFs), which opens
the door for a virtual machine to spoof its MAC or IP
addresses, generate bridge protocol data units (PDUs), or fake
routing messages and bring down the entire layer 2 or layer
3 network.
This issue is especially critical in cohosted environments
where multiple tenants share the same hardware.
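The kind of per-frame check that is bypassed when a virtual machine drives a VF directly can be sketched as below. This is an illustrative filter under stated assumptions, not a mechanism described in the text: `Frame`, `VnicPolicy`, and `may_transmit` are hypothetical names. The only external fact used is the IEEE 802.1D bridge group address, to which bridge PDUs are sent.

```python
from dataclasses import dataclass

# IEEE 802.1D bridge group address: destination of bridge PDUs.
BPDU_DST = "01:80:c2:00:00:00"


@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str
    src_ip: str


@dataclass(frozen=True)
class VnicPolicy:
    mac: str                # MAC address assigned to the virtual machine
    allowed_ips: frozenset  # IP addresses the virtual machine may use


def may_transmit(policy: VnicPolicy, frame: Frame) -> bool:
    if frame.src_mac.lower() != policy.mac.lower():
        return False  # MAC address spoofing
    if frame.src_ip not in policy.allowed_ips:
        return False  # IP address spoofing
    if frame.dst_mac.lower() == BPDU_DST:
        return False  # guest-generated bridge PDU
    return True


policy = VnicPolicy("02:08:20:aa:bb:01", frozenset({"192.0.2.10"}))
ok = may_transmit(policy, Frame("02:08:20:aa:bb:01", "02:08:20:aa:bb:02", "192.0.2.10"))
spoofed = may_transmit(policy, Frame("02:08:20:aa:bb:99", "02:08:20:aa:bb:02", "192.0.2.10"))
```

A check like this has to run somewhere between the virtual machine and the wire; with uncontrolled VF access there is no such point in the host software path.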
2.3 Performance
Performance is an issue for the hypervisor-based model
[12] [15]. On the receive side, considerable effort is spent on
bringing the packet into the system and performing software
classification before the destination virtual machine can be
identified. Then, the traffic needs to be passed to the virtual
machine through the hypervisor, which is also expensive.
In [15] the authors report substantial performance impact
and higher CPU utilization. In [24] [11] the authors present
specific performance optimizations such as using NIC
Hardware checksum and TCP Large Segment Offload, and report
significant gains. However, they point out that the
performance impact is still substantial, especially on the receive
side.
2.4 Virtual Machine Migration
The hypervisor based solutions support migration of vir-
tual machines from one host to another, but IOV-based solu-
tions don't easily lend themselves to virtual machine migra-
tion, since the virtual machine has a state associated with
the bare metal.
3. HARDWARE ADVANCEMENTS
Most modern NICs [7] [20] [13] support a combination of
hardware features, which are described in this section.
3.1 Multiple Receive and Transmit Rings and Groups
One of the main features of NICs is the support of multiple
receive and transmit rings. Each hardware ring or queue has
its own descriptors, and can be assigned its own resources on
the bus (DMA channels, MSI-X interrupts [17]) and on the
system (driver buffers, interrupted CPU). Multiple CPUs
can therefore cooperate in receiving and sending packets,
which is particularly needed when a single CPU is too slow.
Rings can be bundled in statically or dynamically formed
groups.
On the receive side, one or more MAC addresses, VLAN
tags, or both, can be associated with a ring group. A steering
logic on the NIC can deposit incoming packets to any ring
of the group that matches the packets' address or VLAN.
A load balancer (also known as Receive-Side Scaling (RSS)
engine) or a higher level classification engine determines the
actual recipient ring based on a matching hash value or
specific L3/L4 classification rules. Packets that do not match
any programmed MAC address (for example broadcasts) or
classification rule are delivered to a default ring on the NIC.
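The receive-side steering described above can be modeled in a few lines. This is a software sketch of what the NIC does in hardware, with several assumptions: `RingGroup`, `steer`, and the ring names are hypothetical, and CRC32 stands in for the hash a real RSS engine would compute over the flow fields.

```python
import zlib
from dataclasses import dataclass
from typing import List, Optional

DEFAULT_RING = "default-ring"


@dataclass
class RingGroup:
    mac: str                    # MAC address programmed for this group
    rings: List[str]            # receive rings that belong to the group
    vlan: Optional[int] = None  # None means the group accepts any VLAN


def steer(groups, dst_mac, vlan, flow):
    """Return the name of the receive ring for one incoming packet."""
    for group in groups:
        if group.mac == dst_mac and group.vlan in (None, vlan):
            # RSS-style spreading: hash the flow tuple so that every
            # packet of a flow lands on the same ring of the group.
            digest = zlib.crc32(repr(flow).encode())
            return group.rings[digest % len(group.rings)]
    # No programmed address matched (for example, a broadcast).
    return DEFAULT_RING


groups = [
    RingGroup("02:08:20:aa:bb:01", ["rx0", "rx1"], vlan=10),
    RingGroup("02:08:20:aa:bb:02", ["rx2", "rx3"]),
]
flow = ("198.51.100.7", "192.0.2.10", 40000, 80)
ring = steer(groups, "02:08:20:aa:bb:01", 10, flow)
broadcast_ring = steer(groups, "ff:ff:ff:ff:ff:ff", 10, flow)
```

The first stage (MAC and VLAN match) selects the group, and the second stage (the hash) selects a ring within it; anything unmatched goes to the default ring.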
On the transmit side, multiple rings enable the sending of
multiple packet flows through the same device in parallel.
Similar to receive rings, transmit rings can be bundled
together. A transmit ring group is a set of transmit rings with
the same capabilities (hardware checksum, LSO, and so on)
and a transmit load balancer. As of the time of writing this
paper, only a few vendors have announced future hardware
support of transmit ring grouping.
An advanced virtualization feature is the capability to cre-
ate a partition on the NIC with its own receive and transmit
ring groups, as implemented by Intel's Virtual Machine De-
vice Queues (VMDq) [8].
3.2 SR I/O Virtualization
The PCI-SIG consortium developed the Single Root I/O
Virtualization and Sharing (SR-IOV) [16] specification
primarily to address the issue of platform resource overhead
from the hypervisor trap imposed on all I/O operations
between virtual machines and devices, while preserving the
capability to share I/O devices among multiple virtual
machines. A device maps to a single physical function (PF)
on the bus, and can be partitioned into multiple virtual
functions (VFs). Independent interrupts, rings, and
address translation services allow virtual domains to control
their dedicated VF directly. Some NICs also offer on-board
switching of traffic between VFs.
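The PF/VF relationship can be pictured with a toy model in which a physical function hands out disjoint sets of rings to virtual functions. This is only a conceptual sketch: the class names, the `create_vf` method, and the ring-counting scheme are assumptions for illustration and do not correspond to any driver or PCI configuration interface.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class VirtualFunction:
    index: int
    owner: str        # virtual domain that controls this VF directly
    rings: List[int]  # rings dedicated to this VF


@dataclass
class PhysicalFunction:
    total_rings: int
    max_vfs: int
    vfs: Dict[int, VirtualFunction] = field(default_factory=dict)
    next_ring: int = 0

    def create_vf(self, owner: str, ring_count: int) -> VirtualFunction:
        if len(self.vfs) >= self.max_vfs:
            raise RuntimeError("no virtual functions left on this device")
        if self.next_ring + ring_count > self.total_rings:
            raise RuntimeError("not enough rings left on this device")
        # Each VF receives its own, non-overlapping range of rings.
        rings = list(range(self.next_ring, self.next_ring + ring_count))
        self.next_ring += ring_count
        vf = VirtualFunction(len(self.vfs), owner, rings)
        self.vfs[vf.index] = vf
        return vf


pf = PhysicalFunction(total_rings=8, max_vfs=2)
vf0 = pf.create_vf("vm_a", 4)
vf1 = pf.create_vf("vm_b", 4)
```

Because the ring sets never overlap, each virtual domain can operate its VF without coordinating with the others, which is the property the specification relies on to remove the hypervisor from the data path.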
4. CROSSBOW ARCHITECTURE
One of the key requirements of network virtualization is to
ensure that virtual machines are insulated from each other's
traffic.
As mentioned in Section 1, true isolation of a virtual
machine and its network traffic can be achieved by dedicating
a physical NIC and its associated network cable and port on
the switch to the virtual machine itself. If the MAC layer
can have dedicated resources for each physical NIC (without
any shared locks, queues, and CPUs) and the switch ensures
fairness on a per port basis, the traffic for one virtual
machine will not interfere with the traffic of another virtual
machine.
In the case where the physical resources, and in particular
the physical NIC, need to be shared among multiple virtual
machines, the next best option is to virtualize the NIC
hardware and the layer 2 stack such that the sharing is fair and
without any interference. The Crossbow architecture in the
OpenSolaris OS does exactly that by virtualizing the MAC
layer and taking advantage of NIC hardware capabilities to
ensure traffic separation between multiple virtual machines.
4.1 Virtualization lanes
A key tenet of the Crossbow design is the concept of
virtualization lanes. A lane consists of dedicated hardware and
software resources that can be assigned to a particular type
of traffic. Specifically, it consists of the following:
- NIC resources such as receive and transmit rings, interrupts, and MAC address slots.
- Driver resources such as DMA bindings.
- MAC layer resources such as data structures, locks, kernel queues and execution threads to process the


