AC : Composable Asynchronous IO for Native Languages

by Tim Harris, Martin Abadi, Rebecca Isaacs, Ross McIlroy
In Proceedings of the 2011 Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA 2011)

Tim Harris† Martín Abadi‡⋆ Rebecca Isaacs‡ Ross McIlroy†
Microsoft Research, Cambridge† Microsoft Research, Silicon Valley‡
University of California, Santa Cruz⋆ Collège de France⋆
tharris@microsoft.com abadi@microsoft.com risaacs@microsoft.com rmcilroy@microsoft.com
Abstract
This paper introduces AC, a set of language constructs for
composable asynchronous IO in native languages such as
C/C++. Unlike traditional synchronous IO interfaces, AC
lets a thread issue multiple IO requests so that they can
be serviced concurrently, and so that long-latency operations
can be overlapped with computation. Unlike traditional
asynchronous IO interfaces, AC retains a sequential
style of programming without requiring code to use multiple
threads, and without requiring code to be “stack-ripped”
into chains of callbacks. AC provides an async statement to
identify opportunities for IO operations to be issued concurrently,
a do..finish block that waits until any enclosed
async work is complete, and a cancel statement that
requests cancellation of unfinished IO within an enclosing
do..finish. We give an operational semantics for a core
language. We describe and evaluate implementations that are
integrated with message passing on the Barrelfish research
OS, and integrated with asynchronous file and network IO
on Microsoft Windows. We show that AC offers comparable
performance to existing C/C++ interfaces for asynchronous
IO, while providing a simpler programming model.
Categories and Subject Descriptors D.1.3 [Programming
Techniques]: Concurrent Programming; D.3.3 [Programming
Languages]: Language Constructs and Features—
Input/output; D.4.4 [Operating Systems]: Communications
Management—Message sending
General Terms Languages, Performance
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
OOPSLA’11, October 22–27, 2011, Portland, Oregon, USA.
Copyright © 2011 ACM 978-1-4503-0940-0/11/10...$10.00
1. Introduction
In the future, processors are likely to provide a heterogeneous
mix of core types without hardware cache coherence
across a whole machine. In the Barrelfish project we are
investigating how to design an operating system (OS) for this
kind of hardware, in which we can no longer rely on traditional
shared-memory within the OS [4].
The approach we are taking is to construct the OS around
separate per-core kernels, and to use message passing for
communication between system processes running on different
cores. Other contemporary OS research projects take
a similar approach [38]. Our hypothesis is that systems built
on message passing can be mapped to a wide variety of processor
architectures without large-scale re-implementation.
Using message passing lets us accommodate machines with
heterogeneous core types, and machines without cache-coherence;
we can map message passing operations onto
specialized messaging instructions [18, 34], and we can map
them onto shared-memory buffers on current hardware [4].
However, it is difficult to write scalable low-level software
using message passing. Existing systems focus either
on ease-of-programming (by providing simple synchronous
send/receive operations), or on performance (typically by
providing asynchronous operations that execute a callback
function once a message has been sent or received). The
same tension exists in IO interfaces more generally [27, 35].
For example, the Microsoft Windows APIs require software
to choose between synchronous operations which allow only
one concurrent IO request per thread, and complex asynchronous
operations which allow multiple IO requests.
We believe that the inevitable and disruptive evolution of
hardware to non-cache-coherent, heterogeneous, multi-core
systems makes support for asynchronous IO in low-level
languages such as C/C++ both essential and timely.
In this paper we introduce AC (“Asynchronous C”), a new
approach to writing programs using asynchronous IO (AIO).
AC provides a lightweight form of AIO that can be added
incrementally to software, without the use of callbacks, events,
or multiple threads.
Our overall approach is for the programmer to start out
with simple synchronous IO operations, and to use new
language constructs to identify opportunities for the language
runtime system to start multiple IO operations asynchronously.
As a running example, consider a Lookup function that
sends a message to a name-service process, and then receives
back an address that the name maps to. Figure 1 shows this
function written using Barrelfish’s callback-based interface.
void Lookup(NSChannel_t *c, char *name) {
  OnRecvLookupResponse(c, &ResponseHandler);
  // Store state needed by send handler
  c->st = name;
  OnSend(c, &SendHandler);
}
void ResponseHandler(NSChannel_t *c, int addr) {
  printf("Got response %d\n", addr);
}
void SendHandler(NSChannel_t *c) {
  if (OnSendLookupRequest(c, (char*)(c->st)) == BUSY) {
    OnSend(c, &SendHandler);
  }
}
Figure 1. Querying a name server using Barrelfish’s
callback-based interface for message passing.
The Lookup function takes a reference to a channel (c).
The function registers a ResponseHandler callback to
execute when a LookupResponse reply is received. It
then registers a SendHandler callback to execute when
channel c has space for the outgoing message. (Many hardware
implementations of message passing provide bounded-size
message channels, and so it can be impossible to send a
message immediately.) In addition, Lookup needs to record
name in a temporary data structure so that it is available to
SendHandler. The On* functions are generated automatically
from an interface definition for the NSChannel_t
channel.
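The retry pattern described above can be sketched in plain C. This is a toy model and not Barrelfish code: the Chan type, the two-slot capacity, and the functions chan_try_send, chan_on_send, chan_dispatch, and chan_drain are all hypothetical names chosen for illustration. The sketch only shows why the send handler has to re-register itself when a bounded channel is full, and why the name must be stashed in per-channel state.

```c
#include <assert.h>
#include <stddef.h>

#define CHAN_SLOTS 2                 /* assumed capacity of the toy channel */
enum { SEND_OK = 0, SEND_BUSY = 1 };

typedef struct Chan Chan;
typedef void (*SendCb)(Chan *c);

struct Chan {
    const char *slots[CHAN_SLOTS];   /* queued outgoing messages */
    int used;                        /* number of occupied slots */
    SendCb on_send;                  /* one-shot callback, or NULL */
    const void *st;                  /* state saved for the callback */
};

/* Try to enqueue a message; report SEND_BUSY when the channel is full. */
static int chan_try_send(Chan *c, const char *msg) {
    if (c->used == CHAN_SLOTS) return SEND_BUSY;
    c->slots[c->used++] = msg;
    return SEND_OK;
}

/* Register a one-shot callback to run on the next dispatch. */
static void chan_on_send(Chan *c, SendCb cb) { c->on_send = cb; }

/* Toy event-loop step: run the registered callback, if any. The callback
   may run while the channel is still full, so it must check for BUSY. */
static void chan_dispatch(Chan *c) {
    SendCb cb = c->on_send;
    c->on_send = NULL;
    if (cb != NULL) cb(c);
}

/* Simulate the receiver consuming the oldest queued message. */
static void chan_drain(Chan *c) {
    assert(c->used > 0);
    for (int i = 1; i < c->used; i++) c->slots[i - 1] = c->slots[i];
    c->used--;
}

/* Same shape as SendHandler in Figure 1: on BUSY, register again. */
static void send_handler(Chan *c) {
    if (chan_try_send(c, (const char *)c->st) == SEND_BUSY)
        chan_on_send(c, send_handler);
}

/* Same shape as Lookup in Figure 1: stash the name, then wait for space. */
static void lookup(Chan *c, const char *name) {
    c->st = name;
    chan_on_send(c, send_handler);
}
```

In this model a caller fills the channel, calls lookup, and then alternates chan_dispatch and chan_drain; the request is only queued on a dispatch that follows a drain. Even in this small case the logic of one send is spread over two functions and a state field.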
With AC, the “lookup” example becomes a single function
using synchronous Send/Recv operations, shown below. (We omit
some details to do with cancellation of unfinished IO operations;
we return to cancellation in Section 2.)
// Caution: functions ending in AC may block
void LookupAC(NSChannel_t *c, char *name) {
  int addr;
  SendLookupRequestAC(c, name);
  RecvLookupResponseAC(c, &addr);
  printf("Got response %d\n", addr);
}
Compared with the callback-based implementation, this
LookupAC function is clearly much simpler: it avoids the
need for “stack-ripping” [3] in which the logical flow between
operations is split across a series of callbacks. AC
leads to a form of composability that is lost with stack-ripping.
A function can simply call into other functions
using AC, and it can start multiple AC operations concurrently.
For instance, to communicate with two name servers,
one can write:
void TwinLookupAC(NSChannel_t *c1,
                  NSChannel_t *c2,
                  char *name) {
  do {
    async LookupAC(c1, name); // S1
    async LookupAC(c2, name); // S2
  } finish;
  printf("Got both responses\n"); // S3
}
The async at statement S1 indicates that execution can
continue to statement S2 if the first lookup needs to block.
The do..finish construct indicates that execution cannot
continue to statement S3 until both S1 and S2 have been
executed to completion.
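For comparison, the join that do..finish expresses can be written by hand in callback style. The following is a minimal sketch in standard C, not a description of how AC is implemented: FinishBlock and the fb_* functions are hypothetical names, and the completion of each lookup is assumed to be reported by some event loop calling fb_async_done. The block body holds one count of its own so that the continuation (S3) cannot run before control reaches the end of the block, even if an operation completes immediately.

```c
#include <assert.h>
#include <stdio.h>

typedef void (*Continuation)(void *arg);

typedef struct {
    int pending;         /* blocked async operations, plus 1 for the body */
    Continuation after;  /* the code that follows "finish" (S3) */
    void *arg;           /* argument passed to the continuation */
} FinishBlock;

/* Drop one count; run the continuation when nothing is outstanding. */
static void fb_release(FinishBlock *fb) {
    assert(fb->pending > 0);
    if (--fb->pending == 0) fb->after(fb->arg);
}

/* Corresponds to "do {": the body itself holds one count. */
static void fb_begin(FinishBlock *fb, Continuation after, void *arg) {
    fb->pending = 1;
    fb->after = after;
    fb->arg = arg;
}

/* An async'd operation blocked; the body continues past it. */
static void fb_async_start(FinishBlock *fb) { fb->pending++; }

/* A previously blocked async'd operation ran to completion. */
static void fb_async_done(FinishBlock *fb) { fb_release(fb); }

/* Corresponds to "} finish;": the body gives up its own count. */
static void fb_end(FinishBlock *fb) { fb_release(fb); }

/* Stand-in for S3; counts how many times it has run. */
static void got_both(void *arg) {
    printf("Got both responses\n");
    ++*(int *)arg;
}
```

A caller would invoke fb_begin, call fb_async_start once per blocked lookup, call fb_end, and then let each response handler call fb_async_done. With AC, all of this bookkeeping is implied by the block structure of TwinLookupAC.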
Throughout AC, we keep the abstractions used for asynchrony
separate from the abstractions used for parallel programming;
code remains single-threaded unless the programmer
explicitly introduces parallelism. The async and
do..finish constructs are solely there to identify opportunities
for multiple messages to be issued concurrently;
unlike the async construct in X10 [7], our async does
not introduce parallelism. Consequently, many of our examples
can be written with no concurrency-control beyond the
block-structured synchronization of do..finish.
We make a number of additional contributions beyond
the core design of AC. We introduce a new block-structured
cancellation mechanism. This approach to cancellation provides
a modular way for a program to start asynchronous
operations and then to cancel them if they have not yet completed;
e.g., adding a timeout around a function that is called.
In TwinLookupAC, cancellation could be used to abandon
one lookup as soon as the other lookup is complete. In contrast
to our approach, traditional cancellation mechanisms
are directed at individual IO operations [1], or at groups of
operations on the same file, or at a complete thread (e.g.,
alerting in Modula-2+ [5]).
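The “first response wins” use of cancellation mentioned for TwinLookupAC can be modeled with a small amount of bookkeeping. The sketch below is an illustration in standard C under stated assumptions, not AC's cancel statement or its semantics: CancelScope, the scope_* functions, and the fixed limit of four operations are hypothetical. It shows only the block-structured idea that cancellation applies to every unfinished operation started within one scope, rather than to a single IO request or a whole thread.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_OPS 4   /* assumed upper bound on operations per scope */

typedef enum { OP_PENDING, OP_DONE, OP_CANCELLED } OpState;

typedef struct {
    OpState state[MAX_OPS];  /* state of each operation in the scope */
    int count;               /* number of operations started so far */
} CancelScope;

/* Start an operation inside the scope and return its index. */
static int scope_start(CancelScope *s) {
    assert(s->count < MAX_OPS);
    s->state[s->count] = OP_PENDING;
    return s->count++;
}

/* Cancel every operation in the scope that has not finished yet. */
static void scope_cancel(CancelScope *s) {
    for (int i = 0; i < s->count; i++)
        if (s->state[i] == OP_PENDING) s->state[i] = OP_CANCELLED;
}

/* Record a completion; returns false if the operation was already
   cancelled or completed, in which case the late result is ignored. */
static bool scope_complete(CancelScope *s, int op) {
    assert(op >= 0 && op < s->count);
    if (s->state[op] != OP_PENDING) return false;
    s->state[op] = OP_DONE;
    return true;
}

/* True once no operation in the scope is still pending. */
static bool scope_finished(const CancelScope *s) {
    for (int i = 0; i < s->count; i++)
        if (s->state[i] == OP_PENDING) return false;
    return true;
}

/* Response handler: the first reply wins and abandons the others. */
static void on_response(CancelScope *s, int op) {
    if (scope_complete(s, op)) scope_cancel(s);
}
```

Here two lookups would each call scope_start, and whichever reply arrives first marks the other lookup as cancelled, after which scope_finished reports that the enclosing block may exit.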
We introduce AC in more detail in Section 2. In Section 3
we present a formal operational semantics for a core language
modeling AC, including cancellation. We give the semantics
and discuss properties that it satisfies.
Section 4 describes two implementations of AC. The first
implementation uses a modified Clang/LLVM tool-chain to
add the AC operations to C/C++. The second implementation
operates with Clang, or with GCC, and defines the
AC constructs using a combination of macros and existing
C/C++ extensions provided by these compilers. The second
implementation has slightly higher overheads than the first.
In Section 5 we look at the performance of implementations
that are integrated with message passing on Barrelfish,
and also at implementations that are integrated with
asynchronous IO on Microsoft Windows. In each case, AC
achieves most of the performance of manually written stack-ripped
code while providing a programming model that is
comparable to basic synchronous IO (and comparable to the
recent C# and F#-based abstractions for performing asynchronous
IO [32]). We discuss related work and conclude in
Sections 6 and 7.
2. Composable Asynchronous IO
In this section we introduce AC informally. We continue
with the example of a name-service lookup from the introduction.
We use it to illustrate the behavior of AC operations
in more detail, and to motivate our design choices.
Throughout this section our design choices are motivated
by providing two properties. First, a “serial elision”
property: if the IO operations in a piece of software com-
