Combining events and threads for scalable network services: implementation and evaluation of monadic, application-level concurrency primitives

by Peng Li, Steve Zdancewic
Proceedings of the 2007 ACM SIGPLAN conference on Programming language design and implementation (2007)

Available from doi.acm.org

Combining Events And Threads For Scalable Network Services
Implementation And Evaluation Of Monadic,
Application-level Concurrency Primitives
Peng Li
University of Pennsylvania
lipeng@cis.upenn.edu
Steve Zdancewic
University of Pennsylvania
stevez@cis.upenn.edu
Abstract
This paper proposes to combine two seemingly opposed program-
ming models for building massively concurrent network services:
the event-driven model and the multithreaded model. The result is
a hybrid design that offers the best of both worlds—the ease of use
and expressiveness of threads and the flexibility and performance
of events.
This paper shows how the hybrid model can be implemented entirely
at the application level using concurrency monads in Haskell, which
provide type-safe abstractions for both events and threads. This
approach simplifies the development of massively concurrent software
in a way that scales to real-world network services. The Haskell
implementation supports exceptions, symmetric multiprocessing (SMP),
software transactional memory, asynchronous I/O mechanisms and
application-level network protocol stacks. Experimental results
demonstrate that this monad-based approach has good performance: the
threads are extremely lightweight (scaling to ten million threads),
and the I/O performance compares favorably to that of Linux NPTL.
Categories and Subject Descriptors D.1.1 [Programming tech-
niques]: Applicative (Functional) Programming; D.1.3 [Program-
ming techniques]: Concurrent Programming; D.2.11 [Software
Engineering]: Software Architectures—Domain-specific architec-
tures; D.3.3 [Programming Languages]: Language Constructs and
Features—Concurrent programming structures, Control structures,
Frameworks; D.4.1 [Operating Systems]: Process Management—
Concurrency, Multiprocessing / multiprogramming / multitasking,
Scheduling, Threads.
General Terms Languages, Design, Experimentation, Perfor-
mance, Measurement.
Keywords Event, Thread, Concurrency, Networking, Program-
ming, Scalability, Implementation, Monad, Haskell.
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
PLDI’07 June 11–13, 2007, San Diego, California, USA.
Copyright © 2007 ACM 978-1-59593-633-2/07/0006. . . $5.00
1. Introduction
Modern network services present software engineers with a number of
design challenges. Peer-to-peer systems, multiplayer games, and
Internet-scale data storage applications must accommodate tens of
thousands of simultaneous, mostly-idle client connections. Such
massively-concurrent programs are difficult to implement, especially
when other requirements, such as high performance and strong security,
must also be met.
Events vs. threads: Two implementation strategies for building such
inherently concurrent systems have been successful. Both the
multithreaded and event-driven approaches have their proponents and
detractors. The debate over which model is “better” has raged for
many years, with little resolution. Ousterhout [19] has argued that
“threads are a bad idea (for most purposes),” citing the difficulties
of ensuring proper synchronization and debugging with thread-based
approaches. A counterargument by von Behren, Condit, and Brewer [25]
holds that “events are a bad idea (for high-concurrency servers),”
essentially because reasoning about control flow in event-based
systems is difficult and the apparent performance wins of the
event-driven approach can be completely recouped by careful
engineering [26].
The debate over threads and events never seems to end because a
programmer often has to choose one model and give up the other. For
example, if a Linux C programmer uses POSIX threads to write a web
server, it is difficult to use asynchronous I/O. On the other hand,
if the programmer uses epoll and AIO in Linux to write a web server,
it is inconvenient to represent control flow for each client. The
reason for this situation is that conventional thread abstraction
mechanisms are too rigid: threads are implemented in the OS and
runtime libraries, and the user cannot easily customize these
components and integrate them with the application. Although threads
can be made lightweight and efficient, an event-driven system still
has the advantage in flexibility and customizability: it can always
be tailored and optimized to the application’s specific needs.
The hybrid model: Ideally, the programmer would design parts
of the application using threads, where threads are the appropri-
ate abstraction (for per-client code), and parts of the system using
events, where they are more suitable (for asynchronous I/O inter-
faces). To make this hybrid model feasible, not only should the sys-
tem provide threads, but the thread scheduler interface must also
provide a certain level of abstraction, in the form of event handlers,
which hides the implementation details of threads and can be used
by the programmer to construct a modular event-driven system.
Many existing systems implement the hybrid model to various
degrees; most of them have a bias either toward threads or toward
events. For example, Capriccio [26] is a user-level, cooperative
thread library with a thread scheduler that looks very much like an
event-driven application. However, it provides no abstraction on the
event-driven side: the scheduler uses a low-level, unsafe program-
ming interface that is completely hidden from the programmer.
On the other hand, many event-driven systems use continuation-passing
style (CPS) programming to represent the control flow for each
client; the problem is that CPS programs are often difficult to write
and understand. Although CPS is a scalable design, it is not as
intuitive as conventional multithreaded programming styles (for most
programmers).
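To make the readability cost of CPS concrete, the following is a minimal sketch of a per-client session written with explicit continuations. It is an illustration only, not code from the paper: the `Request` type, `session`, and the toy event loop `runSession` are hypothetical names, and the "I/O" is simulated with plain lists so the example stays pure.

```haskell
-- A hypothetical event-driven session in continuation-passing style.
-- Every request carries the handler (continuation) to run on completion.
-- All names here are illustrative assumptions, not the paper's API.

-- A pending request handed back to the event loop.
data Request
  = ReadLine (String -> Request)   -- wait for one input line, then continue
  | WriteLine String Request       -- emit one output line, then continue
  | Done                           -- the session has finished

-- Per-client logic: read a name, read a greeting, reply once.
-- Note how the control flow is split across nested continuations.
session :: Request
session =
  ReadLine $ \name ->
    ReadLine $ \greeting ->
      WriteLine (greeting ++ ", " ++ name ++ "!") Done

-- A toy event loop: feeds queued input lines to the session and
-- collects its output. It simply stops if the input runs out.
runSession :: [String] -> Request -> [String]
runSession _        Done            = []
runSession inputs   (WriteLine s k) = s : runSession inputs k
runSession (i : is) (ReadLine k)    = runSession is (k i)
runSession []       (ReadLine _)    = []
```

Evaluating `runSession ["world", "hello"] session` yields `["hello, world!"]`. Even in this three-step session the sequential logic is buried inside nested lambdas, which is the difficulty the text describes.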
Application-level implementation: The hybrid model adopted here
encourages both the multithreaded components and the event-driven
components of the application to be developed in a uniform
programming environment. To do so, it is most convenient to implement
the concurrency abstractions (both thread abstractions and event
abstractions) entirely inside the application, using standard
programming language idioms. We call this idea application-level
implementation.
Programming concurrency abstractions entirely inside the application
is a significant challenge with legacy programming language tools:
for languages like C, implementing user-level threads and schedulers
involves a lot of low-level, unsafe programming interfaces.
Nevertheless, it is feasible to implement the hybrid model directly
at the application level using modern functional programming
languages such as Haskell. In 1999, Koen Claessen [8] showed that
(cooperative) threads can be implemented using only a monad, without
any change to the programming language itself.
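The idea of threads built from a monad alone can be sketched in a few lines. The code below is a minimal reconstruction in the spirit of Claessen's design, not the implementation described in this paper: the names `Action`, `C`, `atom`, `fork`, `schedule`, and `demo` are assumptions chosen for illustration, and the scheduler is a plain round-robin queue over a list.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- What a thread asks the scheduler to do next (illustrative names).
data Action
  = Atom (IO Action)      -- run one atomic step, then continue
  | Fork Action Action    -- start a second thread alongside this one
  | Stop                  -- this thread has terminated

-- A thread is a computation in continuation-passing form.
newtype C a = C { runC :: (a -> Action) -> Action }

instance Functor C where
  fmap f (C m) = C $ \k -> m (k . f)

instance Applicative C where
  pure x = C $ \k -> k x
  C mf <*> C mx = C $ \k -> mf (\f -> mx (k . f))

instance Monad C where
  C m >>= f = C $ \k -> m (\x -> runC (f x) k)

-- Lift one IO step; the scheduler may switch threads after it.
atom :: IO a -> C a
atom io = C $ \k -> Atom (fmap k io)

-- Spawn a child thread and keep running the parent.
fork :: C () -> C ()
fork t = C $ \k -> Fork (toAction t) (k ())

toAction :: C a -> Action
toAction (C m) = m (const Stop)

-- Round-robin scheduler: one step from the head thread, then requeue it.
schedule :: [Action] -> IO ()
schedule [] = return ()
schedule (a : as) = case a of
  Atom io    -> do { a' <- io; schedule (as ++ [a']) }
  Fork a1 a2 -> schedule (as ++ [a1, a2])
  Stop       -> schedule as

run :: C a -> IO ()
run t = schedule [toAction t]

-- Two cooperative threads appending to a shared log.
demo :: IO [String]
demo = do
  ref <- newIORef []
  let say s = atom (modifyIORef ref (s :))
      worker name = mapM_ (\i -> say (name ++ show i)) [1 :: Int, 2]
  run $ do
    fork (worker "a")
    worker "b"
  reverse <$> readIORef ref
```

Running `demo` returns `["a1","b1","a2","b2"]`, showing the two threads interleaving one atomic step at a time. The per-thread code is written in ordinary `do` notation, while `schedule` is an ordinary function that could, under the same assumptions, be replaced by a custom event loop.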
A hybrid framework for network services: This paper uses
Claessen’s technique to implement the hybrid model entirely inside
the application and develop a framework for building massively
concurrent network services. Our implementation is based on Con-
current Haskell [14], supported by the latest version of GHC [11].
Improving on Claessen’s original, proof-of-concept design, our im-
plementation offers the following:
  • True Parallelism: Application-level threads are mapped to multiple
    OS threads and take advantage of SMP systems.
  • Modularity and Flexibility: The scheduler is a customizable
    event-driven system that uses high-performance, asynchronous I/O
    mechanisms. We implemented support for Linux epoll and AIO; we
    even plugged a TCP stack into our system.
  • Exceptions: Multithreaded code can use exceptions to handle
    failure, which is common in network programming.
  • Thread synchronization: Non-blocking synchronization comes almost
    for free: software transactional memory (STM) in GHC can be
    transparently used in application-level threads. We also
    implemented blocking synchronization mechanisms such as mutexes.
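As a rough illustration of the non-blocking synchronization mentioned in the list above, the sketch below updates a shared counter with GHC's STM. It is an assumption-laden example rather than the paper's code: it uses ordinary GHC threads (`forkIO`) instead of the application-level threads described here, it imports the STM primitives from `GHC.Conc` in `base` to avoid depending on the separate `stm` package, and `increment` and `countWith` are hypothetical names.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)
import Control.Monad (forM_, replicateM_)
import GHC.Conc (TVar, atomically, newTVarIO, readTVar, readTVarIO, writeTVar)

-- Atomically add one to a shared counter; no explicit lock is taken.
increment :: TVar Int -> IO ()
increment tv = atomically $ do
  n <- readTVar tv
  writeTVar tv $! n + 1

-- Fork `workers` threads that each increment `perWorker` times,
-- wait for all of them to finish, and return the final count.
countWith :: Int -> Int -> IO Int
countWith workers perWorker = do
  tv <- newTVarIO 0
  dones <- mapM (const newEmptyMVar) [1 .. workers]
  forM_ dones $ \done -> forkIO $ do
    replicateM_ perWorker (increment tv)
    putMVar done ()          -- signal completion to the main thread
  mapM_ takeMVar dones       -- block until every worker has signalled
  readTVarIO tv
```

Because each read-modify-write runs inside a single `atomically` block, `countWith 4 1000` returns `4000` regardless of how the workers interleave.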
Evaluation: Our hybrid concurrency framework in Haskell
competes favorably against most existing thread-based or event-
based systems for building network services. It provides elegant in-
terfaces for both multithreaded and event-driven programming, and
it is all type-safe! Experiments suggest that, although everything is
written in Haskell, a pure, lazy, functional programming language,
the performance is acceptable in practice:
  • I/O performance: Our implementation delivers performance
    comparable to Linux NPTL in disk and FIFO pipe performance tests,
    even when tens of thousands of threads are used.
  • Scalability: Besides bounds on memory size and other system
    resources, there is no limit on the number of concurrent clients
    that our implementation can handle. In the I/O tests, our
    implementation scaled to far more threads than Linux NPTL did.
    From the OS point of view, it is just as scalable as an
    event-driven system.
  • Memory utilization: Our implementation has extremely efficient
    memory utilization. All the thread-local state is explicitly
    controlled by the programmer. As we have tested, each monadic
    thread consumes as little as 48 bytes at run time, and our system
    is capable of running 10,000,000 such threads on a real machine.
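The two memory figures quoted above can be combined into a quick back-of-the-envelope estimate. The sketch below only multiplies the stated minimum of 48 bytes per thread by 10,000,000 threads; it is a lower bound under that assumption and ignores any additional heap, garbage-collector, or scheduler overhead, which the text does not quantify. The binding names are illustrative.

```haskell
-- Lower-bound estimate of thread-state memory, using only the two
-- figures quoted in the text. Everything else is deliberately ignored.
bytesPerThread :: Integer
bytesPerThread = 48            -- "as little as 48 bytes" per monadic thread

threadCount :: Integer
threadCount = 10000000         -- 10,000,000 threads

-- 48 * 10,000,000 = 480,000,000 bytes
totalBytes :: Integer
totalBytes = bytesPerThread * threadCount

-- The same quantity expressed in MiB (1 MiB = 1024 * 1024 bytes).
totalMiB :: Double
totalMiB = fromIntegral totalBytes / (1024 * 1024)
```

Here `totalBytes` is 480,000,000 and `totalMiB` is a little under 458, so the minimal per-thread state for ten million threads fits in less than half a gibibyte.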
Summary of contributions: The idea of the concurrency monad
is not new at all—we are building on work done by functional
programming researchers. Our contribution is to experiment with
this elegant design in real-world systems programming and evalu-
ate this technique, both qualitatively and quantitatively:
1. We scaled up the design of the concurrency monad to a real-
world implementation, providing elegant and flexible interfaces
for building massively concurrent network services using effi-
cient asynchronous I/O.
2. We showed experimentally that the monad-based design has good
performance: it delivers optimal I/O performance; it has efficient
memory utilization and it scales as well as event-driven systems.
3. We demonstrated the feasibility of the hybrid programming
model in high-performance network servers, providing future
directions for both systems and programming language re-
search.
Our experience also suggests that Haskell is a reasonable lan-
guage for building scalable systems software: it is expressive, suc-
cinct, efficient and type-safe; it interacts well with C libraries and
APIs.
2. The hybrid programming model
This section gives some background on the multithreaded and
event-driven approaches for building massively concurrent network
services, and motivates the design of the hybrid model.
2.1 A comparison of events vs. threads
Programming: The primary advantage of the multithreaded
model is that the programmer can reason about the series of ac-
tions taken by a thread in the familiar way, just as for a sequential
program. This approach leads to a natural programming style in
which the control flow for a single thread is made apparent by the
program text, using ordinary language constructs like conditional
statements, loops, exceptions, and function calls.
Event-driven programming, in contrast, is hard. Most general-
purpose programming languages do not provide appropriate ab-
stractions for programming with events. The control flow graph of
an event-driven program has to be decomposed into multiple event
handlers and represented as some form of state machine with ex-
plicit message passing or in continuation-passing style (CPS). Both
representations are difficult to program with and reason about, as
indicated by the name of Python’s popular, event-driven network-
ing framework, “Twisted” [24].
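The state-machine representation mentioned above can be sketched as follows. This is a hypothetical example, not taken from any framework named in the text: the protocol (read a name, read a greeting, reply once), the types `ConnState` and `Event`, and the functions `step` and `runEvents` are all assumptions made for illustration.

```haskell
-- Hypothetical connection states for a tiny request/response protocol.
data ConnState
  = AwaitingName
  | AwaitingGreeting String   -- remembers the name received so far
  | Closed
  deriving (Eq, Show)

-- Events delivered by an (imaginary) event loop.
data Event
  = LineReceived String
  | PeerClosed
  deriving (Eq, Show)

-- One event-handler step: the next state plus any output lines to send.
step :: ConnState -> Event -> (ConnState, [String])
step _                    PeerClosed       = (Closed, [])
step AwaitingName         (LineReceived n) = (AwaitingGreeting n, [])
step (AwaitingGreeting n) (LineReceived g) = (Closed, [g ++ ", " ++ n ++ "!"])
step Closed               _                = (Closed, [])

-- Drive the machine over a list of events, collecting all output.
runEvents :: ConnState -> [Event] -> (ConnState, [String])
runEvents s []       = (s, [])
runEvents s (e : es) =
  let (s', out)   = step s e
      (s'', outs) = runEvents s' es
  in  (s'', out ++ outs)
```

For example, `runEvents AwaitingName [LineReceived "world", LineReceived "hello"]` evaluates to `(Closed, ["hello, world!"])`. What a thread would keep in local variables, such as the name, has to be stored explicitly in the state value, and the sequential logic is spread across the cases of `step`.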
Performance: The multithreaded programming style does not come for
free: in most operating systems, a thread uses a reserved segment of
stack address space, and the virtual address space is exhausted
quickly on 32-bit systems. Thread scheduling and context switching
also have significant overheads. However, such performance problems
can be reduced by well-engineered thread libraries and/or careful use
of cooperative multitasking; a recent example in this vein is
Capriccio [26], a user-level thread library specifically for use in
building highly scalable network services.
The event-driven approach exposes the scheduling of inter-
leaved computations explicitly to the programmer, thereby permit-
ting application-specific optimizations that significantly improve
performance. The event handlers typically perform only small
amounts of work and usually need only small amounts of local