
Uniform boilerplate and list processing

by Neil Mitchell, Colin Runciman
Proceedings of the ACM SIGPLAN Workshop on Haskell, Haskell '07 (2007)

Abstract

Generic traversals over recursive data structures are often referred to as boilerplate code. The definitions of functions involving such traversals may repeat very similar patterns, but with variations for different data types and different functionality. Libraries of operations abstracting away boilerplate code typically rely on elaborate types to make operations generic. The motivating observation for this paper is that most traversals have value-specific behaviour for just one type. We present the design of a new library exploiting this assumption. Our library allows concise expression of traversals with competitive performance.


© ACM, 2007. This is the author’s version of the work. It is posted here by permission of ACM for your personal use.
Not for redistribution. The definitive version was published in the Proceedings of the Haskell Workshop 2007,
ISBN 978-1-59593-674-5 (30 Sep 2007), http://doi.acm.org/10.1145/1291201.1291208
Uniform Boilerplate and List Processing
Or: Scrap Your Scary Types
Neil Mitchell
University of York, UK
ndm@cs.york.ac.uk
Colin Runciman
University of York, UK
colin@cs.york.ac.uk
Categories and Subject Descriptors D.3 [Software]: Programming Languages
General Terms Languages, Performance
1. Introduction
Take a simple example of a recursive data type:
data Expr = Add Expr Expr | Val Int
          | Sub Expr Expr | Var String
          | Mul Expr Expr | Neg Expr
          | Div Expr Expr
The Expr type represents a small language for integer expressions,
which permits free variables. Suppose we need to extract a list of
all the variable occurrences in an expression:
variables :: Expr -> [String]
variables (Var x) = [x]
variables (Val x) = []
variables (Neg x) = variables x
variables (Add x y) = variables x ++ variables y
variables (Sub x y) = variables x ++ variables y
variables (Mul x y) = variables x ++ variables y
variables (Div x y) = variables x ++ variables y
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. To copy otherwise, to republish, to post on servers or to redistribute
to lists, requires prior specific permission and/or a fee.
Haskell’07, September 30, 2007, Freiburg, Germany.
Copyright © 2007 ACM 978-1-59593-674-5/07/0009. . . $5.00
This definition has the following undesirable characteristics: (1)
adding a new constructor would require an additional equation; (2)
the code is repetitive: the last four right-hand sides are identical;
(3) the code cannot be shared with other similar operations. This
problem is referred to as the boilerplate problem. Using the library
developed in this paper, the above example can be rewritten as:
variables :: Expr -> [String]
variables x = [y | Var y <- universe x]
The type signature is optional, and would be inferred automatically
if left absent. This example assumes a Uniplate instance for the
Expr data type, given in §3.2. This example requires only Haskell 98.
For more advanced examples we require multi-parameter type classes,
but no functional dependencies, rank-2 types or GADTs.
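As a rough illustration of the one-line query, the sketch below can be loaded without the library. It assumes a hand-written universe specialised to Expr (with a local helper kids) in place of the class-based function, and renames the explicit definition to variablesExplicit for comparison; these names are illustrative only and not part of the library.

```haskell
data Expr = Add Expr Expr | Val Int
          | Sub Expr Expr | Var String
          | Mul Expr Expr | Neg Expr
          | Div Expr Expr
          deriving (Eq, Show)

-- Stand-in for the library's universe, specialised to Expr (assumption):
-- the value itself, followed by every Expr nested inside it.
universe :: Expr -> [Expr]
universe x = x : concatMap universe (kids x)
  where
    kids (Add a b) = [a, b]
    kids (Sub a b) = [a, b]
    kids (Mul a b) = [a, b]
    kids (Div a b) = [a, b]
    kids (Neg a)   = [a]
    kids _         = []

-- The one-line query from the text.
variables :: Expr -> [String]
variables x = [y | Var y <- universe x]

-- The explicit recursion from the introduction, renamed for comparison.
variablesExplicit :: Expr -> [String]
variablesExplicit (Var x)   = [x]
variablesExplicit (Val _)   = []
variablesExplicit (Neg x)   = variablesExplicit x
variablesExplicit (Add x y) = variablesExplicit x ++ variablesExplicit y
variablesExplicit (Sub x y) = variablesExplicit x ++ variablesExplicit y
variablesExplicit (Mul x y) = variablesExplicit x ++ variablesExplicit y
variablesExplicit (Div x y) = variablesExplicit x ++ variablesExplicit y

-- A small test expression: (a * 2) + -(b - a)
example :: Expr
example = Add (Mul (Var "a") (Val 2)) (Neg (Sub (Var "b") (Var "a")))
```

On example, both definitions should yield the same list of occurrences, in left-to-right order and with duplicates retained.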
The central idea is to exploit a common property of many
traversals: they only require value-specific behaviour for a single
uniform type. In the variables example, the only type of interest
is Expr. In practical applications, this pattern is common.¹ By
focusing only on uniform type traversals, we are able to exploit
well-developed techniques in list processing.
1.1 Contribution
Ours is far from the first technique for ‘scrapping boilerplate’. The
area has been researched extensively. But there are a number of
distinctive features in our approach:
• We require no language extensions for single-type traversals,
  and only multi-parameter type classes (Jones 2000) for
  multi-type traversals.
• Our choice of operations is new: we shun some traditionally
  provided operations, and provide some uncommon ones.
• Our type classes can be defined independently or on top of
  Typeable and Data (Lämmel and Peyton Jones 2003), making
  optional use of built-in compiler support.
• We make use of list-comprehensions (Wadler 1987) for succinct
  queries.
• We compare the conciseness of operations using our library, by
  counting lexemes, showing our approach leads to less
  boilerplate.
• We compare the performance of traversal mechanisms,
  something that has been neglected in previous papers.
The ideas behind the Uniplate library have been used extensively,
in projects including the Yhc compiler (Golubovsky et al.
2007), the Catch tool (Mitchell and Runciman 2007) and the Reach
tool (Naylor and Runciman 2007). In Catch there are over 100
Uniplate traversals.
We have implemented all the techniques reported here. We
encourage readers to download the Uniplate library and try it out.
It can be obtained from the website at
http://www.cs.york.ac.uk/~ndm/uniplate/. A copy of the library has
also been released, and is available on Hackage.²
¹ Most examples in boilerplate removal papers meet this restriction, even
though the systems being discussed do not depend on it.
1.2 Road map
§2 introduces the traversal combinators that we propose, along with
short examples. §3 discusses how these combinators are implemented
in terms of a single primitive. §4 extends this approach to
multi-type traversals, and §5 covers the extended implementation.
§6 investigates some performance optimisations. §7 gives
comparisons with other approaches, using examples such as the “paradise”
benchmark. §8 presents related work, §9 makes concluding remarks
and suggests directions for future work.
2. Queries and Transformations
We define various traversals, using the Expr type defined in the
introduction as an example throughout. We divide traversals into
two categories: queries and transformations. A query is a function
that takes a value, and extracts some information of a different type.
A transformation takes a value, and returns a modified version of
the original value. All the traversals rely on the class Uniplate, an
instance of which is assumed for Expr. The definition of this class
and its instances are covered in §3.
2.1 Children
The first function in the Uniplate library serves as both a function,
and a definition of terminology:
children :: Uniplate α => α -> [α]
The function children takes a value and returns all maximal
proper substructures of the same type. For example:
children (Add (Neg (Var "x")) (Val 12)) =
  [Neg (Var "x"), Val 12]
The children function is occasionally useful, but is used more
commonly as an auxiliary in the definition of other functions.
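For the Expr type alone, children could be written by hand as in the sketch below. This is an assumption for illustration, since the library obtains children from the Uniplate class (§3). The depth function is a hypothetical example of using children as an auxiliary.

```haskell
data Expr = Add Expr Expr | Val Int
          | Sub Expr Expr | Var String
          | Mul Expr Expr | Neg Expr
          | Div Expr Expr
          deriving (Eq, Show)

-- Hand-written for Expr only (assumption): the immediate Expr
-- components of a value, but none of their descendants.
children :: Expr -> [Expr]
children (Add a b) = [a, b]
children (Sub a b) = [a, b]
children (Mul a b) = [a, b]
children (Div a b) = [a, b]
children (Neg a)   = [a]
children (Val _)   = []
children (Var _)   = []

-- Illustrative auxiliary use: nesting depth of an expression,
-- where a leaf (Val or Var) has depth 1.
depth :: Expr -> Int
depth x = 1 + maximum (0 : map depth (children x))
```

Note that Var "x" is not a child of Add (Neg (Var "x")) (Val 12): it is a proper substructure, but not a maximal one, because it sits inside Neg (Var "x").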
2.2 Queries
The Uniplate library provides the universe function to support
queries.
universe :: Uniplate α => α -> [α]
This function takes a data structure, and returns a list of all
structures of the same type found within it. For example:
universe (Add (Neg (Var "x")) (Val 12)) =
  [Add (Neg (Var "x")) (Val 12)
  ,Neg (Var "x")
  ,Var "x"
  ,Val 12]
One use of this mechanism for querying was given in the
introduction. Using the universe function, queries can be expressed
very concisely. Using a list-comprehension to process the results of
universe is common.
Example 1
Consider the task of counting divisions by the literal 0.
countDivZero :: Expr -> Int
countDivZero x = length [() | Div _ (Val 0) <- universe x]
Here we make essential use of a feature of list comprehensions:
if a pattern does not match, then the item is skipped. In other
syntactic constructs, failing to match a pattern results in a
pattern-match error. □
² http://hackage.haskell.org/
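The query in Example 1 can be tried with the sketch below. It assumes hand-written children and universe functions specialised to Expr, standing in for the class-based library versions.

```haskell
data Expr = Add Expr Expr | Val Int
          | Sub Expr Expr | Var String
          | Mul Expr Expr | Neg Expr
          | Div Expr Expr
          deriving (Eq, Show)

-- Stand-ins specialised to Expr (assumption; the library versions
-- are overloaded via the Uniplate class).
children :: Expr -> [Expr]
children (Add a b) = [a, b]
children (Sub a b) = [a, b]
children (Mul a b) = [a, b]
children (Div a b) = [a, b]
children (Neg a)   = [a]
children _         = []

universe :: Expr -> [Expr]
universe x = x : concatMap universe (children x)

-- Elements of universe x that do not match the pattern are skipped
-- silently rather than raising a pattern-match error.
countDivZero :: Expr -> Int
countDivZero x = length [() | Div _ (Val 0) <- universe x]

-- (x / 0) + (0 / (1 / 0)): two of the three divisions have a
-- literal-zero divisor.
sample :: Expr
sample = Add (Div (Var "x") (Val 0))
             (Div (Val 0) (Div (Val 1) (Val 0)))
```

Here countDivZero sample evaluates to 2. The middle division has Val 0 as its numerator, not its divisor, so it is skipped.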
2.3 Bottom-up Transformations
Another common operation provided by many boilerplate removal
systems (Lämmel and Peyton Jones 2003; Visser 2004; Lämmel
and Visser 2003; Ren and Erwig 2006) applies a given function to
every subtree of the argument type. We define as standard a
bottom-up transformation.
transform :: Uniplate α => (α -> α) -> α -> α
The result of transform f x is f x′ where x′ is obtained by
replacing each α-child xᵢ in x by transform f xᵢ.
Example 2
Suppose we wish to remove the Sub constructor assuming the
equivalence: x − y ≡ x + (−y). To apply this equivalence as a
rewriting rule, at all possible places in an expression, we define:
simplify x = transform f x
  where f (Sub x y) = Add x (Neg y)
        f x = x
This code can be read: apply the subtraction rule where you can,
and where you cannot, do nothing. Adding additional rules is easy.
Take for example: x + y = 2 * x where x ≡ y. Now we can add
this new rule into our existing transformation:
simplify x = transform f x
  where f (Sub x y) = Add x (Neg y)
        f (Add x y) | x == y = Mul (Val 2) x
        f x = x
Each equation corresponds to the natural Haskell translation of
the rule. The transform function manages all the required
boilerplate. □
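To make the bottom-up behaviour concrete, here is a minimal sketch that follows the definition of transform given above, specialised to Expr. The helper mapChildren is a hypothetical name introduced for this illustration; the library's own implementation is described in §3 and may differ.

```haskell
data Expr = Add Expr Expr | Val Int
          | Sub Expr Expr | Var String
          | Mul Expr Expr | Neg Expr
          | Div Expr Expr
          deriving (Eq, Show)

-- Hypothetical helper: apply g to each immediate Expr child and
-- rebuild the node with the same constructor.
mapChildren :: (Expr -> Expr) -> Expr -> Expr
mapChildren g (Add a b) = Add (g a) (g b)
mapChildren g (Sub a b) = Sub (g a) (g b)
mapChildren g (Mul a b) = Mul (g a) (g b)
mapChildren g (Div a b) = Div (g a) (g b)
mapChildren g (Neg a)   = Neg (g a)
mapChildren _ e         = e

-- Bottom-up: transform the children first, then apply f to the
-- rebuilt node (Expr-only stand-in for the overloaded transform).
transform :: (Expr -> Expr) -> Expr -> Expr
transform f = f . mapChildren (transform f)

-- The two rewrite rules from Example 2; the guard relies on the
-- derived Eq instance.
simplify :: Expr -> Expr
simplify = transform f
  where f (Sub x y) = Add x (Neg y)
        f (Add x y) | x == y = Mul (Val 2) x
        f x = x
```

For instance, simplify (Sub (Add (Var "a") (Var "a")) (Val 1)) first rewrites the inner addition to Mul (Val 2) (Var "a"), then rewrites the enclosing subtraction, giving Add (Mul (Val 2) (Var "a")) (Neg (Val 1)).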
2.4 Top-Down Transformation
The Scrap Your Boilerplate approach (Lämmel and Peyton Jones
2003) (known as SYB) provides a top-down transformation named
everywhere'. We describe this traversal, and our reasons for not
providing it, even though it could easily be defined. We instead
provide descend, based on the composOp operator (Bringert and
Ranta 2006).
The everywhere' f transformation applies f to a value, then
recursively applies the transformation on all the children of the
freshly generated value. Typically, the intention in a
transformation is to apply f to every node exactly once.
Unfortunately, everywhere' f does not necessarily have this effect.
Example 3
Consider the following transformation:
doubleNeg (Neg (Neg x )) = x
doubleNeg x = x
The intention is clear: remove all instances of double negation.
When applied in a bottom-up manner, this is the result. But
when applied top-down some nodes are missed. Consider the value
Neg (Neg (Neg (Neg (Val 1)))); only the outermost double negation
will be removed. □
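The difference between the two traversal orders in Example 3 can be checked with the sketch below. It is specialised to Expr and uses hypothetical names: mapChildren rebuilds a node from transformed children, bottomUp follows the definition of transform in §2.3, and topDown mimics the behaviour described for everywhere' (it is not SYB's implementation).

```haskell
data Expr = Add Expr Expr | Val Int
          | Sub Expr Expr | Var String
          | Mul Expr Expr | Neg Expr
          | Div Expr Expr
          deriving (Eq, Show)

-- Hypothetical helper: apply g to each immediate Expr child.
mapChildren :: (Expr -> Expr) -> Expr -> Expr
mapChildren g (Add a b) = Add (g a) (g b)
mapChildren g (Sub a b) = Sub (g a) (g b)
mapChildren g (Mul a b) = Mul (g a) (g b)
mapChildren g (Div a b) = Div (g a) (g b)
mapChildren g (Neg a)   = Neg (g a)
mapChildren _ e         = e

-- Bottom-up: children first, then the node itself.
bottomUp :: (Expr -> Expr) -> Expr -> Expr
bottomUp f = f . mapChildren (bottomUp f)

-- Top-down: the node first, then the children of the new value.
topDown :: (Expr -> Expr) -> Expr -> Expr
topDown f = mapChildren (topDown f) . f

doubleNeg :: Expr -> Expr
doubleNeg (Neg (Neg x)) = x
doubleNeg x = x

-- The value from Example 3: four nested negations of 1.
nested :: Expr
nested = Neg (Neg (Neg (Neg (Val 1))))
```

With these definitions, bottomUp doubleNeg nested gives Val 1, whereas topDown doubleNeg nested gives Neg (Neg (Val 1)). After the outermost pair is removed, the traversal descends into Neg (Val 1) and never sees the remaining double negation as a whole.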
Example 4
Consider the following transformation:
reciprocal (Div n m) = Mul n (Div (Val 1) m)
reciprocal x = x
