Current text pattern-matching tools are based on regular expressions. However, pure regular expressions have proven too weak a formalism for the task: many interesting patterns either are difficult to describe or cannot be described by regular expressions. Moreover, the inherent nondeterminism of regular expressions does not fit the need to capture specific parts of a match. Motivated by these reasons, most scripting languages nowadays use pattern-matching tools that extend the original regular-expression formalism with a set of ad hoc features, such as greedy repetitions, lazy repetitions, possessive repetitions, 'longest-match rule,' lookahead, etc. These ad hoc extensions bring their own set of problems, such as lack of a formal foundation and complex implementations. In this paper, we propose the use of Parsing Expression Grammars (PEGs) as a basis for pattern matching. Following this proposal, we present LPEG, a pattern-matching tool based on PEGs for the Lua scripting language. LPEG unifies the ease of use of pattern-matching tools with the full expressive power of PEGs. Because of this expressive power, it can avoid the myriad of ad hoc constructions present in several current pattern-matching tools. We also present a Parsing Machine that allows a small and efficient implementation of PEGs for pattern matching. © 2008 John Wiley & Sons, Ltd.
Ierusalimschy, R. (2009). A text pattern-matching tool based on parsing expression grammars. Software - Practice and Experience, 39(3), 221–258. https://doi.org/10.1002/spe.892