A text patternmatching tool based on parsing expression grammars

  • Ierusalimschy R
  • 45

    Readers

    Mendeley users who have this article in their library.
  • 17

    Citations

    Citations of this article.

Abstract

Current text pattern-matching tools are based on regular expressions. However, pure regular expressions have proven too weak a formalism for the task: many interesting patterns either are difficult to describe or cannot be described by regular expressions. Moreover, the inherent non- determinism of regular expressions does not fit the need to capture specific parts of a match. Motivated by these reasons, most scripting languages nowadays use pattern-matching tools that extend the original regular-expression formalism with a set of ad-hoc features, such as greedy repetitions, lazy repetitions, possessive repetitions, “longest match rule”, lookahead, etc. These ad-hoc extensions bring their own set of problems, such as lack of a formal foundation and complex implementations. In this paper, we propose the use of Parsing Expression Grammars (PEGs) as a basis for pattern matching. Following this proposal, we present LPEG, a pattern-matching tool based on PEGs for the Lua scripting language. LPEG unifies the ease of use of pattern-matching tools with the full expressive power of PEGs. Because of this expressive power, it can avoid the myriad of ad-hoc constructions present in several current pattern-matching tools. We also present a Parsing Machine that allows a small and efficient implementation of PEGs for pattern matching.

Author-supplied keywords

  • Parsing expression grammars
  • Pattern matching
  • Scripting languages

Get free article suggestions today

Mendeley saves you time finding and organizing research

Sign up here
Already have an account ?Sign in

Find this document

Authors

  • Roberto Ierusalimschy

Cite this document

Choose a citation style from the tabs below

Save time finding and organizing research with Mendeley

Sign up for free