Island Grammar-based Parsing using GLL and Tom


Abstract

Extending a language by embedding within it another language presents significant parsing challenges, especially if the embedding is recursive. The composite grammar is likely to be nondeterministic as a result of tokens that are valid in both the host and the embedded language. In this paper we examine the challenges of embedding the Tom language into a variety of general-purpose high-level languages. Tom provides syntax and semantics for advanced pattern matching and tree rewriting facilities. Embedded Tom constructs are translated into the host language by a preprocessor, the output of which is a composite program written purely in the host language. Tom implementations exist for Java, C, C#, Python and Caml. The current parser is complex and difficult to maintain. In this paper, we describe how Tom can be parsed using island grammars implemented with the Generalised LL (GLL) parsing algorithm. The grammar is, as might be expected, ambiguous. Extracting the correct derivation relies on our disambiguation strategy, which is based on pattern matching within the parse forest. We describe different classes of ambiguity and propose patterns for resolving them. © 2013 Springer-Verlag Berlin Heidelberg.
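To make the island/water idea in the abstract concrete, here is a minimal Python sketch that splits a host-language source string into "water" (host code passed through unchanged) and "islands" (embedded constructs). It is not the GLL-based approach the paper describes. It is a hand-written scanner under strong assumptions: the only island marker is `%match`, an island ends at the brace that balances its first `{`, and braces inside strings or comments are ignored. The function name `split_islands` and the sample input are hypothetical, chosen for illustration only.

```python
from typing import List, Tuple

# Assumed island marker for this sketch; the real Tom language has
# several embedded constructs, which are not modeled here.
ISLAND_START = "%match"


def split_islands(source: str) -> List[Tuple[str, str]]:
    """Split source into ("water", text) and ("island", text) segments.

    Concatenating the text of all segments reproduces the input.
    """
    segments: List[Tuple[str, str]] = []
    pos = 0
    while pos < len(source):
        start = source.find(ISLAND_START, pos)
        if start == -1:
            # No further islands: the remainder is water.
            segments.append(("water", source[pos:]))
            break
        if start > pos:
            segments.append(("water", source[pos:start]))
        open_brace = source.find("{", start)
        if open_brace == -1:
            # Marker without a body: treat the remainder as water.
            segments.append(("water", source[start:]))
            break
        # Scan forward to the brace that balances the first "{".
        depth = 0
        end = open_brace
        while end < len(source):
            if source[end] == "{":
                depth += 1
            elif source[end] == "}":
                depth -= 1
                if depth == 0:
                    break
            end += 1
        if depth != 0:
            # Unbalanced braces: give up and treat the remainder as water.
            segments.append(("water", source[start:]))
            break
        segments.append(("island", source[start:end + 1]))
        pos = end + 1
    return segments


if __name__ == "__main__":
    # Hypothetical Java-like host code with one embedded construct.
    sample = "int f(T t) { %match(t) { a() -> { return 1; } } return 0; }"
    for kind, text in split_islands(sample):
        print(kind, repr(text))
```

The sketch also hints at why the composite grammar is nondeterministic: `{` and `}` are valid tokens in both the host and the embedded language, so the scanner has to count them to guess where an island ends, and the host code nested inside the island body is not parsed at all. A generalised parser avoids this guessing by keeping every derivation and choosing among them afterwards.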

Cite

Afroozeh, A., Bach, J. C., Van Den Brand, M., Johnstone, A., Manders, M., Moreau, P. E., & Scott, E. (2013). Island grammar-based parsing using GLL and Tom. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7745 LNCS, pp. 224–243). https://doi.org/10.1007/978-3-642-36089-3_13
