A simple, possibly correct LR parser for C11

Abstract

The syntax of the C programming language is described in the C11 standard by an ambiguous context-free grammar, accompanied by English prose that describes the concept of "scope" and indicates how certain ambiguous code fragments should be interpreted. Based on these elements, the problem of implementing a compliant C11 parser is not entirely trivial. We review the main sources of difficulty and describe a relatively simple solution to the problem. Our solution employs the well-known technique of combining an LALR(1) parser with a "lexical feedback" mechanism. It draws on folklore knowledge and adds several original aspects, including a twist on lexical feedback that allows a smooth interaction with lookahead; a simplified and powerful treatment of scopes; and a few amendments to the grammar. Although not formally verified, our parser avoids several pitfalls that other implementations have fallen prey to. We believe that its simplicity, its mostly declarative nature, and its high similarity with the C11 grammar are strong informal arguments in favor of its correctness. Our parser is accompanied by a small suite of "tricky" C11 programs. We hope that it may serve as a reference or a starting point in the implementation of compilers and analysis tools.
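To make the "lexical feedback" idea concrete, here is a minimal sketch in Python. It is not the authors' parser, and it does not include their twist for handling lookahead. It only illustrates the general folklore technique: the parser records declarations in a scoped table, and the lexer consults that table to decide whether an identifier is a typedef name or an ordinary name. The names `ScopeTable`, `classify`, `TYPEDEF_NAME`, and `IDENTIFIER` are hypothetical and chosen for illustration.

```python
class ScopeTable:
    """Hypothetical scoped table shared by the parser and the lexer."""

    def __init__(self):
        # Stack of scopes; each maps a name to True (typedef name)
        # or False (ordinary identifier such as a variable).
        self._scopes = [{}]

    def open_scope(self):
        self._scopes.append({})

    def close_scope(self):
        self._scopes.pop()

    def declare_typedef(self, name):
        self._scopes[-1][name] = True

    def declare_variable(self, name):
        # An ordinary declaration shadows an outer typedef of the same name.
        self._scopes[-1][name] = False

    def is_typedef_name(self, name):
        # Search from the innermost scope outward.
        for scope in reversed(self._scopes):
            if name in scope:
                return scope[name]
        return False


def classify(name, table):
    """Token kind the lexer would emit for an identifier (assumed token names)."""
    return "TYPEDEF_NAME" if table.is_typedef_name(name) else "IDENTIFIER"


def demo():
    # Models: typedef int T; { int T; ... } ...
    table = ScopeTable()
    table.declare_typedef("T")
    outer = classify("T", table)      # T is a type here
    table.open_scope()
    table.declare_variable("T")       # inner 'int T;' shadows the typedef
    inner = classify("T", table)      # T is a variable here
    table.close_scope()
    after = classify("T", table)      # the typedef is visible again
    return outer, inner, after
```

With this table in place, a fragment such as `T * x;` can be parsed as a declaration when `T` is classified as a typedef name and as a multiplication expression otherwise, which is the kind of ambiguity the abstract refers to.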

Cite

Jourdan, J. H., & Pottier, F. (2017). A simple, possibly correct LR parser for C11. ACM Transactions on Programming Languages and Systems, 39(4). https://doi.org/10.1145/3064848
