For decades, Internet protocols have been specified using natural language. Given the ambiguity inherent in such text, it is not surprising that protocol implementations have long exhibited bugs. In this paper, we apply natural language processing (NLP) to effect semi-automated generation of protocol implementations from specification text. Our system, Sage, can uncover ambiguous or under-specified sentences in specifications; once these are clarified by the author of the protocol specification, Sage can generate protocol code automatically. Using Sage, we discover 5 instances of ambiguity and 6 instances of under-specification in the ICMP RFC; after fixing these, Sage is able to automatically generate code that interoperates perfectly with Linux implementations. We show that Sage generalizes to sections of BFD, IGMP, and NTP and identify additional conceptual components that Sage needs to support to generalize to complete, complex protocols like BGP and TCP.
Yen, J., Lévai, T., Ye, Q., Ren, X., Govindan, R., & Raghavan, B. (2021). Semi-automated protocol disambiguation and code generation. In SIGCOMM 2021 - Proceedings of the ACM SIGCOMM 2021 Conference (pp. 272–286). Association for Computing Machinery, Inc. https://doi.org/10.1145/3452296.3472910