Verification of distributed software is a challenging task. This paper reports on modeling and verification of a consensus algorithm developed by Thales. The algorithm has an arbitrary number of processes (nodes), which can possibly fail and restart at any time. Communications between nodes are periodic, but completely asynchronous. The goal of this algorithm is that, after a given amount of time since the last status change, the network of nodes agrees on a list of working nodes. Our verification approach is based on modeling both the source code of the algorithm and the possible interleavings of executions. We present how we were able to scale up to 100 processes using the rely-guarantee based technique. Some of the initially expected properties did not hold, and generated counter-examples helped to fix and prove them. We also successfully verified other consensus algorithms at Thales with the same approach. We describe our experiments on applying several model-checking tools and a symbolic execution tool, and present some lessons learned.
CITATION STYLE
Kosmatov, N., Longuet, D., & Soulat, R. (2020). Formal Verification of an Industrial Distributed Algorithm: An Experience Report. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 12476 LNCS, pp. 525–542). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-030-61362-4_30
Mendeley helps you to discover research relevant for your work.