Modern programming languages often provide functions to manipulate regular expressions in standard libraries. If they offer support for advanced features, the matching algorithm has an exponential worst-case time complexity: for some so-called vulnerable regular expressions, an attacker can craft ad hoc strings to force the matcher to exhibit an exponential behaviour and perform a Regular Expression Denial of Service (ReDoS) attack. In this paper, we introduce a framework based on a tree semantics to statically identify ReDoS vulnerabilities. In particular, we put forward an algorithm to extract an overapproximation of the set of words that are dangerous for a regular expression, effectively catching all possible attacks. We have implemented the analysis in a tool called rat, and testing it on a dataset of 74,670 regular expressions, we observed that in 99.47% of the instances the analysis terminates in less than one second. We compared rat to four other ReDoS detectors, and we found that our tool is faster, often by orders of magnitude, than most other tools. While raising a low number of false positives, rat is the only ReDoS detector that does not report false negatives.
CITATION STYLE
Parolini, F., & Miné, A. (2022). Sound Static Analysis of Regular Expressions for Vulnerabilities to Denial of Service Attacks. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 13299 LNCS, pp. 73–91). Springer Science and Business Media Deutschland GmbH. https://doi.org/10.1007/978-3-031-10363-6_6
Mendeley helps you to discover research relevant for your work.