New Tricks to Old Codes: Can AI Chatbots Replace Static Code Analysis Tools?

Omer Said Ozturk; Emre Ekmekcioglu; Orcun Cetin; Budi Arief; Julio Hernandez-Castro

Conference Proceedings · Open Access

ACM International Conference Proceeding Series (2023) 13-18

DOI: 10.1145/3590777.3590780

9 Citations

29 Readers

Abstract

The prevalence and significance of web services in our daily lives make it imperative to ensure that they are, as far as possible, free from vulnerabilities. However, developing a complex piece of software that is free from security vulnerabilities is hard, if not impossible. One way to make progress towards this goal is to use static code analysis tools to root out common or known vulnerabilities that may be accidentally introduced during development. Static code analysis tools have contributed significantly to addressing this problem, but they are imperfect. It is conceivable that static code analysis can be improved by using AI-powered tools, which have recently grown in popularity. However, there has been very little work comparing the effectiveness of the two types of tools, and this is the research gap our paper aims to fill. We carried out a study involving 11 static code analysers and one AI-powered chatbot, ChatGPT, to assess their effectiveness in detecting 92 vulnerabilities representing the top 10 known vulnerability categories in web applications, as classified by OWASP. We focused on PHP vulnerabilities because PHP is one of the most widely used languages in web applications, yet it offers few security mechanisms to help developers. We found that the success rate of ChatGPT in finding security vulnerabilities in PHP is around 62-68%, whereas the best traditional static code analyser tested has a success rate of 32%. Even combining several traditional static code analysers (each with the best performance on certain aspects of detection) would only achieve a rate of 53%, which is still significantly lower than ChatGPT's success rate. Nonetheless, ChatGPT has a very high false positive rate of 91%. In comparison, the worst false positive rate of any traditional static code analyser is 82%.
These findings highlight the promising potential of ChatGPT for improving the static code analysis process, but they also reveal certain caveats (especially regarding accuracy) in its current state. Our findings suggest that one interesting direction for future work would be to get the best of both worlds by combining traditional static code analysers with ChatGPT to find security vulnerabilities more effectively.
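To make the reported metrics concrete, the following is a minimal Python sketch of how a per-tool success rate, a combined (union) success rate, and a false positive share could be computed from tool reports. It is not the authors' evaluation code. The function names, the finding identifiers, and the toy data are all hypothetical. The abstract does not define how the false positive rate is calculated, so the sketch assumes it is the share of a tool's reports that match no known vulnerability.

```python
from typing import Dict, Set


def detection_rate(found: Set[str], known: Set[str]) -> float:
    """Share of known vulnerabilities that appear in a tool's reports."""
    return len(found & known) / len(known)


def false_positive_share(reported: Set[str], known: Set[str]) -> float:
    """Share of a tool's reports that match no known vulnerability.

    Assumed definition; the abstract does not state the formula used.
    """
    if not reported:
        return 0.0
    return len(reported - known) / len(reported)


def combined_detection_rate(reports: Dict[str, Set[str]], known: Set[str]) -> float:
    """Detection rate when the reports of several tools are pooled."""
    pooled: Set[str] = set().union(*reports.values()) if reports else set()
    return detection_rate(pooled, known)


# Hypothetical toy data, not taken from the paper.
known = {"V1", "V2", "V3", "V4"}          # seeded vulnerabilities
reports = {
    "tool_a": {"V1", "X1"},               # X* are spurious findings
    "tool_b": {"V2", "V3", "X2", "X3"},
}

for name, found in reports.items():
    print(name, detection_rate(found, known), false_positive_share(found, known))
print("combined", combined_detection_rate(reports, known))
```

Pooling reports can only raise the detection rate, which is why a combination of analysers can outperform the best single one. Under the assumed definition, pooling also accumulates every tool's spurious findings.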

Author supplied keywords

ChatGPT · AI · Static code analysis · PHP vulnerabilities · Tools evaluation · Vulnerability detection · AI in cyber security

Cite

Ozturk, O. S., Ekmekcioglu, E., Cetin, O., Arief, B., & Hernandez-Castro, J. (2023). New Tricks to Old Codes: Can AI Chatbots Replace Static Code Analysis Tools? In ACM International Conference Proceeding Series (pp. 13–18). Association for Computing Machinery. https://doi.org/10.1145/3590777.3590780
