Aspect-level Information Discrepancies across Heterogeneous Vulnerability Reports: Severity, Types and Detection Methods

2Citations
Citations of this article
14Readers
Mendeley users who have this article in their library.
Get full text

Abstract

Vulnerable third-party libraries pose significant threats to software applications that reuse these libraries. At an industry scale of reuse, manual analysis of third-party library vulnerabilities can be easily overwhelmed by the sheer number of vulnerabilities continually collected from diverse sources for thousands of reused libraries. Our study of four large-scale, actively maintained vulnerability databases (NVD, IBM X-Force, ExploitDB, and Openwall) reveals the wide presence of information discrepancies, in terms of seven vulnerability aspects, i.e., product, version, component, vulnerability type, root cause, attack vector, and impact, between the reports for the same vulnerability from heterogeneous sources. It would be beneficial to integrate and cross-validate multi-source vulnerability information, but it demands automatic aspect extraction and aspect discrepancy detection. In this work, we experimented with a wide range of NLP methods to extract named entities (e.g., product) and free-form phrases (e.g., root cause) from textual vulnerability reports and to detect semantically different aspect mentions between the reports. Our experiments confirm the feasibility of applying NLP methods to automate aspect-level vulnerability analysis and identify the need for domain customization of general NLP methods. Based on our findings, we propose a discrepancy-aware, aspect-level vulnerability knowledge graph and a KG-based web portal that integrates diversified vulnerability key aspect information from heterogeneous vulnerability databases. Our conducted user study proves the usefulness of our web portal. Our study opens the door to new types of vulnerability integration and management, such as vulnerability portraits of a product and explainable prediction of silent vulnerabilities.

Cite

CITATION STYLE

APA

Sun, J., Xing, Z., Xia, X., Lu, Q., Xu, X., & Zhu, L. (2023). Aspect-level Information Discrepancies across Heterogeneous Vulnerability Reports: Severity, Types and Detection Methods. ACM Transactions on Software Engineering and Methodology, 33(2). https://doi.org/10.1145/3624734

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free