Underground forums have come to play a crucial role in the trading and exchange of leaked personal information. At the same time, these forums are increasingly used as sources of information about data breaches, and the results of data theft are increasingly announced in forum posts. Identifying these threads promptly allows the compromised third party to respond quickly to a data breach incident. To this end, we present a system that automatically identifies threads related to data breaches. The system can monitor underground forums and discover data breaches in real time. In addition, the study reveals the wording characteristics of such threads by applying a feature extraction method based on the LDA topic model. The data set was collected from both the surface web and the dark web. To improve the system's performance, we compared several supervised classification algorithms in this application scenario and selected the best-performing one for the classifier. With this system, we identified more than 92% of the data breach threads in the experimental data set.
Fang, Y., Guo, Y., Huang, C., & Liu, L. (2019). Analyzing and Identifying Data Breaches in Underground Forums. IEEE Access, 7, 48770–48777. https://doi.org/10.1109/ACCESS.2019.2910229