How to Count Bots in Longitudinal Datasets of IP Addresses

Leon Böck; Dave Levin; Ramakrishna Padmanabhan; Christian Doerr; Max Mühlhäuser

Conference Proceedings (Open Access)

30th Annual Network and Distributed System Security Symposium, NDSS 2023 (2023)

DOI: 10.14722/ndss.2023.24002

Abstract

Estimating the size of a botnet is one of the most basic and important queries one can make when trying to understand the impact of a botnet. Surprisingly and unfortunately, this seemingly simple task has confounded many measurement efforts. While it may seem tempting to simply count the number of IP addresses observed to be infected, it is well known that doing so can lead to drastic overestimates, as ISPs commonly assign new IP addresses to hosts. As a result, estimating the number of infected hosts given longitudinal datasets of IP addresses has remained an open problem. In this paper, we present a new data analysis technique, CARDCount, that provides more accurate size estimations by accounting for IP address reassignments. CARDCount can be applied to longer windows of observations than prior approaches (weeks compared to hours), and is the first technique of its kind to provide confidence intervals for its size estimations. We evaluate CARDCount on three real-world datasets and show that it performs as well as existing solutions in idealized synthetic situations, but drastically outperforms all previous work in realistic botnet situations. For the Hajime and Mirai botnets, we estimate that CARDCount is 51.6% and 69.1% more accurate, respectively, than the state-of-the-art techniques when estimating the botnet size over a 28-day window.

Cite (APA)

Böck, L., Levin, D., Padmanabhan, R., Doerr, C., & Mühlhäuser, M. (2023). How to Count Bots in Longitudinal Datasets of IP Addresses. In 30th Annual Network and Distributed System Security Symposium, NDSS 2023. The Internet Society. https://doi.org/10.14722/ndss.2023.24002
