How to Count Bots in Longitudinal Datasets of IP Addresses

2Citations
Citations of this article
9Readers
Mendeley users who have this article in their library.
Get full text

Abstract

—Estimating the size of a botnet is one of the most basic and important queries one can make when trying to understand the impact of a botnet. Surprisingly and unfortunately, this seemingly simple task has confounded many measurement efforts. While it may seem tempting to simply count the number of IP addresses observed to be infected, it is well-known that doing so can lead to drastic overestimates, as ISPs commonly assign new IP addresses to hosts. As a result, estimating the number of infected hosts given longitudinal datasets of IP addresses has remained an open problem. In this paper, we present a new data analysis technique, CARDCount, that provides more accurate size estimations by accounting for IP address reassignments. CARDCount can be applied on longer windows of observations than prior approaches (weeks compared to hours), and is the first technique of its kind to provide confidence intervals for its size estimations. We evaluate CARDCount on three real world datasets and show that it performs equally well to existing solutions on synthetic ideal situations, but drastically outperforms all previous work in realistic botnet situations. For the Hajime and Mirai botnets, we estimate that CARDCount, is 51.6% and 69.1% more accurate than the state of the art techniques when estimating the botnet size over a 28-day window.

Cite

CITATION STYLE

APA

Böck, L., Levin, D., Padmanabhan, R., Doerr, C., & Mühlhäuser, M. (2023). How to Count Bots in Longitudinal Datasets of IP Addresses. In 30th Annual Network and Distributed System Security Symposium, NDSS 2023. The Internet Society. https://doi.org/10.14722/ndss.2023.24002

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free