Does an outage impact any users? Can a geolocation database known to be good at locating users and bad at infrastructure be trusted for a particular prefix? Is a content-heavy network likely to peer with a particular network? For these questions and many more, knowing which prefixes contain Internet users aids in interpreting Internet analysis. However, existing datasets of Internet activity are out of date, unvalidated, based on privileged data, or too coarse. As a step towards identifying which IP prefixes contain users, we present multiple novel techniques to identify which IP prefixes host web clients without relying on privileged data. Our techniques identify client activity in ASes responsible for 98.8% of Microsoft CDN traffic and in prefixes responsible for 95.2% of Microsoft CDN traffic. Less than 1% of prefixes identified by our technique as active do not contact Microsoft at all. We present measurements of Internet usage worldwide and sketch future directions for extending the techniques to measure relative activity levels across prefixes.
Jiang, W., Luo, T., Koch, T., Zhang, Y., Katz-Bassett, E., & Calder, M. (2021). Towards identifying networks with internet clients using public data. In Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC (pp. 753–762). Association for Computing Machinery. https://doi.org/10.1145/3487552.3487844