Toppling Top Lists: Evaluating the Accuracy of Popular Website Lists

Abstract

Researchers rely on lists of popular websites like the Alexa Top Million both to measure the web and to evaluate proposed protocols and systems. Prior work has questioned the correctness and consistency of these lists, but without ground truth data to compare against, there has been no direct evaluation of list accuracy. In this paper, we evaluate the relative accuracy of the most popular top lists of websites. We derive a set of popularity metrics from server-side requests seen at Cloudflare, which authoritatively serves a significant portion of the most popular websites. We evaluate top lists against these metrics and show that most lists capture web popularity poorly, with the exception of the Chrome User Experience Report (CrUX) dataset, which is the most accurate top list compared to Cloudflare across all metrics. We explore the biases that lower the accuracy of other lists, and we conclude with recommendations for researchers studying the web in the future.
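As a rough illustration of what "evaluating a top list against a reference ranking" can look like, the sketch below compares a candidate list with a ground-truth list using two common measures: Jaccard overlap of the two sets, and Spearman rank correlation over the domains they share. This is a minimal sketch under stated assumptions, not the paper's evaluation pipeline: the function names, the choice of measures, and the example domains are all hypothetical, and the abstract does not specify which popularity metrics or comparison statistics the authors used.

```python
from typing import Optional, Sequence


def jaccard(candidate: Sequence[str], reference: Sequence[str]) -> float:
    """Set overlap between two lists, ignoring rank order."""
    a, b = set(candidate), set(reference)
    union = a | b
    # Two empty lists are treated as identical by convention.
    return len(a & b) / len(union) if union else 1.0


def spearman_on_shared(candidate: Sequence[str],
                       reference: Sequence[str]) -> Optional[float]:
    """Spearman rank correlation over domains present in both lists.

    Domains are re-ranked within the intersection, so ranks are
    0..n-1 in each list and contain no ties (assumes no duplicates).
    Returns None when fewer than two domains are shared.
    """
    in_candidate = set(candidate)
    shared_ref_order = [d for d in reference if d in in_candidate]
    n = len(shared_ref_order)
    if n < 2:
        return None
    in_shared = set(shared_ref_order)
    shared_cand_order = [d for d in candidate if d in in_shared]
    cand_rank = {d: i for i, d in enumerate(shared_cand_order)}
    # Closed-form Spearman coefficient for untied ranks.
    sum_sq = sum((i - cand_rank[d]) ** 2
                 for i, d in enumerate(shared_ref_order))
    return 1 - 6 * sum_sq / (n * (n * n - 1))


# Hypothetical data: a reference ranking and a candidate top list.
reference = ["a.example", "b.example", "c.example", "d.example"]
candidate = ["b.example", "a.example", "c.example", "e.example"]

print(jaccard(candidate, reference))             # 3 shared of 5 total
print(spearman_on_shared(candidate, reference))  # rank agreement on shared
```

Reporting the two numbers separately keeps "does the list contain the right sites" distinct from "does it order them correctly", since a list can score well on one and poorly on the other.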

Cite

Ruth, K., Kumar, D., Wang, B., Valenta, L., & Durumeric, Z. (2022). Toppling Top Lists: Evaluating the Accuracy of Popular Website Lists. In Proceedings of the ACM SIGCOMM Internet Measurement Conference, IMC (pp. 374–387). Association for Computing Machinery. https://doi.org/10.1145/3517745.3561444
