We study the problem of estimating the total volume of queries in a specific domain that were submitted to the Google search engine in a given time period. Our statistical model assumes a Zipf's law distribution of the population in the reference domain, and a nonuniform or noisy sampling of queries. The parameters of the distribution are estimated using nonlinear least-squares regression. Estimates, with associated errors, are then derived for the total number of queries and for the total number of searches (volume). We apply the method to the recipes and cooking domain, where a sample of queries is collected by crawling popular Italian websites specializing in this domain. The relative volumes of the queries in the sample are computed using Google Trends, and transformed to absolute frequencies after estimating a scaling factor. Our model estimates that the Italian recipe and cooking queries with at least 10 monthly searches that were submitted to Google in 2017 amount to a total volume of 7.2B searches.
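As a rough illustration of the kind of fit the abstract describes, the sketch below fits a plain Zipf curve f(r) = C * r^(-s) to rank-ordered frequencies by least squares, then extrapolates the number of queries and the total volume above a minimum frequency. It is not the authors' model: it ignores the nonuniform or noisy sampling and the Google Trends scaling factor. The function names, the synthetic data, and the golden-section search are all assumptions made for this example.

```python
import math

def fit_zipf(freqs, s_lo=0.1, s_hi=3.0, tol=1e-9):
    """Fit f(r) = C * r**(-s) to rank-ordered frequencies by least squares.

    For a fixed exponent s the best C has a closed form, so the nonlinear
    problem reduces to a 1-D golden-section search over s (assumed unimodal).
    """
    ranks = range(1, len(freqs) + 1)

    def best_c(s):
        # Linear least-squares solution for C given s.
        num = sum(f * r ** -s for r, f in zip(ranks, freqs))
        den = sum(r ** (-2 * s) for r in ranks)
        return num / den

    def sse(s):
        # Sum of squared residuals at exponent s, with C profiled out.
        c = best_c(s)
        return sum((f - c * r ** -s) ** 2 for r, f in zip(ranks, freqs))

    phi = (math.sqrt(5) - 1) / 2
    a, b = s_lo, s_hi
    while b - a > tol:
        x1 = b - phi * (b - a)
        x2 = a + phi * (b - a)
        if sse(x1) < sse(x2):
            b = x2
        else:
            a = x1
    s = (a + b) / 2
    return best_c(s), s

def total_volume(c, s, min_freq):
    """Count ranks whose fitted frequency is >= min_freq and sum their volume."""
    n_queries = int((c / min_freq) ** (1 / s))
    volume = sum(c * r ** -s for r in range(1, n_queries + 1))
    return n_queries, volume

# Synthetic, noise-free sample (hypothetical data): C = 1000, s = 1.2.
sample = [1000 * r ** -1.2 for r in range(1, 51)]
c_hat, s_hat = fit_zipf(sample)
n_queries, volume = total_volume(c_hat, s_hat, min_freq=10)
print(f"C = {c_hat:.2f}, s = {s_hat:.4f}, queries = {n_queries}, volume = {volume:.1f}")
```

Profiling out C keeps the search one-dimensional and dependency-free; with real, noisy data a general nonlinear least-squares solver that also reports parameter uncertainties would likely be the better choice.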
Lillo, F., & Ruggieri, S. (2019). Estimating the total volume of queries to Google. In The Web Conference 2019 - Proceedings of the World Wide Web Conference, WWW 2019 (pp. 1051–1060). Association for Computing Machinery, Inc. https://doi.org/10.1145/3308558.3313535