Different Web log studies calculate the same metrics using different search engines logs sampled during different observation periods and processed under different values of two controllable variables peculiar to the Web log analysis: a client discriminator used to exclude clients who are agents and a temporal cut-off used to segment logged client transactions into temporal sessions. How much are the results dependent on these variables? We analyze the sensitivity of the results to two controllable variables. The sensitivity analysis shows significant varying of the metrics values depending on these variables. In particular, the metrics varies up to 30-50% on the commonly assigned values. So the differences caused by controllable variables are of the same order of magnitude as the differences between the metrics reported in different studies. Thus, the direct comparison of the reported results is an unreliable approach leading to artifactual conclusions. To overcome the method-dependency of the direct comparison of the reported results we introduce and use a cross-analysis technique of the direct comparison of logs. Besides, we propose an alternative easy-accessible comparison of the reported metrics, which corrects the reported values accordingly to the controllable variables used in the studies. © Springer-Verlag Berlin Heidelberg 2006.
CITATION STYLE
Buzikashvili, N. (2006). Comparing Web logs: Sensitivity analysis and two types of cross-analysis. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4182 LNCS, pp. 508–513). Springer Verlag. https://doi.org/10.1007/11880592_39
Mendeley helps you to discover research relevant for your work.