Abstract
The paper introduces an alternative method for website analysis that combines two fields of web mining research: the discovery of web users’ behaviour patterns and the discovery of knowledge from website structure. The main objective is to identify web pages whose importance, as estimated by the website developers, does not correspond to how visitors actually perceive them. The paper presents a case study that applies the proposed method to identify suspicious web pages by analysing the expected and observed probabilities of access to them. The expected probabilities were calculated using the PageRank method, and the observed probabilities were obtained from the web server log file. The observed and expected data were compared using residual analysis. The results can be used to identify potential problems with the structure of the examined website.
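The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The link graph, the list of logged page visits, the damping factor of 0.85, and all function names are hypothetical. The residual is taken here as the plain difference between observed and expected probability, which is only one possible reading of "residual analysis".

```python
from collections import Counter


def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [linked pages]}.

    The damping factor and iteration count are assumptions, not values
    taken from the paper.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


def observed_probabilities(visits, pages):
    """Relative access frequency of each known page in a list of logged visits."""
    counts = Counter(v for v in visits if v in pages)
    total = sum(counts.values())
    return {p: counts[p] / total for p in pages}


def residuals(observed, expected):
    """Observed minus expected probability for every page (assumed definition)."""
    return {p: observed[p] - expected[p] for p in expected}


# Hypothetical website structure: page -> pages it links to.
links = {
    "home": ["courses", "contact"],
    "courses": ["home", "contact"],
    "contact": ["home"],
}

# Hypothetical page requests, as they might be extracted from a server log.
visits = ["home", "courses", "courses", "home", "courses", "contact", "courses"]

expected = pagerank(links)
observed = observed_probabilities(visits, set(expected))
diff = residuals(observed, expected)

# List pages from the most over-visited to the most under-visited.
for page, value in sorted(diff.items(), key=lambda item: item[1], reverse=True):
    print(f"{page}: expected={expected[page]:.3f} observed={observed[page]:.3f} residual={value:+.3f}")
```

Under this sketch, a large positive residual marks a page visited more often than its position in the link structure suggests, and a large negative residual marks a page visited less often. Either case would make the page a candidate for closer inspection.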
Cite
Kapusta, J., Munk, M., & Drlík, M. (2014). Analysis of differences between expected and observed probability of accesses to web pages. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8733, 673–683. https://doi.org/10.1007/978-3-319-11289-3_68