Analysis of differences between expected and observed probability of accesses to web pages

Jozef Kapusta; Michal Munk; Martin Drlík

Journal Article

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2014) 8733 673-683

DOI: 10.1007/978-3-319-11289-3_68

5 Citations

7 Readers

Abstract

The paper introduces an alternative method for website analysis that combines two web mining research fields: discovering web users’ behaviour patterns and discovering knowledge from the website structure. The main objective is to identify web pages whose importance, as estimated by the website developers, does not correspond to how visitors actually perceive them. The paper presents a case study that applies the proposed method to identify suspicious web pages by analysing the expected and observed probabilities of accesses to the web pages. The expected probabilities were calculated using the PageRank method, and the observed probabilities were obtained from the web server log file. The observed and expected data were compared using residual analysis. The results can be used to identify potential problems with the structure of the observed website.
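The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the link graph, the access list, the damping factor of 0.85, and the use of a raw residual (observed minus expected probability) are all assumptions made for illustration, and the paper's exact residual definition is not given here.

```python
from collections import Counter


def pagerank(links, damping=0.85, iterations=100):
    """Power-iteration PageRank over a dict {page: [pages it links to]}.

    Every link target must also appear as a key of `links`.
    """
    pages = sorted(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(rank[p] for p in pages if not links[p])
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n
                    for p in pages}
        for p in pages:
            for q in links[p]:
                new_rank[q] += damping * rank[p] / len(links[p])
        rank = new_rank
    return rank


def observed_probabilities(accesses, pages):
    """Relative access frequency of each page in a list of logged hits."""
    counts = Counter(accesses)
    total = sum(counts[p] for p in pages)
    return {p: counts[p] / total for p in pages}


def residuals(observed, expected):
    """Raw residual per page: observed minus expected probability."""
    return {p: observed[p] - expected[p] for p in expected}


# Hypothetical site structure and hypothetical page hits (not from the paper).
links = {
    "home": ["news", "contact"],
    "news": ["home"],
    "contact": ["home"],
    "archive": ["home"],
}
accesses = ["home"] * 50 + ["news"] * 10 + ["contact"] * 5 + ["archive"] * 35

expected = pagerank(links)
observed = observed_probabilities(accesses, sorted(links))
diff = residuals(observed, expected)

# Pages with the largest absolute residual are the candidates for review.
for page in sorted(diff, key=lambda p: abs(diff[p]), reverse=True):
    print(f"{page}: expected={expected[page]:.3f} "
          f"observed={observed[page]:.3f} residual={diff[page]:+.3f}")
```

In this toy data the hypothetical `archive` page has no inbound links, so its expected probability is low, yet it receives many visits. A large positive residual of this kind is the sort of mismatch the abstract calls a suspicious page. In practice the access list would be extracted from a cleaned web server log rather than written by hand.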

Cite

Kapusta, J., Munk, M., & Drlík, M. (2014). Analysis of differences between expected and observed probability of accesses to web pages. Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 8733, 673–683. https://doi.org/10.1007/978-3-319-11289-3_68
