Web usage mining: A survey on preprocessing of web log file

  • Tasawar Hussain
  • Sohail Asghar
  • Nayyer Masood

Web applications are multiplying rapidly, and their user base is growing exponentially. Advances in technology have made it possible to capture users' behavior and interactions with web applications through the web server log file, which is saved as a plain-text (.txt) file. Because the web log contains a large amount of “irrelevant information”, the original log file cannot be used directly in the web usage mining (WUM) procedure, so preprocessing of the web log file becomes imperative. Proper analysis of the web log file helps manage web sites effectively from both the administrators' and the users' perspectives. Web log preprocessing is a necessary first step that improves the quality and efficiency of the later steps of WUM. A number of techniques are available at the preprocessing level, such as data cleaning, data filtering, and data integration. In this paper, we survey these preprocessing techniques to identify open issues and to show how WUM preprocessing can be improved for pattern mining and analysis.
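
As an illustration of the data cleaning step named in the abstract, the following is a minimal Python sketch, not the authors' method. It assumes the log uses the Common Log Format (the abstract does not name a format) and treats requests for images, stylesheets, and scripts, together with non-successful responses, as "irrelevant information". The names `CLF_PATTERN`, `IRRELEVANT_SUFFIXES`, `clean_log`, and the sample entries are hypothetical.

```python
import re

# Assumption: entries follow the Common Log Format; the abstract does not name one.
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) \S+" (?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical choice of resource suffixes regarded as irrelevant for usage mining.
IRRELEVANT_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".js", ".ico")


def clean_log(lines):
    """Return the parsed entries that survive basic data cleaning."""
    kept = []
    for line in lines:
        match = CLF_PATTERN.match(line)
        if match is None:
            continue  # drop malformed lines
        entry = match.groupdict()
        # Ignore any query string when checking the resource type.
        path = entry["url"].split("?", 1)[0].lower()
        if path.endswith(IRRELEVANT_SUFFIXES):
            continue  # drop embedded resources such as images and stylesheets
        if not entry["status"].startswith("2"):
            continue  # drop redirects, client errors, and server errors
        kept.append(entry)
    return kept


# Made-up sample entries, for illustration only.
SAMPLE_LOG = [
    '192.0.2.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '192.0.2.1 - - [10/Oct/2023:13:55:37 +0000] "GET /logo.png HTTP/1.1" 200 512',
    '192.0.2.7 - - [10/Oct/2023:13:56:01 +0000] "GET /missing.html HTTP/1.1" 404 209',
    'not a log line',
]

cleaned = clean_log(SAMPLE_LOG)
print([entry["url"] for entry in cleaned])
```

Which suffixes and status codes count as irrelevant depends on the mining goal, so the filters above are best read as placeholders rather than a fixed rule set.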

Author-supplied keywords

  • Data mining
  • Preprocessing
  • Web usage mining
