Web applications are multiplying rapidly, and their user base is growing exponentially. Advances in technology have made it possible to capture users' interactions with web applications in web server log files. A web log file is stored as a plain-text (.txt) file. Because the web log contains a large amount of irrelevant information, the original log file cannot be used directly in the web usage mining (WUM) procedure; preprocessing the web log file is therefore imperative. Proper analysis of the web log file helps manage web sites effectively from both the administrators' and the users' perspectives. Web log preprocessing is a necessary first step that improves the quality and efficiency of the later steps of WUM. A number of techniques are available at the preprocessing level of WUM, such as data cleaning, data filtering, and data integration. In this paper, we survey these preprocessing techniques to identify open issues and to examine how WUM preprocessing can be improved for pattern mining and analysis.