Interval set clustering of web users with rough K-means

Pawan Lingras; Chad West

Journal Article

Journal of Intelligent Information Systems (2004) 23(1) 5-16

DOI: 10.1023/B:JIIS.0000029668.88665.1a

Abstract

Data collection and analysis in web mining face certain unique challenges. For a variety of reasons inherent in web browsing and web logging, the likelihood of bad or incomplete data is higher than in conventional applications. The analytical techniques in web mining need to accommodate such data. Fuzzy and rough sets provide the ability to deal with incomplete and approximate information. Fuzzy set theory has been shown to be useful in three important aspects of web and data mining, namely clustering, association, and sequential analysis. There is increasing interest in research on clustering based on rough set theory. Clustering is an important part of web mining that involves finding natural groupings of web resources or web users. Researchers have pointed out some important differences between clustering in conventional applications and clustering in web mining. For example, the clusters and associations in web mining do not necessarily have crisp boundaries. As a result, researchers have studied the possibility of using fuzzy sets in web mining clustering applications. Recent attempts have used genetic algorithms based on rough set theory for clustering. However, genetic-algorithm-based clustering may not be able to handle the large amounts of data typical of a web mining application. This paper proposes a variation of the K-means clustering algorithm based on properties of rough sets. The proposed algorithm represents clusters as interval or rough sets. The paper also describes the design of an experiment, including data collection and the clustering process. The experiment is used to create interval set representations of clusters of web visitors.
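
The abstract does not spell out the algorithm, so the following is only a minimal Python sketch of how a K-means loop might represent each cluster as an interval set, that is, a lower approximation (objects that definitely belong) and an upper approximation (objects that possibly belong). The function name `rough_kmeans`, the distance-difference test, and the parameters `threshold` and `w_lower` are assumptions made for illustration and are not taken from the abstract.

```python
import math

def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def rough_kmeans(points, centroids, threshold=0.5, w_lower=0.7, max_iter=100):
    """Hypothetical rough K-means sketch.

    Returns (centroids, lower, upper), where lower[j] and upper[j] are sets
    of point indices forming the interval set for cluster j.
    """
    k = len(centroids)
    centroids = [tuple(c) for c in centroids]
    lower = upper = None
    for _ in range(max_iter):
        lower = [set() for _ in range(k)]
        upper = [set() for _ in range(k)]
        for i, p in enumerate(points):
            dists = [math.dist(p, c) for c in centroids]
            nearest = min(range(k), key=dists.__getitem__)
            # Assumed rule: a cluster is a rival if it is nearly as close
            # as the nearest one (distance difference within threshold).
            rivals = [j for j in range(k)
                      if j != nearest and dists[j] - dists[nearest] <= threshold]
            upper[nearest].add(i)
            if rivals:
                # Ambiguous object: only in upper approximations.
                for j in rivals:
                    upper[j].add(i)
            else:
                # Unambiguous object: also in the lower approximation.
                lower[nearest].add(i)
        updated = []
        for j in range(k):
            core = [points[i] for i in sorted(lower[j])]
            edge = [points[i] for i in sorted(upper[j] - lower[j])]
            if core and edge:
                # Weighted mix of the lower-approximation and boundary means.
                c, e = mean(core), mean(edge)
                updated.append(tuple(w_lower * a + (1 - w_lower) * b
                                     for a, b in zip(c, e)))
            elif core or edge:
                updated.append(mean(core or edge))
            else:
                updated.append(centroids[j])  # empty cluster: keep old centroid
        if updated == centroids:
            break
        centroids = updated
    return centroids, lower, upper

# Two tight groups plus one point midway between them.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 0.5)]
centroids, lower, upper = rough_kmeans(points, [(0.0, 0.0), (10.0, 0.0)])
```

In this toy input the midway point (index 4) is equidistant from both centroids, so it ends up in both upper approximations and in neither lower approximation, which is the kind of non-crisp boundary the abstract describes.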

Cite

Lingras, P., & West, C. (2004). Interval set clustering of web users with rough K-means. Journal of Intelligent Information Systems, 23(1), 5–16. https://doi.org/10.1023/B:JIIS.0000029668.88665.1a