The main task in consensus clustering is to produce an optimal output clustering from a set of input clusterings. Consensus clustering methods based on the co-association matrix are easy to understand and implement. However, they usually incur a high computational cost on large datasets, which restricts their application. We propose a sequential three-way approach that constructs the co-association matrix progressively over multiple stages. In each stage, based on a set of input clusterings, we evaluate how likely two data points are to be associated and, accordingly, divide the set of data-point pairs into three disjoint regions: positive, negative, and boundary. A pair in the positive region receives a definite decision to cluster the two data points together. A pair in the negative region receives a definite decision to separate the two data points into different clusters. For a pair in the boundary region, we do not have sufficient information to make a definite decision, so the decision is deferred to the next stage, where more input clusterings are involved. By making quick decisions in early stages, the overall computational cost of constructing the matrix, and of the consensus clustering itself, may be reduced.
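The staged procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the evaluation function or the thresholds, so this example assumes that a pair is scored by the fraction of input clusterings seen so far that place both points in the same cluster, and that two hypothetical thresholds `alpha` and `beta` delimit the positive and negative regions. The function name, the `stage_sizes` parameter, and the `-1` marker for pairs still in the boundary region are likewise illustrative choices.

```python
import numpy as np


def sequential_three_way_coassociation(labelings, stage_sizes, alpha=0.75, beta=0.25):
    """Build a three-way co-association matrix stage by stage.

    labelings   : (m, n) array, one row of cluster labels per input clustering.
    stage_sizes : how many new input clusterings each stage consumes (assumed).
    alpha, beta : hypothetical thresholds for the positive / negative regions.
    Returns an (n, n) int matrix: 1 = together, 0 = apart, -1 = still undecided.
    """
    labelings = np.asarray(labelings)
    n = labelings.shape[1]
    rows, cols = np.triu_indices(n, k=1)        # every unordered pair i < j
    agree = np.zeros(rows.size)                 # co-clustering counts per pair
    decision = np.full(rows.size, -1)           # all pairs start in the boundary
    used = 0                                    # input clusterings consumed so far

    for size in stage_sizes:
        batch = labelings[used:used + size]
        pending = np.flatnonzero(decision == -1)
        if batch.shape[0] == 0 or pending.size == 0:
            break                               # nothing left to read or decide
        used += batch.shape[0]

        # Only pairs still in the boundary region are re-evaluated.
        i, j = rows[pending], cols[pending]
        agree[pending] += (batch[:, i] == batch[:, j]).sum(axis=0)
        ratio = agree[pending] / used           # assumed evaluation function

        decision[pending[ratio >= alpha]] = 1   # positive region: cluster together
        decision[pending[ratio <= beta]] = 0    # negative region: separate
        # Pairs with beta < ratio < alpha stay deferred to the next stage.

    matrix = np.eye(n, dtype=int)
    matrix[rows, cols] = decision
    matrix[cols, rows] = decision
    return matrix


# Five toy input clusterings of four data points.
toy_labelings = [
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
one_stage = sequential_three_way_coassociation(toy_labelings, stage_sizes=[2])
two_stage = sequential_three_way_coassociation(toy_labelings, stage_sizes=[2, 3])
```

In this toy run, the first stage already settles the pairs on which both of its clusterings agree, and only the remaining boundary pairs are compared again when the second stage brings in three more clusterings. That selective re-evaluation is where the potential saving comes from. How pairs that remain undecided after the last stage should be resolved is left open here.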
Hu, M., Deng, X., & Yao, Y. (2018). A Sequential Three-Way Approach to Constructing a Co-association Matrix in Consensus Clustering. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11103 LNAI, pp. 599–613). Springer Verlag. https://doi.org/10.1007/978-3-319-99368-3_47