Double-Constrained Consensus Clustering with Application to Online Anti-Counterfeiting

Abstract

Semi-supervised consensus clustering is a promising strategy to compensate for the subjectivity of clustering and its sensitivity to design factors, and various techniques have recently been proposed to integrate domain knowledge and multiple clustering partitions. In this article, we present a new approach that makes double use of domain knowledge: to build the initial partitions and to combine them. In particular, we show how to model and integrate must-link and cannot-link constraints into the objective function of a generic consensus clustering ((Formula presented.)) framework that maximizes the similarity between the consensus partition and the input partitions, which have, in turn, been enriched with the same constraints. In addition, borrowing from the theory of functional dependencies, the integrated framework exploits the notions of deductive closure and minimal cover to take full advantage of the logical implication between constraints. Using standard UCI benchmarks, we found that the resulting algorithm, termed (Formula presented.) (double-constrained consensus clustering), was more effective than plain (Formula presented.) at combining base-constrained partitions, with an average performance improvement of 5.54%. We then argue that (Formula presented.) is especially well-suited for profiling counterfeit e-commerce websites, as constraints can be acquired by leveraging specific domain features, and demonstrate its potential for detecting affiliate marketing programs. Taken together, our experiments suggest that (Formula presented.) makes the process of clustering more robust and able to withstand changes in clustering algorithms, datasets, and features, with a remarkable improvement in average performance.
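
The logical implication between constraints mentioned in the abstract can be illustrated with a small sketch. The code below is not the authors' algorithm; it only shows one common reading of the deductive closure of pairwise constraints, assuming that must-link is transitive and that a cannot-link between two items extends to every pair drawn from their must-link groups. The function name `deductive_closure` and the integer item indices are hypothetical choices made for this illustration.

```python
from itertools import combinations


def deductive_closure(n, must_link, cannot_link):
    """Illustrative closure of pairwise constraints over items 0..n-1.

    Assumptions (not taken from the article): must-link is an equivalence
    relation, and a cannot-link between two items applies to every pair
    formed across their must-link groups.
    Returns (ml_closure, cl_closure) as sets of (i, j) pairs with i < j.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Merge items connected by must-link constraints.
    for a, b in must_link:
        parent[find(a)] = find(b)

    # Collect the must-link groups (members are stored in increasing order).
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)

    # Every pair inside a group is an implied must-link.
    ml_closure = set()
    for members in groups.values():
        ml_closure.update(combinations(members, 2))

    # Every pair across two cannot-linked groups is an implied cannot-link.
    cl_closure = set()
    for a, b in cannot_link:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"inconsistent constraints on items {a} and {b}")
        for x in groups[ra]:
            for y in groups[rb]:
                cl_closure.add((min(x, y), max(x, y)))

    return ml_closure, cl_closure


ml, cl = deductive_closure(5, must_link=[(0, 1), (1, 2)], cannot_link=[(2, 3)])
print(sorted(ml))
print(sorted(cl))
```

In this toy input, the two stated must-links imply a third one between items 0 and 2, and the single cannot-link between items 2 and 3 extends to items 0 and 1 as well. A minimal cover would go the other way, keeping only a smallest set of constraints from which the same closure can be derived.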

Cite

Carpineto, C., & Romano, G. (2023). Double-Constrained Consensus Clustering with Application to Online Anti-Counterfeiting. Applied Sciences (Switzerland), 13(18). https://doi.org/10.3390/app131810050
