An enormous amount of e-data is collected worldwide by organizations for research and decision making. The availability of this heterogeneous, sensitive information in e-databases threatens the privacy of the individuals or organizations about whom the data is collected. Privacy Preserving Data Mining [PPDM] is a field of research that concentrates on preserving data privacy during the process of data mining. This paper proposes a two-level partition and perturbation framework that releases multiple copies of privacy-preserved datasets in a Multi Trust Level [MTL] scenario and can prevent linking and diversity attacks. The framework proposes two methods for privacy preservation in an MTL environment, namely Entropy based Attribute Privacy Preservation [EAPP] and Information Gain based Attribute Privacy Preservation [IGAPP]. Both methods perform vertical and horizontal partitioning of the data. The simple K-Means clustering algorithm with cluster size 2, using both Euclidean and Manhattan distance functions, is used for horizontal partitioning. Within each cluster, the attributes are partitioned vertically according to their entropy values, which indicate an attribute's one-way association with its class, in the EAPP method, and according to their Information Gain [IG] values, which indicate the two-way association with the class, in the IGAPP method. The attributes in the clusters are then subjected to a privacy preservation technique chosen on the basis of their entropy and IG values in the EAPP and IGAPP methods, respectively. The effect of the clustering distance function on privacy preservation, and the ability of the privacy-preserved datasets generated by the proposed methods to prevent privacy attacks, are studied using variance, rank distortion and utility metrics. Real-life medical and benchmark Adult data sets are used for experimentation.
The results show that the generated datasets exhibit good variance and rank distortion values and hence can prevent diversity and linking attacks in an MTL environment. In addition, on selected classification and clustering algorithms, the utility of the privacy-preserved datasets is comparable to that of the original and L-Diversified datasets.
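As an illustration of the attribute ranking that drives the vertical partitioning, the following minimal Python sketch computes Shannon entropy and information gain for categorical attributes against a class label. The abstract does not give the exact formulas used by EAPP and IGAPP, so the standard definitions are assumed here; the function names, the toy attributes (`smoker`, `city`) and the class labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def conditional_entropy(attribute, labels):
    """H(labels | attribute): class entropy left after observing the attribute."""
    n = len(attribute)
    groups = {}
    for a, y in zip(attribute, labels):
        groups.setdefault(a, []).append(y)
    return sum(len(g) / n * entropy(g) for g in groups.values())


def information_gain(attribute, labels):
    """Standard information gain: H(labels) - H(labels | attribute)."""
    return entropy(labels) - conditional_entropy(attribute, labels)


# Hypothetical toy cluster: two categorical attributes and a class label.
records = {
    "smoker": ["y", "y", "n", "n"],
    "city": ["a", "b", "a", "b"],
}
labels = ["sick", "sick", "well", "well"]

# Rank attributes by information gain, highest first. An assumed use of
# this ranking is to decide which attributes need stronger perturbation.
ranked = sorted(
    records,
    key=lambda name: information_gain(records[name], labels),
    reverse=True,
)
```

In this toy data, `smoker` determines the class completely while `city` carries no information about it, so `smoker` ranks first. How the paper maps such scores to specific perturbation techniques is not stated in the abstract and is left open here.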
CITATION STYLE
Priyadarsini, R. P., Valarmathi, M. L., & Sivakumari, S. (2015). Attribute association based privacy preservation for multi trust level environment. Sadhana - Academy Proceedings in Engineering Sciences, 40(6), 1769–1792. https://doi.org/10.1007/s12046-015-0412-4