Parallel Ward Clustering for Chemical Compounds Using MapReduce


Abstract

The availability of chemical libraries containing millions of compounds makes identifying similar chemical compounds increasingly challenging. Compounds with similar structures are likely to exhibit similar biological activity, so identifying them is a key step in the drug discovery process. Hierarchical clustering is used for this purpose, and one of the most popular hierarchical clustering algorithms in drug discovery applications is Ward's clustering algorithm. A fundamental problem with previous implementations of this method is their inability to handle large data sets within reasonable time and memory limits. In this paper, the MapReduce framework is used to run Ward's clustering algorithm in parallel. The results show a considerable reduction in computational time: the parallel Ward algorithm saves 17% of the time using 3 map instances and 58% of the time using 6 map instances.
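The abstract does not say how the work is divided among map instances, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes each compound is a numeric descriptor vector, that the search for the cheapest merge is split across `n_maps` chunks of candidate pairs (the map step), and that the per-chunk winners are combined by taking the minimum (the reduce step). The chunks are processed sequentially here; in a real MapReduce job each would run as a separate task. All function names are hypothetical. The merge cost is the standard Ward criterion, the increase in within-cluster sum of squares, |A||B| / (|A| + |B|) times the squared distance between centroids.

```python
from itertools import combinations


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b merge.

    Each cluster is a (size, centroid) pair.
    """
    (na, ca), (nb, cb) = a, b
    d2 = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return na * nb / (na + nb) * d2


def map_best_pair(chunk, clusters):
    """Map step (assumed design): cheapest merge within one chunk of pairs."""
    return min((ward_cost(clusters[i], clusters[j]), i, j) for i, j in chunk)


def ward_mapreduce(points, k, n_maps=3):
    """Agglomerate points into k clusters; returns sorted lists of point indices."""
    clusters = {i: (1, tuple(p)) for i, p in enumerate(points)}
    members = {i: [i] for i in clusters}
    next_id = len(points)
    while len(clusters) > k:
        pairs = list(combinations(sorted(clusters), 2))
        # Split the candidate pairs round-robin across the map tasks.
        chunks = [pairs[m::n_maps] for m in range(n_maps)]
        local_best = [map_best_pair(c, clusters) for c in chunks if c]
        # Reduce step: the global best merge is the minimum of the local ones.
        _, i, j = min(local_best)
        (ni, ci), (nj, cj) = clusters.pop(i), clusters.pop(j)
        n = ni + nj
        # The merged centroid is the size-weighted mean of the two centroids.
        clusters[next_id] = (n, tuple((ni * x + nj * y) / n for x, y in zip(ci, cj)))
        members[next_id] = members.pop(i) + members.pop(j)
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Toy descriptor vectors standing in for compounds (illustrative data only).
points = [(0, 0), (0, 1), (10, 10), (10, 11), (50, 50)]
result = ward_mapreduce(points, k=3, n_maps=3)
print(result)
```

Because the reduce step takes a global minimum, the resulting clusters do not depend on `n_maps`; only the amount of work per map task changes, which is where any time saving would come from.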

Citation (APA)

Malhat, M. G., Mousa, H. M., & El-Sisi, A. B. (2014). Parallel ward clustering for chemical compounds using mapreduce. In Communications in Computer and Information Science (Vol. 488, pp. 258–267). Springer Verlag. https://doi.org/10.1007/978-3-319-13461-1_25
