Succinct multibit tree: Compact representation of multibit trees by using succinct data structures in chemical fingerprint searches

Abstract

Similarity searches in databases of chemical fingerprints are a fundamental task in discovering novel drug-like molecules. The multibit tree is a data structure that enables fast similarity searches of chemical fingerprints (Kristensen et al., WABI'09). However, a standard pointer-based representation of multibit trees consumes a large amount of memory when indexing large-scale fingerprint databases. To make matters worse, the original fingerprint databases must also be kept in memory to filter out false positives. Succinct data structures are compact yet still support fast operations. Many have been proposed so far, and they have been applied in fields such as full-text indexing and genome mapping. We present compact representations of both multibit trees and fingerprint databases by applying these data structures. Experiments showed that our representations used much less memory than the standard pointer-based representation. Moreover, our representations enabled us to perform PubChem-scale similarity searches efficiently. © 2012 Springer-Verlag.
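To give a feel for what "compact and enables fast operations" can mean, here is a minimal sketch of a bit vector that answers rank queries (how many 1 bits precede a position) from precomputed per-block counts. This is only an illustration of one well-known building block of succinct data structures, not the representation used in the paper; the class name `RankBitVector`, its methods, and the 64-bit block size are all assumptions made for the example.

```python
class RankBitVector:
    """Illustrative bit vector with block-level prefix counts for rank queries.

    Hypothetical sketch: names and block size are assumptions, not the paper's.
    """

    BLOCK = 64  # bits packed per block (assumed value)

    def __init__(self, bits):
        self.n = len(bits)
        self.blocks = []   # each block is packed into one integer
        self.prefix = [0]  # prefix[i] = number of 1 bits before block i
        for start in range(0, self.n, self.BLOCK):
            word = 0
            for offset, bit in enumerate(bits[start:start + self.BLOCK]):
                if bit:
                    word |= 1 << offset
            self.blocks.append(word)
            self.prefix.append(self.prefix[-1] + bin(word).count("1"))

    def access(self, i):
        """Return the bit stored at position i."""
        return (self.blocks[i // self.BLOCK] >> (i % self.BLOCK)) & 1

    def rank1(self, i):
        """Return the number of 1 bits in positions [0, i)."""
        block, offset = divmod(i, self.BLOCK)
        if offset == 0:
            return self.prefix[block]
        mask = (1 << offset) - 1  # keep only the first `offset` bits of the block
        return self.prefix[block] + bin(self.blocks[block] & mask).count("1")
```

For example, with `bv = RankBitVector([1, 0, 1, 1, 0])`, the call `bv.rank1(4)` counts the 1 bits among the first four positions. A rank query reads one stored prefix count and one block, so its cost does not grow with the length of the vector, while the extra space is one counter per block.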

Cite

Tabei, Y. (2012). Succinct multibit tree: Compact representation of multibit trees by using succinct data structures in chemical fingerprint searches. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7534 LNBI, pp. 201–213). https://doi.org/10.1007/978-3-642-33122-0_16
