Succinct multibit tree: Compact representation of multibit trees by using succinct data structures in chemical fingerprint searches

Yasuo Tabei

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2012), Vol. 7534 LNBI, pp. 201–213

DOI: 10.1007/978-3-642-33122-0_16

Abstract

Similarity search over databases of chemical fingerprints is a fundamental task in discovering novel drug-like molecules. The multibit tree is a data structure that enables fast similarity searches of chemical fingerprints (Kristensen et al., WABI'09). However, a standard pointer-based representation of multibit trees consumes a large amount of memory when indexing large-scale fingerprint databases. To make matters worse, the original fingerprint database must also be kept in memory to filter out false positives. Succinct data structures are compact yet support fast operations; many have been proposed and applied to fields such as full-text indexing and genome mapping. We present compact representations of both multibit trees and fingerprint databases by applying these data structures. Experiments revealed that memory usage in our representations was much smaller than that of the standard pointer-based representation. Moreover, our representations enabled us to efficiently perform PubChem-scale similarity searches. © 2012 Springer-Verlag.
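
To give a flavor of what "compact and enables fast operations" can mean, the following is a minimal sketch of a rank query over a bit vector, a primitive on which many succinct data structures are built. The abstract does not state which succinct structures the paper actually uses, so this is an illustration only and not the paper's method; the class name `RankBitVector`, the 64-bit block size, and the method names are assumptions introduced here.

```python
class RankBitVector:
    """Illustrative bit vector with rank support (hypothetical name, not from the paper).

    Bits are packed into fixed-size blocks, and a cumulative count of 1 bits
    is stored per block, so a rank query inspects only one block.
    """

    BLOCK = 64  # bits per block; an arbitrary choice for this sketch

    def __init__(self, bits):
        self.n = len(bits)
        self.words = []  # packed blocks, least significant bit first
        self.cum = [0]   # cum[b] = number of 1 bits before block b
        for start in range(0, self.n, self.BLOCK):
            word = 0
            for offset, bit in enumerate(bits[start:start + self.BLOCK]):
                if bit:
                    word |= 1 << offset
            self.words.append(word)
            self.cum.append(self.cum[-1] + bin(word).count("1"))

    def access(self, i):
        """Return the bit stored at position i."""
        return (self.words[i // self.BLOCK] >> (i % self.BLOCK)) & 1

    def rank1(self, i):
        """Return the number of 1 bits in positions [0, i)."""
        if not 0 <= i <= self.n:
            raise IndexError(i)
        block, offset = divmod(i, self.BLOCK)
        if offset == 0:
            return self.cum[block]
        mask = (1 << offset) - 1  # keep only the first `offset` bits of the block
        return self.cum[block] + bin(self.words[block] & mask).count("1")


bv = RankBitVector([1, 0, 1, 1, 0, 0, 1, 0])
print(bv.rank1(4))  # 3: positions 0, 2 and 3 are set
```

Practical succinct libraries use multi-level directories and hardware popcount so that the auxiliary counts occupy only a small fraction of the space of the bits themselves; this sketch trades that refinement for readability.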

Cite

Tabei, Y. (2012). Succinct multibit tree: Compact representation of multibit trees by using succinct data structures in chemical fingerprint searches. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 7534 LNBI, pp. 201–213). https://doi.org/10.1007/978-3-642-33122-0_16
