A content-addressable DNA database with learned sequence encodings

Kendall Stewart; Yuan Jyue Chen; David Ward; Xiaomeng Liu; Georg Seelig; Karin Strauss; Luis Ceze

Conference Proceedings

Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (2018) 11145 LNCS 55-70

DOI: 10.1007/978-3-030-00030-1_4

22 Citations

48 Readers

Abstract

We present strand and codeword design schemes for a DNA database capable of approximate similarity search over a multidimensional dataset of content-rich media. Our strand designs address cross-talk in associative DNA databases, and we demonstrate a novel method for learning DNA sequence encodings from data, applying it to a dataset of tens of thousands of images. We test our design in the wetlab using one hundred target images and ten query images, and show that our database is capable of performing similarity-based enrichment: on average, visually similar images account for 30% of the sequencing reads for each query, despite making up only 10% of the database.
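
The enrichment figures in the abstract can be read as a ratio: similar images receive 30% of the reads while making up 10% of the database, which is roughly a threefold over-representation. The sketch below only illustrates that arithmetic. The function name `fold_enrichment` and its parameters are hypothetical, not taken from the paper, and the paper may define or report enrichment differently.

```python
def fold_enrichment(read_fraction: float, database_fraction: float) -> float:
    """Ratio of a group's share of sequencing reads to its share of the database.

    Hypothetical helper for illustration; values above 1.0 mean the group is
    over-represented among the reads relative to its abundance in the database.
    """
    if not 0.0 < database_fraction <= 1.0:
        raise ValueError("database_fraction must be in (0, 1]")
    if not 0.0 <= read_fraction <= 1.0:
        raise ValueError("read_fraction must be in [0, 1]")
    return read_fraction / database_fraction


# Figures quoted in the abstract: similar images account for 30% of reads
# on average, while making up 10% of the database.
ratio = fold_enrichment(read_fraction=0.30, database_fraction=0.10)
print(f"fold enrichment: {ratio:.1f}x")
```

A ratio of 1.0 would mean that the query retrieved similar images no more often than random sampling from the database would.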

APA

Stewart, K., Chen, Y. J., Ward, D., Liu, X., Seelig, G., Strauss, K., & Ceze, L. (2018). A content-addressable DNA database with learned sequence encodings. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 11145 LNCS, pp. 55–70). Springer Verlag. https://doi.org/10.1007/978-3-030-00030-1_4
