Abstract
Searching and mining nuclear magnetic resonance (NMR) spectra of naturally occurring products is an important task in the investigation of new, potentially useful chemical compounds. We develop a set-based similarity function, which, however, does not sufficiently capture more abstract aspects of similarity. NMR spectra are like documents, but consist of continuous multi-dimensional points instead of words. Probabilistic latent semantic indexing (PLSI) is a retrieval method that learns hidden topics. We develop several mappings from continuous NMR spectra to discrete text-like data. The new mappings introduce redundancies into the discrete data, which proves helpful for the PLSI model applied afterwards. Our experiments show that PLSI, which is designed for text data created by humans, can effectively handle the mapped NMR data originating from natural products. Additionally, PLSI combined with the new mappings is able to find meaningful "topics" in the NMR data. © Springer-Verlag Berlin Heidelberg 2006.
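The idea of mapping continuous peaks to text-like data can be illustrated with a minimal sketch. It is not the mapping used in the paper: the function name `peaks_to_words`, the cell sizes, the grid shifts, and the token format are all assumptions chosen for illustration. Each 2-D peak is assigned to a cell in several overlapping grids, so that one peak yields several "words"; this is one simple way such redundancy could be introduced.

```python
import math
from collections import Counter


def peaks_to_words(peaks, cell=(0.5, 5.0), shifts=((0.0, 0.0), (0.5, 0.5))):
    """Map continuous 2-D peaks to discrete grid-cell 'words'.

    Hypothetical illustration only. `cell` is the assumed cell size per
    dimension; each entry of `shifts` defines one grid, offset by that
    fraction of a cell. Every peak emits one word per grid.
    """
    words = []
    for x, y in peaks:
        for g, (sx, sy) in enumerate(shifts):
            ix = math.floor(x / cell[0] + sx)
            iy = math.floor(y / cell[1] + sy)
            words.append(f"g{g}_{ix}_{iy}")
    return words


# A made-up spectrum with three peaks (values are illustrative, not real data).
spectrum = [(7.26, 128.4), (3.71, 55.2), (1.25, 29.7)]

# Bag-of-words representation that a text model such as PLSI could consume.
document = Counter(peaks_to_words(spectrum))
```

With a single grid, two peaks just either side of a cell boundary would share no word at all; with the shifted second grid they still share one, which is the kind of tolerance the redundancy is meant to provide in this sketch.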
Cite
Wolfram, K., Porzel, A., & Hinneburg, A. (2006). Similarity search for multi-dimensional NMR-spectra of natural products. In Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Vol. 4213 LNAI, pp. 650–658). Springer Verlag. https://doi.org/10.1007/11871637_67