Chemical informatics functionality in R

245Citations
Citations of this article
189Readers
Mendeley users who have this article in their library.

Abstract

The flexibility and scope of the R programming environment has made it a popular choice for statistical modeling and scientific prototyping in a number of fields. In the field of chemistry, R provides several tools for a variety of problems related to statistical modeling of chemical information. However, one aspect common to these tools is that they do not have direct access to the information that is available from chemical structures, such as contained in molecular descriptors. We describe the rcdk package that provides the R user with access to the CDK, a Java framework for cheminformatics. As a result, it is possible to read in a variety of molecular formats, calculate molecular descriptors and evaluate fingerprints. In addition, we describe the rpubchem that will allow access to the data in PubChem, a public repository of molecular structures and associated assay data for approximately 8 million compounds. Currently, the package allows access to structural information as well as some simple molecular properties from PubChem. In addition the package allows access to bio-assay data from the PubChem FTP servers.

Author supplied keywords

Cite

CITATION STYLE

APA

Guha, R. (2007). Chemical informatics functionality in R. Journal of Statistical Software, 18(5), 1–16. https://doi.org/10.18637/jss.v018.i05

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free