BioPandas: Working with molecular structures in pandas DataFrames

  • Raschka S

Abstract

BioPandas is a Python library that reads molecular structures from 3D-coordinate files, such as PDB (H. M. Berman 2000; H. Berman, Henrick, and Nakamura 2003) and MOL2, into pandas DataFrames (McKinney et al. 2010) for convenient data analysis and data mining. Besides parsing protein and small-molecule data into a data frame format, BioPandas provides utility functions for structure analysis. These cover common computations such as the root-mean-square deviation (RMSD) between structures and the conversion of protein structures into primary amino acid sequence formats. Small-molecule functions are also provided for quickly and efficiently reading and parsing millions of small-molecule structures (from multi-MOL2 files (Tripos 2007)) in virtual screening applications. Built-in functions for filtering molecules by the presence of functional groups and by the pairwise distances between those groups make BioPandas a particularly attractive utility library for virtual screening and protein-ligand docking applications.
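To make the two structure-analysis computations named in the abstract concrete, here is a minimal sketch that uses only pandas and NumPy. It is not BioPandas' own implementation. The column names (`atom_name`, `residue_name`, `chain_id`, `residue_number`, `x_coord`, `y_coord`, `z_coord`) are assumed to resemble the atom tables BioPandas produces, while the helpers `rmsd` and `one_letter_sequence`, the `THREE_TO_ONE` table, and the toy coordinates are hypothetical and exist only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical lookup table; it covers only the residues used in this toy example.
THREE_TO_ONE = {"ALA": "A", "GLY": "G", "SER": "S"}

# Assumed coordinate column names, modeled on a PDB-style atom table.
COORD_COLS = ["x_coord", "y_coord", "z_coord"]


def rmsd(df1, df2):
    """RMSD between two atom tables with identical atom order (no superposition)."""
    if len(df1) != len(df2):
        raise ValueError("both DataFrames must contain the same number of atoms")
    diff = df1[COORD_COLS].to_numpy(dtype=float) - df2[COORD_COLS].to_numpy(dtype=float)
    # Mean of the squared per-atom distances, then the square root.
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))


def one_letter_sequence(df):
    """Build a one-letter sequence from the first atom row of each residue."""
    residues = df.drop_duplicates(subset=["chain_id", "residue_number"])
    return "".join(THREE_TO_ONE.get(name, "?") for name in residues["residue_name"])


# Toy atom table: three residues on chain A, four atoms in total.
atoms = pd.DataFrame({
    "atom_name": ["N", "CA", "CA", "CA"],
    "residue_name": ["ALA", "ALA", "GLY", "SER"],
    "chain_id": ["A", "A", "A", "A"],
    "residue_number": [1, 1, 2, 3],
    "x_coord": [0.0, 1.5, 3.0, 4.5],
    "y_coord": [0.0, 0.0, 1.0, 1.0],
    "z_coord": [0.0, 0.0, 0.0, 2.0],
})

# A copy translated by the vector (3, 4, 0), so every atom moves by exactly 5 units.
shifted = atoms.copy()
shifted["x_coord"] += 3.0
shifted["y_coord"] += 4.0

print(rmsd(atoms, shifted))        # 5.0
print(one_letter_sequence(atoms))  # AGS
```

Because the structure lives in an ordinary DataFrame, selecting a subset such as the alpha carbons is a plain boolean filter (`atoms[atoms["atom_name"] == "CA"]`), which is the kind of convenience the abstract describes.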

Cite

Raschka, S. (2017). BioPandas: Working with molecular structures in pandas DataFrames. The Journal of Open Source Software, 2(14), 279. https://doi.org/10.21105/joss.00279
