Data mining of macromolecular structures

Bart van Beusekom; Anastassis Perrakis; Robbie P. Joosten

Book Chapter

Humana Press Inc., (2016), 107-138

DOI: 10.1007/978-1-4939-3572-7_6

Abstract

The use of macromolecular structures is widespread for a variety of applications, from teaching protein structure principles all the way to ligand optimization in drug development. Applying data mining techniques to these experimentally determined structures requires a highly uniform, standardized structural data source. The Protein Data Bank (PDB) has evolved over the years toward becoming the standard resource for macromolecular structures. However, the process of selecting the data most suitable for specific applications is still very much based on personal preferences and understanding of the experimental techniques used to obtain these models. In this chapter, we will first explain the challenges with data standardization, annotation, and uniformity in the PDB entries determined by X-ray crystallography. We then discuss the specific effect that crystallographic data quality and model optimization methods have on structural models and how validation tools can be used to make informed choices. We also discuss specific advantages of using the PDB_REDO databank as a resource for structural data. Finally, we will provide guidelines on how to select the most suitable protein structure models for detailed analysis and how to select a set of structure models suitable for data mining.

van Beusekom, B., Perrakis, A., & Joosten, R. P. (2016). Data mining of macromolecular structures. In Methods in Molecular Biology (Vol. 1415, pp. 107–138). Humana Press Inc. https://doi.org/10.1007/978-1-4939-3572-7_6
