Data mining of macromolecular structures

9Citations
Citations of this article
20Readers
Mendeley users who have this article in their library.
Get full text

Abstract

The use of macromolecular structures is widespread for a variety of applications, from teaching protein structure principles all the way to ligand optimization in drug development. Applying data mining techniques on these experimentally determined structures requires a highly uniform, standardized structural data source. The Protein Data Bank (PDB) has evolved over the years toward becoming the standard resource for macromolecular structures. However, the process selecting the data most suitable for specific applications is still very much based on personal preferences and understanding of the experimental techniques used to obtain these models. In this chapter, we will first explain the challenges with data standardization, annotation, and uniformity in the PDB entries determined by X-ray crystallography. We then discuss the specific effect that crystallographic data quality and model optimization methods have on structural models and how validation tools can be used to make informed choices. We also discuss specific advantages of using the PDB_REDO databank as a resource for structural data. Finally, we will provide guidelines on how to select the most suitable protein structure models for detailed analysis and how to select a set of structure models suitable for data mining.

Cite

CITATION STYLE

APA

van Beusekom, B., Perrakis, A., & Joosten, R. P. (2016). Data mining of macromolecular structures. In Methods in Molecular Biology (Vol. 1415, pp. 107–138). Humana Press Inc. https://doi.org/10.1007/978-1-4939-3572-7_6

Register to see more suggestions

Mendeley helps you to discover research relevant for your work.

Already have an account?

Save time finding and organizing research with Mendeley

Sign up for free