Scoring functions

Luca A. Fenu; Richard A. Lewis; Andrew C. Good; Michael Bodkin; Jonathan W. Essex

Book Chapter

Springer Netherlands, (2007), 223-245

DOI: 10.1007/1-4020-4407-0_9

Abstract

In medicinal chemistry, the potency of a drug is often characterised by its association constant with the protein target, which is in turn related to the free energy of binding. There are currently a number of ways to estimate this value. The most physically realistic combine simulation methods such as molecular dynamics (MD) or Monte Carlo (MC) with statistical mechanics tools such as free energy perturbation (FEP) (Kollman 1993), which extract the relevant information from the simulation trajectories. However, these methods are expensive and, even with the modern availability of cheap supercomputing power, require running times on the order of days per ligand.

Docking is the main instrument in structure-based virtual screening (SVS), where datasets of hundreds of thousands or even millions of lead-like molecules are 'screened' against a protein target to allow a subset to be identified for future synthesis and/or testing. It is worth mentioning that not all virtual screening is structure-based, as the parallel field of ligand-based virtual screening (LVS) is also well developed. Where there are known ligands for a protein but no information is available on the three-dimensional structure of the protein, LVS is the obvious choice. LVS makes use of various descriptors, which can be one-dimensional (for example, molecular weight), two-dimensional (such as a molecule's substructure) or three-dimensional (such as electronic or shape features). By means of these descriptors, LVS is able to select molecules which have chemical features similar to known ligands, and are therefore likely to bind to the same protein. The interested reader can find more information in the reviews by Kubinyi (Kubinyi 1997a; Kubinyi 1997b).

In pharmaceutical research, virtual screening experiments have a runtime of between a few days and a few weeks.
Even with a dedicated Linux cluster or other high-performance computation facilities, the time available to consider each ligand is limited to a few minutes at most. Docking fits nicely in this timeframe, although most algorithms can give more accurate results at the expense of longer computation time, essentially spent on more exhaustive sampling. Although calculating accurate binding affinities for the docked structures, using for example the simulation methods described above, is desirable in principle, our primary interest is in obtaining a ranking of the binding affinities, allowing us to prioritise testing and synthesis of the most promising molecules.

Docking is customarily divided into two parts. During the first, we sample the conformational space of a ligand/protein complex, trying to identify the 'true' mode of binding for the ligand. To achieve this, a great number of conformations are generated for each molecule under consideration. These are then 'posed' inside the receptor pocket in various orientations. A function that returns an energy value for each three-dimensional structure is used to estimate the interaction between ligand and receptor, and is called a scoring function. Orientations which score well are kept, whereas the remainder are dropped. In the second phase, the different poses, belonging to the same or different ligands, are ordered according to their computed score. The scoring function used here may be more elaborate than that used in the first phase. In general, we expect the resulting scores to be correlated with binding free energies. Further information regarding small molecule docking algorithms and programs may be found in this book, or in other reviews (Halperin et al 2002; Sotriffer et al 2004; Taylor et al 2002). Scoring functions obviously play an important role in both phases of docking.
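The two-phase procedure described above can be sketched roughly as follows. This is only an illustrative outline, not the workflow of any particular docking program: the names `select_poses`, `rank_ligands`, `fast_score` and `final_score` are hypothetical, poses are treated as opaque objects, and scores are assumed to be energy-like (lower is better).

```python
from typing import Any, Callable, Dict, List, Sequence, Tuple

Pose = Any  # placeholder for one ligand conformation plus orientation
ScoreFn = Callable[[Pose], float]  # assumed energy-like: lower is better


def select_poses(poses: Sequence[Pose], fast_score: ScoreFn, keep: int) -> List[Pose]:
    """Phase 1: keep the `keep` best-scoring poses of one ligand, drop the rest."""
    return sorted(poses, key=fast_score)[:keep]


def rank_ligands(
    candidates: Dict[str, Sequence[Pose]],
    fast_score: ScoreFn,
    final_score: ScoreFn,
    keep: int = 3,
) -> List[Tuple[str, float]]:
    """Phase 2: rescore the retained poses and order ligands by their best score."""
    best: Dict[str, float] = {}
    for ligand, poses in candidates.items():
        retained = select_poses(poses, fast_score, keep)
        if retained:  # skip ligands for which no pose was generated
            # The second scoring function may be more elaborate than the first.
            best[ligand] = min(final_score(pose) for pose in retained)
    return sorted(best.items(), key=lambda item: item[1])
```

Keeping the two scoring functions as separate arguments mirrors the point that the function used for ranking need not be the one used to drive sampling.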
In the first phase, the scoring function drives the conformational and orientational sampling toward the minimum of the underlying energy surface. Because of this, we would like to have a function able to select the right binding modes and to allow us to explore the search space efficiently. Ideally, in the second phase, to obtain a correct ranking, we would like the scores returned to correlate well with experimental free energies of binding. Unfortunately, although the sampling and selection of poses is generally handled with a remarkable degree of accuracy, the final ranking of the ligands is more difficult. This is partly because the insertion of a ligand can be effectively modelled with simple functions that take into account shape complementarity and some directional interactions, such as hydrogen bonds, whereas free energies of binding depend strongly on competition with the solvent. Some consideration of the entropy changes on ligand binding is also needed. Because of this, the use of two different scoring functions during the two phases is often advocated.

The components required for a scoring function can be developed by considering the fundamental physics of intermolecular interactions (Atkins 1998; Leach 2001; Ajay and Murcko 1995). The important electrostatic interactions between the charges on the protein and ligand (Sotriffer et al 2004) may be modelled using a simple Coulomb expression, moderated by dielectric screening arising from any intervening molecules. For more accurate work, a full multipole expansion of the electrostatics may be included. Repulsion/dispersion interactions may be treated using either a simple Lennard-Jones 12-6 potential or the more accurate Buckingham potential, which incorporates an exponential repulsion term. Hydrogen bonding is the result of the interaction between an electronegative atom (acceptor) and a hydrogen atom covalently bound to another electronegative atom (donor).
The hydrogen atom possesses a considerable positive partial charge, favouring electrostatic interactions with the acceptor. There may, however, be an additional covalent contribution to the hydrogen bond. Hydrogen bonds are often credited with giving specificity to the binding process, since in general, all hydrogen bonding sites in a protein-ligand complex should be satisfied for optimum binding to be observed (Bohm and Klebe 1996).

Interactions involving π electron systems, including π-stacking and those involving cations or hydrogen bond donors, may be modelled using electrostatic and repulsion-dispersion approximations, although charge transfer effects may also need to be incorporated. The hydrophobic effect is commonly invoked to explain the preferential association between non-polar molecules, or areas of molecules, to minimize water contact. It reflects a complicated interplay of enthalpic and entropic effects involving the molecules and the aqueous solvent.

This list of intermolecular forces is not, of course, exhaustive, as many other factors can play a role in binding. One example is the internal energy of the ligand, which, in order to bind, may have to assume a conformation different from the one it holds in solution. There are examples of studies that identify the most stable conformation of free and bound species to evaluate the free energy of binding, via a statistical-mechanically correct weighting of the conformational ensemble (Mardis et al 2001; Luo and Gilson 2000). This approach is too expensive for SVS, so scoring functions usually approximate this contribution with rule-based dihedral counts or internal van der Waals-like potentials (Giodanetto et al 2004; Jones et al 1997).

Different approaches to scoring have led to different kinds of scoring functions, which are usually classified in a tree-like structure, with three principal branches, plus one more 'hybrid' type.
The first three branches are those of force-field, empirical and knowledge-based scoring functions. For each of these, we will give a brief explanation of the rationale, the problems encountered so far, and the most recent attempts at solving these problems. For detailed information on the actual implementation of these approaches, the reader is referred to the original papers, to a number of excellent reviews (Halperin et al 2002; Sotriffer et al 2004) or to comparisons (Wang et al 2004; Wang et al 2003).

© 2007 Springer.
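As a minimal sketch of the pairwise physical terms named in the abstract (a Coulomb expression with dielectric screening, a Lennard-Jones 12-6 potential and a Buckingham potential), the following may help fix ideas. All function names, the constant dielectric of 4.0, and the single shared `epsilon` and `r_min` parameters are illustrative assumptions, not values taken from the chapter; real force fields use atom-type-specific parameters.

```python
import math

# Approximate conversion factor so that charges in units of e and distances
# in Å give energies in kcal/mol.
COULOMB_CONSTANT = 332.06


def coulomb(q1, q2, r, dielectric=4.0):
    """Coulomb energy of two point charges, screened by a constant dielectric."""
    return COULOMB_CONSTANT * q1 * q2 / (dielectric * r)


def lennard_jones(r, epsilon, r_min):
    """Lennard-Jones 12-6 energy with well depth `epsilon` at separation `r_min`."""
    ratio6 = (r_min / r) ** 6
    return epsilon * (ratio6 * ratio6 - 2.0 * ratio6)


def buckingham(r, a, b, c):
    """Buckingham energy: exponential repulsion plus an r^-6 dispersion term."""
    return a * math.exp(-b * r) - c / r ** 6


def interaction_score(ligand, protein, epsilon=0.1, r_min=3.8, dielectric=4.0):
    """Sum Coulomb and Lennard-Jones terms over all ligand-protein atom pairs.

    Each atom is a tuple (x, y, z, charge). Using one epsilon and r_min for
    every pair is a deliberate simplification for illustration only.
    """
    total = 0.0
    for *xyz_l, q_l in ligand:
        for *xyz_p, q_p in protein:
            r = math.dist(xyz_l, xyz_p)
            total += coulomb(q_l, q_p, r, dielectric)
            total += lennard_jones(r, epsilon, r_min)
    return total
```

Such a sum captures only the electrostatic and repulsion/dispersion contributions; solvent competition and entropy changes, which the abstract identifies as important for ranking, are not represented here.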

Citation (APA)

Fenu, L. A., Lewis, R. A., Good, A. C., Bodkin, M., & Essex, J. W. (2007). Scoring functions. In Structure-Based Drug Discovery (pp. 223–245). Springer Netherlands. https://doi.org/10.1007/1-4020-4407-0_9
