Biomolecular Interactions and DOCK

By Ana Quinones ana.quinones@yale.edu

Biomolecular interactions are at the center of all regulatory and metabolic processes. The structure of a molecule gives clues not only to its function but also to its role in biomolecular interactions. Docking refers to the way in which two molecules, for example a ligand and a receptor, fit together or interact. As the database of known molecular structures continues to grow, computer-aided analysis of their interactions becomes increasingly important. Greater processing power has made the analysis and prediction of molecular docking more manageable [2]. Consequently, automated prediction of molecular interactions has emerged as a powerful tool in drug discovery and development [4]. This paper focuses on DOCK, one of the more widely used computational methods.

Kuntz et al. designed the DOCK programs to find favorable orientations of a ligand in a receptor [5]. Receptors may be enzymes with a well-defined active site, or macromolecules such as structural proteins or nucleic acid strands. The starting point of all docking calculations is generally the crystal structure of the target macromolecule (receptor) [2]. The ligand structure may be taken from the crystal structure of the enzyme-ligand complex or from a database of compounds, such as the Cambridge Crystallographic Database [4]. Determination of the ligand orientation in a receptor site can be divided into three major areas:

(1) Identification of possible binding sites,

(2) Matching of the receptor and ligand, and

(3) Optimization of ligand position.

First, a potential site of interest on the receptor is identified. Frequently this is the active site, and it is often specified beforehand. Within this site, putative ligand positions are identified. DOCK identifies points, called sphere centers, by generating a set of overlapping spheres that fill the site [7]. Each sphere is represented by the coordinates of its center and its radius. The sphere centers attempt to capture the shape characteristics of the active site, or site of interest, with a minimum number of points and without the bias of previously known ligand binding modes [3]. Using spheres limits the enormous number of possible orientations within the active site. Like ligand atoms, the spheres touch the surface of the molecule but do not intersect it. The spheres may, however, intersect one another, since their volumes overlap. Regions where many spheres overlap are either pockets on the receptor or protrusions on the ligand [5].
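The sphere description above can be illustrated with a small sketch. It is not DOCK's sphere-generation code; the `Sphere` class and the helper names are hypothetical and only show how a sphere is stored (center plus radius), how overlap between spheres can be tested, and how the sphere centers are extracted for the later matching step.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Sphere:
    """Hypothetical container: a sphere is its center coordinates and radius."""
    center: tuple
    radius: float


def overlaps(a, b):
    """True when the volumes of the two spheres intersect."""
    return dist(a.center, b.center) < a.radius + b.radius


def overlap_counts(spheres):
    """For each sphere, count how many of the other spheres it overlaps."""
    return [
        sum(overlaps(s, t) for t in spheres if t is not s)
        for s in spheres
    ]


def sphere_centers(spheres):
    """Only the centers are passed on to the matching step."""
    return [s.center for s in spheres]


site = [
    Sphere((0.0, 0.0, 0.0), 1.5),
    Sphere((2.0, 0.0, 0.0), 1.0),
    Sphere((10.0, 0.0, 0.0), 1.0),
]
print(overlap_counts(site))
```

In this toy site the first two spheres overlap each other and the third is isolated, so a high overlap count marks the more densely filled region.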

Second, the sphere centers are matched, or paired, with ligand atoms to orient the ligand within the active site [7]. Only the coordinates of the sphere centers are used. Comparing inter-sphere distances with the corresponding inter-atomic distances generates sets of sphere-atom pairs. Sphere i is paired with atom I if and only if, for every sphere j and its paired atom J already in the set,

$|d_{ij} - d_{IJ}| < \epsilon$

where $d_{ij}$ is the distance between sphere i and sphere j, $d_{IJ}$ is the distance between atom I and atom J, and $\epsilon$ is a small user-defined tolerance [5]. These comparisons generate many sets of sphere-atom pairs, each containing only a small number of pairs [8]. The number of possible sets is limited by a longest-distance heuristic [9]: long inter-sphere distances must be approximately equal to the corresponding long inter-atomic ligand distances [8]. Each set of sphere-atom pairs, often referred to as a match, determines an orientation of the ligand within the site of interest [2]. The translation vector and rotation matrix that minimize the root-mean-square deviation (rmsd) between the transformed ligand atoms and their matching sphere centers are calculated and used to orient the entire ligand within the active site [7].
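The distance criterion can be sketched in a few lines of Python. This is only an illustration of the pairing rule: the function names are hypothetical, the enumeration is a brute-force search over small sets, and it does not implement the longest-distance heuristic or any other pruning that DOCK uses.

```python
from itertools import combinations, permutations
from math import dist


def is_consistent_match(pairs, sphere_centers, atom_coords, tol):
    """Check the rule |d_ij - d_IJ| < tol for every two pairs in the set.

    pairs is a list of (sphere_index, atom_index) tuples.
    """
    for (i, I), (j, J) in combinations(pairs, 2):
        d_ij = dist(sphere_centers[i], sphere_centers[j])
        d_IJ = dist(atom_coords[I], atom_coords[J])
        if abs(d_ij - d_IJ) >= tol:
            return False
    return True


def enumerate_matches(sphere_centers, atom_coords, size, tol):
    """Brute-force listing of all consistent sets with `size` pairs (assumption:
    exhaustive search is acceptable only for tiny examples like this one)."""
    matches = []
    for s_idx in combinations(range(len(sphere_centers)), size):
        for a_idx in permutations(range(len(atom_coords)), size):
            pairs = list(zip(s_idx, a_idx))
            if is_consistent_match(pairs, sphere_centers, atom_coords, tol):
                matches.append(pairs)
    return matches


# Toy data: the ligand atoms are a translated copy of the sphere centers.
centers = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
atoms = [(1.0, 1.0, 1.0), (4.0, 1.0, 1.0), (1.0, 5.0, 1.0)]
print(enumerate_matches(centers, atoms, size=3, tol=0.5))
```

Because the three points form a 3-4-5 triangle with no two sides equal, only the pairing that preserves every distance passes the test. A larger tolerance admits more matches, and therefore more candidate orientations to score.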

The ligand orientation is then evaluated with a shape scoring function and/or a binding energy function. Since all evaluations are done on scoring grids, overall computational time is minimized [8]. The receptor contributions to the score, which would otherwise be repetitive and time-consuming to compute, are calculated only once and stored at each grid point [9]. The appropriate terms are then simply retrieved from memory. The shape scoring function is an empirical function resembling the van der Waals attractive energy [5]. To generate the shape score, the receptor terms from the grid point nearest to each non-hydrogen ligand atom are summed. In other words, the shape score is determined simply by the position of each ligand atom on the shape scoring grid [8]. The ligand-enzyme binding energy (E) is taken to be approximately the sum of the van der Waals repulsive (A), van der Waals dispersive (B), and Coulombic electrostatic (q) energies. The formula is

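$E = \sum_i \left[ A_i \sum_j \frac{A_j}{r_{ij}^{12}} - B_i \sum_j \frac{B_j}{r_{ij}^{6}} + K q_i \sum_j \frac{q_j}{\epsilon r_{ij}} \right]$

where the outer sum runs over ligand atoms i, the inner sums run over receptor atoms j, $r_{ij}$ is the distance between atoms i and j, K is a constant that converts the electrostatic term to energy units, and $\epsilon$ here denotes the dielectric constant.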
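As a rough illustration, the energy expression can be evaluated directly, without any grid, as a double loop over ligand and receptor atoms. The sketch below makes several assumptions that are not stated in the text: the van der Waals terms use the standard exponents 12 and 6, the conversion constant is taken as 332 (appropriate for kcal/mol with distances in angstroms and charges in electron units), and each atom is a plain dictionary with hypothetical keys `xyz`, `A`, `B`, and `q`.

```python
from math import dist

COULOMB_K = 332.0  # assumed unit-conversion constant for the electrostatic term


def interaction_energy(ligand, receptor, dielectric=1.0):
    """Sum the A, B, and electrostatic terms over all ligand-receptor atom pairs.

    Each atom is a dict with keys 'xyz', 'A', 'B', 'q' (hypothetical layout).
    """
    total = 0.0
    for li in ligand:
        rep = disp = elec = 0.0
        # Inner sums over receptor atoms j for this ligand atom i.
        for rj in receptor:
            r = dist(li["xyz"], rj["xyz"])
            rep += rj["A"] / r**12
            disp += rj["B"] / r**6
            elec += rj["q"] / (dielectric * r)
        total += li["A"] * rep - li["B"] * disp + COULOMB_K * li["q"] * elec
    return total


ligand = [{"xyz": (0.0, 0.0, 0.0), "A": 0.0, "B": 0.0, "q": 1.0}]
receptor = [{"xyz": (2.0, 0.0, 0.0), "A": 0.0, "B": 0.0, "q": -1.0}]
print(interaction_energy(ligand, receptor))
```

The inner sums depend only on the receptor and on the position of the ligand atom, which is what makes it possible to precompute them once and reuse them for every ligand orientation.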

The usual molecular mechanics van der Waals terms are approximated for use on a grid. The ligand atom terms are combined either with the receptor terms from the nearest grid point or with interpolated receptor values at a "virtual" grid point to generate the energy score [8]. The score is the sum of these combined terms over all ligand atoms. The energy score thus depends on both the ligand atom types and the ligand atom positions on the energy grids [3]. As a final step in the energy scoring scheme, the orientation of the ligand may be varied slightly to minimize the energy score. That is, after the initial orientation and scoring of the ligand, a grid-based rigid-body simplex minimization is used to locate the nearest local energy minimum [5]. This refinement is useful because the sphere centers are only approximations to possible atom locations, so the orientations generated by the sphere-atom pairing, although reasonable, may not be minimal in energy [4].
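The grid idea can be sketched as follows: the receptor sums are computed once at every grid point, and scoring a ligand orientation then reduces to a lookup at the grid point nearest to each ligand atom. This is a minimal sketch under stated assumptions, not DOCK's grid code. The exponents 12 and 6, the constant 332, the dictionary-based atom layout, the distance floor that avoids division by zero, and all function names are assumptions, and interpolation between grid points is omitted.

```python
from math import dist

COULOMB_K = 332.0  # assumed unit-conversion constant for the electrostatic term


def build_grids(receptor, origin, spacing, shape, dielectric=1.0):
    """Precompute the three receptor sums at every grid point (done only once)."""
    grids = {}
    for ix in range(shape[0]):
        for iy in range(shape[1]):
            for iz in range(shape[2]):
                p = tuple(o + i * spacing for o, i in zip(origin, (ix, iy, iz)))
                rep = disp = elec = 0.0
                for atom in receptor:
                    # Floor on the distance so a grid point sitting on an atom
                    # does not divide by zero (an assumption of this sketch).
                    r = max(dist(p, atom["xyz"]), 1e-3)
                    rep += atom["A"] / r**12
                    disp += atom["B"] / r**6
                    elec += atom["q"] / (dielectric * r)
                grids[(ix, iy, iz)] = (rep, disp, elec)
    return grids


def nearest_index(xyz, origin, spacing, shape):
    """Index of the grid point nearest to xyz, clamped to the grid bounds."""
    return tuple(
        min(max(round((x - o) / spacing), 0), n - 1)
        for x, o, n in zip(xyz, origin, shape)
    )


def grid_energy(ligand, grids, origin, spacing, shape):
    """Energy score: combine each ligand atom's terms with the stored receptor sums."""
    total = 0.0
    for atom in ligand:
        rep, disp, elec = grids[nearest_index(atom["xyz"], origin, spacing, shape)]
        total += atom["A"] * rep - atom["B"] * disp + COULOMB_K * atom["q"] * elec
    return total
```

Once `build_grids` has run, `grid_energy` can be called for every candidate orientation without revisiting the receptor atoms. The price is an error that grows with the grid spacing, which is why interpolated values and a final minimization step are useful.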

DOCK has been successful in generating lead compounds for a number of biological targets, including human immunodeficiency virus type 1 (HIV-1) protease [4], hemagglutinin [1], and thymidylate synthase [10]. The most recent version of DOCK (4.0) has not yet been fully tested, but it was designed to account for ligand flexibility [6]. This allows DOCK to search extensively through the possible matches for each entry in the database, which should further aid the understanding of biomolecular interactions. Structure-based strategies continue to show encouraging signs that they may have a large impact on the development of therapeutic agents.

References

[1] Hoffman, L.R., Kuntz, I.D. and White, J.M. Structure-based identification of an inducer of the low-pH conformational change in the influenza virus hemagglutinin: irreversible inhibition of infectivity. Journal of Virology 71: 8808-8820, 1997.

[2] Joseph-McCarthy, D. Computational approaches to structure-based ligand design. Pharmacology and Therapeutics 84: 179-191, 1999.

[3] Knegtel, R.M.A., Kuntz, I.D. and Oshiro, C.M. Molecular docking to ensembles of protein structures. Journal of Molecular Biology 266: 424-440, 1997.

[4] Kuntz, I.D. Structure-based strategies for drug design and discovery. Science 257: 1078-1082, 1992.

[5] Kuntz, I.D., Blaney, J.M., Oatley, S.J., Langridge, R. and Ferrin, T.E. A geometric approach to macromolecule-ligand interactions. Journal of Molecular Biology 161: 269-288, 1982.

[6] Makino, S. and Kuntz, I.D. Automated flexible ligand docking method and its application for database search. Journal of Computational Chemistry 18: 1812-1825, 1997.

[7] Meng, E.C., Gschwend, D.A., Blaney, J.M. and Kuntz, I.D. Orientational sampling and rigid-body minimization in molecular docking. Proteins 17(3): 266-278, 1993.

[8] Meng, E.C., Shoichet, B.K. and Kuntz, I.D. Automated docking with grid-based energy evaluation. Journal of Computational Chemistry 13: 505-524, 1992.

[9] Shoichet, B.K., Bodian, D.L. and Kuntz, I.D. Molecular docking using shape descriptors. Journal of Computational Chemistry 13(3): 380-397, 1992.

[10] Shoichet, B.K., Stroud, R.M., Santi, D.V., Kuntz, I.D. and Perry, K.M. Structure-based discovery of inhibitors of thymidylate synthase. Science 259: 1445-1450, 1993.

Informational web sites on Docking

http://www.cmpharm.ucsf.edu/kuntz/main.html

http://cmgm.stanford.edu/biochem218/20Dock.pdf