Surface Area

Why calculate?
Types: A.S., M.S., H.S.
How to calculate A.S.

Volume

Why calculate?
Voronoi construction for calculating volumes
How to do this?
Problems applying it to proteins
Richards' solution
Problems with his solution and ways of dealing with this

Important Themes

Nitty gritty of how to do calculations in 3D. How to calculate with planes, lines, and so forth. This is generally useful.
Case study of issues in taking off-shelf "CS methods" and applying them to biology

Why calculate?

Protein is solid object. Surface is where action takes place.
Surface useful for docking and drug-design
Hydrophobic energy proportional to surface area

Various Types of Protein Surfaces

Accessible Surface
Molecular Surface
Hydration Surface

Roll sphere (water) on surface and look at locus of sphere centers.

Usually represented as a dot surface

Not smooth and continuously differentiable (relevant for energy calculations)

Lee & Richards algorithm (first method, 1971)

Pick an arbitrary direction from which to view the protein. Slice it into many sections perpendicular to this direction.
In each section, cycle over all the atoms. Each atom is represented as a sphere whose radius is the sum of its VDW radius and that of a probe solvent -- i.e. 1.4 Å for water.
For each atom determine the circle corresponding to the intersection of this sphere with the sectioning plane. Remove all parts (i.e. arcs) of this circle occluded by the circles of other atoms.
Multiply the total non-occluded arc length by the sectioning width to get the atom's surface area in that section. Sum over all atoms and all sections to get total area.

Shrake & Rupley algorithm (easier)

Surround each atom with a sphere of uniformly spaced dots (e.g. 92).
Remove dots contained in other atoms' spheres. The number of remaining dots is proportional to the accessible surface area.
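The dot-counting idea above can be sketched in a few lines. This is a minimal illustration, not the published implementation: the golden-spiral dot layout, the function names, and the (x, y, z, vdw_radius) input format are all assumptions made for the example. Each atom's sphere has its VDW radius plus the probe radius, as in the Lee & Richards description.

```python
import math

def sphere_dots(n):
    """Roughly uniform dots on the unit sphere (golden-spiral layout, an assumption)."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    dots = []
    for k in range(n):
        z = 1.0 - (2.0 * k + 1.0) / n
        rho = math.sqrt(1.0 - z * z)
        phi = k * golden
        dots.append((rho * math.cos(phi), rho * math.sin(phi), z))
    return dots

def accessible_area(atoms, probe=1.4, n_dots=92):
    """atoms: list of (x, y, z, vdw_radius). Returns one accessible area per atom."""
    unit = sphere_dots(n_dots)
    areas = []
    for i, (xi, yi, zi, ri) in enumerate(atoms):
        big_r = ri + probe
        exposed = 0
        for ux, uy, uz in unit:
            px, py, pz = xi + big_r * ux, yi + big_r * uy, zi + big_r * uz
            buried = False
            for j, (xj, yj, zj, rj) in enumerate(atoms):
                if j == i:
                    continue
                rj_ext = rj + probe
                # A dot is removed if it lies inside another atom's expanded sphere.
                if (px - xj) ** 2 + (py - yj) ** 2 + (pz - zj) ** 2 < rj_ext * rj_ext:
                    buried = True
                    break
            if not buried:
                exposed += 1
        # Each surviving dot stands for an equal share of the sphere's area.
        areas.append(4.0 * math.pi * big_r * big_r * exposed / n_dots)
    return areas
```

The inner loop over all atoms is quadratic in the number of atoms; a real program would restrict it to nearby atoms.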

Problem with accessible surface: it has sharp cusps.

Solution: the smooth molecular surface.

M.S. = contact surface + re-entrant surface

C.S. = points of tangency between probe sphere and protein when probe sphere is only touching one atom
R.S. = inward-facing part of the probe sphere surface when the probe is simultaneously tangent to two or more protein atoms

First proposed by Richards, but hard to calculate. First numeric calc. by Connolly. Later analytic calculation by Connolly.

Analytic version is continuously differentiable.

Problem with molecular surface: Water does not really roll on protein surface like a sphere. This is treating water chemically like liquid argon.

Solution: calculate actual locus of real water positions in a simulation and fit a surface through the mean position of the second hydration shell.

Why Calculate?

Protein interiors are tightly packed, fitting together like a jig-saw puzzle.
Because of tight packing the various types of protein residues and atoms occupy well-defined amounts of space.
Tight packing is a driving force in ligand binding and is essential in the specificity of various recognition processes (e.g. antibodies and antigens)
Protein packing is interesting because other molecules have very different types of packing. For instance, water structure is dominated by H-bonding rather than tight packing.

In 1908 Voronoi found a way of partitioning all space amongst a collection of points using specially constructed polyhedra. Here we refer to a collection of "atom centers" rather than "points."

In 3D, each atom is surrounded by a unique limiting polyhedron such that all points within an atom's polyhedron are closer to this atom than all other atoms.

Likewise, points equidistant from 2 atoms form planes (lines in 2D). Those equidistant from 3 atoms form lines, and those equidistant from 4 centers form vertices.

If Voronoi polyhedra are constructed around atoms in a periodic system, such as in a crystal, all the volume in the unit cell will be apportioned to the atoms. There will be no gaps or cavities as there would be if one, for instance, simply drew spheres around the atoms.

An atom's packing efficiency is the volume of its VDW sphere divided by its Voronoi volume.
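As a worked form of this ratio, here is a small sketch. The function name is hypothetical, and any radius or volume passed in is up to the caller; nothing here is a measured value.

```python
import math

def packing_efficiency(vdw_radius, voronoi_volume):
    """Volume of the atom's VDW sphere divided by its Voronoi volume."""
    vdw_volume = 4.0 / 3.0 * math.pi * vdw_radius ** 3
    return vdw_volume / voronoi_volume
```

A value near 1 would mean the sphere fills almost the whole polyhedron; smaller values mean more empty space is allocated to the atom.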

The Voronoi volume of an atom is a weighted sum of the distances to all its neighbors, where the weighting factor is the contact (face) area with each neighbor.

Voronoi polyhedra are used in a wide variety of disciplines

Nearest neighbor problems. The nearest neighbor of a query point is the center of the Voronoi cell in which it resides
Largest empty circle in a collection of points has center at a Voronoi vertex
Voronoi volume of "something" often is a useful weighting factor. This fact can be used, for instance, to weight sequences in an alignment to correct for over- or under-representation

Dual of a Voronoi diagram is a Delaunay triangulation

Connect all centers with lines (which are perpendicular bisectors to edges)
Border of D.T. is Convex Hull
D.T. produces "fattest" possible triangles, which makes it convenient for things such as finite element analysis.

Put all points in system on grid.

Go to each grid-point (i.e. voxel) and add its volume to the atom center to which it is closest.

Make this faster by randomly sampling grid-points.

Useful approach for high-dimensional integration -- 20D Voronoi diagrams.
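The random-sampling variant of the grid method might look like the sketch below. It assumes a plain rectangular box with no periodic images, so cells at the edge are clipped by the box; the function name and arguments are made up for the example.

```python
import random

def voronoi_volumes_by_sampling(centers, box, n_samples=100000, seed=0):
    """centers: list of (x, y, z). box: ((xmin, xmax), (ymin, ymax), (zmin, zmax)).

    Returns an estimated volume per center: the fraction of random points
    closest to that center, times the box volume.
    """
    rng = random.Random(seed)
    counts = [0] * len(centers)
    for _ in range(n_samples):
        p = [rng.uniform(lo, hi) for lo, hi in box]
        # Assign the sample point to the closest atom center.
        nearest = min(
            range(len(centers)),
            key=lambda i: sum((p[k] - centers[i][k]) ** 2 for k in range(3)),
        )
        counts[nearest] += 1
    box_volume = 1.0
    for lo, hi in box:
        box_volume *= hi - lo
    return [box_volume * c / n_samples for c in counts]
```

The estimates sum to the box volume by construction, and the statistical error shrinks as one over the square root of the number of samples, independent of dimension.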

Find all the vertices associated with an atom.

Each polyhedron vertex sits at the center of a sphere that passes through 4 atom centers. So using 4 sets of atom center coordinates (x,y,z), solve for the four unknowns (a, b, c, r) in
(x-a)^2 + (y-b)^2 + (z-c)^2 = r^2 .
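One way to solve this system, sketched under the assumption that the four centers are not coplanar: subtracting the first sphere equation from the other three cancels the quadratic terms and leaves three linear equations in (a, b, c), which can be solved by Cramer's rule. The helper names are hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def circumsphere(p0, p1, p2, p3):
    """Return ((a, b, c), r) for the sphere through four points."""
    n0 = sum(x * x for x in p0)
    rows, rhs = [], []
    for p in (p1, p2, p3):
        # 2 (p - p0) . center = |p|^2 - |p0|^2
        rows.append([2.0 * (p[k] - p0[k]) for k in range(3)])
        rhs.append(sum(x * x for x in p) - n0)
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("points are (nearly) coplanar")
    center = []
    for col in range(3):
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = rhs[r]
        center.append(det3(m) / d)
    radius = math.sqrt(sum((center[k] - p0[k]) ** 2 for k in range(3)))
    return tuple(center), radius
```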

Label each vertex by the indices of the 4 atoms to which it is associated.

Central atom is atom 0
Each neighboring atom has an index number (i=1,2,3...)
Planes are denoted by the indices of the 2 atoms that form them (01)
Lines are denoted by the indices of 3 atoms (012)
Vertices are denoted by 4 indices (0123)

Want to traverse the vertices associated with a particular atom center (atom 0) to find the volume of its Voronoi polyhedron.

Pick all the vertices that share two common atoms -- atom 0 and another atom, atom 1. These vertices form the edges around a face. Pick an arbitrary vertex on the edge to start (e.g. vertex 012) and walk around the perimeter of the face. You can tell which vertices are neighboring on the perimeter because they will have a third atom in common (in addition to atom 0 and atom 1). With reference to the starting vertex the face can be divided into triangles, for which it is trivial to calculate areas and volumes.

The total area of the face comes from summing all its triangular areas. The volume of the pyramid from atom 0 to the face is calculated from the usual formula Ad/3, where A is the area of the face and d is distance to the face (half the distance between atom 0 and atom 1).
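A sketch of the area and pyramid-volume step, assuming the face vertices have already been put in perimeter order by the walk described above and that the face bisects the line between atom 0 and atom 1 (the conventional Voronoi case). Function names are illustrative only.

```python
import math

def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def face_area(vertices):
    """Area of a planar face whose vertices are listed in perimeter order."""
    v0 = vertices[0]
    total = [0.0, 0.0, 0.0]
    # Fan of triangles from the starting vertex; each cross product is
    # twice a triangle's area vector.
    for a, b in zip(vertices[1:], vertices[2:]):
        c = cross(sub(a, v0), sub(b, v0))
        total = [total[k] + c[k] for k in range(3)]
    return 0.5 * math.sqrt(sum(t * t for t in total))

def pyramid_volume(atom0, atom1, vertices):
    """Volume A*d/3 of the pyramid from atom 0 to its face with atom 1."""
    d = 0.5 * math.sqrt(sum(x * x for x in sub(atom1, atom0)))
    return face_area(vertices) * d / 3.0
```

Summing pyramid_volume over all faces of atom 0 gives the polyhedron volume. For example, a cube of side 2 centered on atom 0 is six pyramids of volume 4/3 each.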

This sequential walking procedure also gives you a way to draw polyhedra on a graphics device.

In 2D the triangular area of two points P & Q and the origin is:

      A = 0.5   | Px Py |
                | Qx Qy |

In 3D the (oriented) volume of a tetrahedron formed by three points P, Q, & R and the origin is:

              P . Q x R     1  | Px Py Pz |
      V =     ---------  =  -  | Qx Qy Qz | 
                  6         6  | Rx Ry Rz |
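The two determinant formulas translate directly into code. This is a plain transcription with hypothetical function names; both results are signed, so swapping two of the points flips the sign.

```python
def triangle_area_2d(p, q):
    """Signed area of the triangle (origin, P, Q) in 2D."""
    return 0.5 * (p[0] * q[1] - p[1] * q[0])

def tetrahedron_volume(p, q, r):
    """Signed volume of the tetrahedron (origin, P, Q, R): P . (Q x R) / 6."""
    qxr = (q[1] * r[2] - q[2] * r[1],
           q[2] * r[0] - q[0] * r[2],
           q[0] * r[1] - q[1] * r[0])
    return (p[0] * qxr[0] + p[1] * qxr[1] + p[2] * qxr[2]) / 6.0
```

To use these with a point other than the origin as the apex (e.g. atom 0), subtract that point's coordinates from P, Q, and R first.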

If some of the neighbors around an atom are missing, the constructed polyhedron will be far too large to be physically "reasonable" and will allocate "too much" space to the atom. This is the problem with the protein surface, where one often does not have enough neighboring water atoms to construct reasonable polyhedra.

Furthermore, when some neighbors are missing, even if an apparently reasonable polyhedron can be constructed, it will often have a very "pointy" or distended shape.

Solution:

Create artificial shell of water positions around the protein.
Use molecular simulation to realistically position waters around the protein.

In conventional Voronoi procedure, face is placed halfway between atoms.

However, as atoms are not the same size (N, C, O), bisection is not really "fair." Volume is misallocated to the smaller atom.

This unfair division is evident in the calculation of "average" atomic volumes over many crystal structures. An "unfair" allocation will give a high S.D. about the mean.

Position the dividing plane between atoms in proportion to known atom radii.

Place dividing plane farther from large atom according to the following formula:

        R
 S =  ----- D  which is the same as  R s = S r
      R + r

where R is radius of the large atom and S is distance of the plane from the large atom, r and s are the analogous quantities for the small atom, and D=s+S.
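A small numerical check of this formula. The function name is hypothetical, and the radii in the usage note are arbitrary illustration values, not a recommended atom typing.

```python
def plane_distances(big_radius, small_radius, separation):
    """Return (S, s): plane distance from the large and from the small atom.

    S = R / (R + r) * D and s = D - S, so that R * s == S * r.
    """
    big_s = big_radius / (big_radius + small_radius) * separation
    return big_s, separation - big_s
```

With R = 2, r = 1, and D = 3 the plane sits at S = 2 from the large atom and s = 1 from the small one. With equal radii the formula reduces to ordinary bisection.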

Vertices of polyhedra now are no longer at center of sphere formed by 4 atoms.

Find the vectors from the atom for which we are calculating a volume (the central atom) to each of its neighbors. Position a plane perpendicular to each vector according to the predefined proportion. (As above)

Create 4 "neighbors" far from the central atom, arrange them so that the intersection of their planes forms a large tetrahedron. Use this as the initial polyhedron for the atom.

Cycle through all the neighbors of the central atom.

For each neighbor, see whether any vertices of the "current polyhedron" are on a different side of its plane than the central atom. If so, discard these vertices and recompute the polyhedron using the current neighbor's plane to chop it down. (See below)

When you are finished "chopping down the polyhedron" you have a list of vertices, which you can use as in the original Voronoi procedure to calculate a polyhedron volume. (As above)

Sort all neighboring atoms by distance from central atom and go through them one-by-one chopping down the polyhedron as you go (starting with the big tetrahedron).

Say vertex 0214 is outside of the plane formed by neighbor 6. Need to delete 0214 from the list of vertices and recompute the new vertices formed. (Remember labeling conventions. Central atom is numbered 0.)

These new vertices are formed by the intersection of 3 lines (021, 024, and 014) with plane 06. So we add the new vertices 0216, 0246, and 0146 to the new polyhedron.

However, there is a snag: a line may itself lie entirely outside the new plane, which happens when the vertices at both of its ends are deleted. To detect this, when we delete a vertex we push all the lines forming it (e.g. 021, 024, 014) onto a secondary list. Then when we delete another vertex, we check whether any of its lines is already on that list. If so, we do not intersect this line with the new plane.
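The label bookkeeping for one chopping step can be sketched without any geometry. The version below is a slight variation on the secondary list: it counts how many deleted vertices each line touches and keeps only lines touched once. It assumes non-degenerate vertices (exactly 4 indices each, central atom 0), and the function name is made up for the example.

```python
from itertools import combinations

def new_vertex_labels(deleted_vertices, new_plane):
    """Labels of the vertices created when plane (0, new_plane) cuts the polyhedron.

    deleted_vertices: iterable of 4-tuples of atom indices, each containing 0.
    A line (0, i, j) touched by exactly one deleted vertex crosses the new
    plane; a line touched by two deleted vertices is wholly outside.
    """
    hits = {}
    for vertex in deleted_vertices:
        others = sorted(i for i in vertex if i != 0)
        for pair in combinations(others, 2):
            line = (0,) + pair
            hits[line] = hits.get(line, 0) + 1
    return sorted(line + (new_plane,) for line, n in hits.items() if n == 1)
```

For the example above, deleting vertex 0214 with neighbor 6 yields the labels (0,1,2,6), (0,1,4,6), and (0,2,4,6), which are the vertices 0216, 0146, and 0246 with their indices sorted.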

A plane is defined by the vector from the central atom to the neighboring atom v and a constant K so that for any point u on the plane: v . u = K

If v . u > K, u is on the wrong side of the plane, otherwise it is on the right side.

A vertex is at a point u where

v1 . u = K1  &  v2 . u = K2  &  v3 . u = K3

     |K1 v1y v1z|    / |v1x v1y v1z|
ux = |K2 v2y v2z|   /  |v2x v2y v2z|
     |K3 v3y v3z|  /   |v3x v3y v3z|

Similar equations apply for uy and uz
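The side test and the Cramer's-rule solution for a vertex can be written out as follows. This is a minimal sketch with hypothetical names; it assumes the central atom is at the origin and each plane is given as a pair (v, K).

```python
def det3(m):
    """Determinant of a 3x3 matrix given as a list of rows."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def is_outside(v, k, u):
    """True if point u is on the wrong side of plane v . u = K."""
    return v[0] * u[0] + v[1] * u[1] + v[2] * u[2] > k

def vertex_from_planes(planes):
    """planes: three (v, K) pairs. Returns the point u with v_i . u = K_i."""
    rows = [list(v) for v, _ in planes]
    ks = [k for _, k in planes]
    d = det3(rows)
    if abs(d) < 1e-12:
        raise ValueError("planes do not meet in a single point")
    u = []
    for col in range(3):
        # Replace one column of the coefficient matrix by the constants K.
        m = [row[:] for row in rows]
        for r in range(3):
            m[r][col] = ks[r]
        u.append(det3(m) / d)
    return tuple(u)
```

For instance, the planes x = 1, y = 2, and z = 3 meet at the point (1, 2, 3).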

Polyhedra no longer partition all of space. The vertices of the polyhedron around a particular atom are not EXACTLY shared by the polyhedron around a neighboring atom.

Vertices are replaced by tiny "error tetrahedrons"
Typically these are very small, less than 1 part in 500, but they nevertheless spoil the mathematical purity of the procedure

One needs a priori sizes for atoms -- an atom typing with radii

Radical-plane method positions plane according to atom radii and has no error tetrahedrons. Unfortunately, for various reasons the plane-positioning is not as physically reasonable as method B.

Still divide space between atoms according to the natural relation R s = S r (as above). However, don't use planes but rather curved surfaces as boundaries.

Type    Volume (Å^3)   S.D.
-CH2-   23.681   2.037  
-CH3    36.673   3.021  
-CH=    21.126   1.938  
-O      15.972   1.909  
-OH     17.239   1.792  
=O      16.813   2.101  
-NH-    15.643   1.911  
-NH2    23.380   2.304  
-NH3    19.976   1.562  
-S-     19.349   8.296  
-SH     37.801   4.105  
>C=     10.275   0.729  
>CH-    14.570   1.217  
C        9.247   0.668  
CA      13.412   1.051  
N       13.885   1.133  
O       15.839   1.394  
Water  ~30

Accessible Surface

Lee, B. & Richards, F. M. (1971). The Interpretation of Protein Structures: Estimation of Static Accessibility. J. Mol. Biol. 55, 379-400.
Shrake, A. & Rupley, J. A. (1973). J. Mol. Biol. 79, 351.

Molecular Surface

Richards, F. M. (1977). Areas, Volumes, Packing, and Protein Structure. Ann. Rev. Biophys. Bioeng. 6, 151-76.
Connolly, M. (1983). Solvent-accessible Surfaces of Proteins and Nucleic Acids. Science 221, 709-713.

Hydration Surface plus critique of molecular surface

Gerstein, M. & Lynden-Bell, R. M. (1993). What is the natural boundary for a protein in solution? J. Mol. Biol. 230, 641-650.

General Information about Voronoi Polyhedra

J O'Rourke. Computational Geometry in C. Cambridge U.P.

Application of Voronoi Procedure to Proteins (Method B)

Richards, F. M. (1974). The Interpretation of Protein Structures: Total Volume, Group Volume Distributions and Packing Density. J. Mol. Biol. 82, 1-14.
Richards, F. M. (1985). Calculation of Molecular Volumes and Areas for Structures of Known Geometry. Methods in Enzymology 115, 440-464.

Critique of Method B

Gellatly, B. J. & Finney, J. L. (1982). Calculation of Protein Volumes: An Alternative to the Voronoi Procedure. J. Mol. Biol. 161, 305-322.
Gerstein, M., Tsai, J. & Levitt, M. (1995). The volume of atoms on the protein surface: Calculated from simulation, using Voronoi polyhedra. J. Mol. Biol. 249 (in press).