Surface Area
- Why calculate?
- Types: A.S., M.S., H.S.
- How calculate A.S.
Volume
- Why calculate?
- Voronoi construction for calculating volumes
- How to do this?
- Problems applying it to proteins
- Richard's solution
- Problems with his solution and ways of dealing with this
Important Themes
- Nitty gritty of how to do calculations in 3D. How to calculate with
planes, lines, and so forth. This is generally useful.
- Case study of issues in taking off-shelf "CS methods" and
applying them to biology
Why calculate?
- Protein is solid object. Surface is where action takes place.
- Surface useful for docking and drug-design
- Hydrophobic energy proportional to surface area
Various Types of Protein Surfaces
- Accessible Surface
- Molecular Surface
- Hydration Surface
Roll sphere (water) on surface and look at
locus of sphere centers.
Usually represented as a dot surface
Not smooth and continuously differentiable (relevant for energy calculations)
Lee & Richards algorithm (first method, 1970)
- Pick an arbitrary direction from which to view the protein.
Slice it into many sections perpendicular to this direction.
- In each section, cycle over all the atoms. Each atom is represented as a sphere with
a radius that is the sum of its VDW radius plus that
of a probe solvent -- i.e. 1.4 for water.
- For each atom determine the circle corresponding to the
intersection of this sphere with the sectioning plane. Remove all
parts (i.e. arcs) of this circle occluded by the circles of other
atoms.
- Multiply the total amount of non-occluded arc length by the
sectioning width to get the surface area for atom. Sum over all atoms
and all sections to get total area.
Shrake & Rupley algorithm (easier)
- Surround each atom with sphere
of uniformly spaced dots (e.g. 92).
- Remove dots contained in other atoms spheres. Total number
of remaining dots is accessible surface.
Problem with accessible surface: it has
sharp cusps.
Solution: the smooth molecular surface.
M.S. = contact surface + re-entrant surface
- C.S. = points of tangency between probe sphere and protein when probe sphere is
only touching one atom
- R.S. = solid angle of probe sphere when tangent to two protein atoms
First proposed by Richards, but hard to calculate. First numeric calc. by Connelly. Later
analytic calculation by Connelly.
Analytic version is continuously differentiable.
Problem with molecular surface:
Water does not really roll on protein surface like a sphere.
This is treating water chemically like liquid argon.
Solution: calculate actual locus of real water positions in a simulation
and fit a surface through the mean position of the second hydration shell.
Why Calculate?
- Protein interiors are tightly packed, fitting together like a jig-saw
puzzle.
- Because of tight packing
the various types of protein residues and atoms occupy well-defined amounts
of space.
- Tight packing is a driving force in ligand binding and is essential in the specificity of
various recognition processes (e.g. antibodies and antigens)
- Protein packing is interesting because
other molecules have very different types of packing. For instance, water
structure is dominated by H-bonding rather than tight packing.
In 1908 Voronoi found a way of partitioning all space amongst a
collection of points using specially constructed polyhedra. Here we
refer to a collection of "atom centers" rather than "points."
In 3D, each atom is surrounded by a unique limiting polyhedron such that
all points within an atom's polyhedron are closer to this atom than all other atoms.
Likewise, points equidistant from 2 atoms form planes (lines in 2D). Those
equidistant from 3 atoms form lines, and those
equidistant form 4 centers form vertices.
If Voronoi polyhedra are constructed around atoms in a periodic system,
such as in a crystal, all the volume in the unit cell will be
apportioned to the atoms. There will be no gaps or cavities as there
would be if one, for instance, simply drew spheres around the atoms.
An atom's packing efficiency is the volume
of its VDW sphere divided by its Voronoi volume.
Voronoi volume of an atom is a weighted average of distances to all its neighbors,
where the weighting factor is the contact area with the neighbor.
Voronoi polyhedra are used a wide variety of disciplines
- Nearest neighbor problems. The nearest neighbor of a query point is
center of the Voronoi diagram in which it resides
- Largest empty circle in a collection of points
has center at a Voronoi vertex
- Voronoi volume of "something" often is a useful weighting factor. This fact
can be used, for instance, to weight sequences in alignment to correct for
over or under-representation
Dual of a Voronoi diagram is a Delaunay triangulation
- Connect all centers with lines (which are perpen. bisectors to edges)
- Border of D.T. is Convex Hull
- D.T. produces "fatest" possible triangles which makes it
convenient for things such as finite element analysis.
Put all points in system on grid.
Go to each grid-point (i.e. voxel) and add its volume to
the atom center to which it is closest.
Make this faster by
randomly sampling grid-points.
Useful approach for high-dimensional integration -- 20D
Voronoi diagrams.
Find all the vertices associated with an atom.
- Each polyhedra vertex sits at the center of sphere that includes
4 atom centers. So using 4 sets of atom center coordinates
(x,y,z), solve for the four unknowns
in
(x-a)^2 + (y-b)^2 + (z-c)^2 = r^2 .
Label each vertex by the indices of the 4 atoms to which it is associated.
- Central atom is atom 0
- Each neighboring atom has an index number (i=1,2,3...)
- Planes are denoted by the indices of the 2
atoms that form them (01)
- Lines are denoted by the indices of 3 atoms (012)
- Vertices are denoted by 4 indices (0123)
Want to traverse the vertices associated with a particular
atom center (atom 0) to find the volume of its Voronoi polyhedron.
Pick all the vertices that share two common atoms -- atom 0 and
another atom, atom 1. These vertices form the edges around a face.
Pick an arbitary vertex on the edge to start (e.g. vertex 012) and
walk around the perimeter of the face. You can tell which vertices are
neighboring on the perimeter because they will have a third atom in
common (in addition to atom 0 and atom 1). With reference to the
starting vertex the face can be divided into triangles, for which it
is trivial to calculate areas and volumes.
The total area of the face comes from summing all its triangular
areas. The volume of the pyramid from atom 0 to the face
is calculated from the usual formula Ad/3, where A is the area of the face and
d is distance to the face (half the distance between atom 0 and atom 1).
This sequential walking procedure also gives you a way to draw
polyhedra on a graphics device.
In 2D the triangular area of two points P & Q and the origin is:
A = 0.5 | Px Py |
| Qx Qy |
In 3D the (oriented) volume of a tetrahedron formed by three points P, Q, & R and
the origin is:
P . Q x R 1 | Px Py Pz |
V = --------- = - | Qx Qy Qz |
6 6 | Rx Ry Rz |
If some of the neighbors around an atom are missing,
the constructed polyhedron will be far too large to be physically
"reasonable" and will allocate "too much" space to the atom.
This is the problem with the protein surface, where one often does not have enough neighboring water atoms to construct reasonable polyhedra.
Furthermore, when some neighbors are missing, even if an apparently reasonable polyhedron can be constructed, it will often have a very "pointy" or distended shape.
Solution:
- Create artificial shell of water positions around the protein.
- Use molecular simulation to realistically position a waters
around the protein.
In conventional Voronoi procedure, face is placed halfway between
atoms.
However, as atoms are not the same size (N, C, O), bisection is not
really "fair." Volume is misallocated to the smaller atom.
This unfair division is evident in the calculation of
"average" atomic volumes over many crystal structures. An "unfair"
allocation will give a high S.D. about the mean.
Position the dividing plane between atoms in proportion to known atom radii.
Place dividing plane farther from
large atom according to the following formula:
R
S = ----- D which is the same as R s = S r
R + r
where
R is radius of the large atom and S is distance of the plane from the large atom,
r and s are the analogous quantities for the small atom, and D=s+S.
Vertices of polyhedra now are no longer at center of sphere
formed by 4 atoms.
Find the vectors from the atom for which we are calculating a
volume (the central atom) to each of its neigbours.
Position a plane perpendicular to each
vector according the
predefined proportion. (As above)
Create 4 "neighbors" far from the central atom, arrange them
so that the intersection of their planes forms a large tetrahedron. Use this
as the initial polyhedron for the atom.
Cycle through all the neighbors of the central atom.
For each neighbor, see whether any vertices of the "current polyhedron"
are on a different side of its plane than the central atom. If so,
discard these vertices and recompute the polyhedron using the current neighbor's
plane to chop it down. (See below)
When you are finished "chopping down the polyhedron" you have a list
of vertices, which you can use as in the original Voronoi procedure to
calculate a polyhedron volume. (As above)
Sort all neighboring atoms
by distance from central atom and go through them one-by-one chopping
down the polyhedron as you go (starting with the big tetrahedron).
Say vertex 0214 is outside of the plane formed by neighbor 6. Need to delete
0214 from the list of vertices and recompute the
new vertices formed. (Remember labeling conventions.
Central atom is numbered 0.)
These new vertices
are formed by the intersection of 3 lines (021, 024, and 014) with plane 06.
So we add the new
vertices 0216, 0246, and 0146 to the new polyhedron.
However, there is a snag: need to check whether any of the 3 lines
are not also outside of the plane. To do this, when we delete a vertex
we push all the lines forming it (e.g. 021, 024, 014) onto a secondary
list. Then when we delete another vertex, we check whether any of its
lines have already been deleted. If so, we do not intersect this line
with the new plane.
A plane is defined by the vector from the central atom
to the neighboring atom v and a constant K so that for any
point u on the plane: v . u = K
If v . u > K, u is on the wrong side
of the plane, otherwise it is on the right side.
A vertex is at a point u where
v1 . u = K1 & v2 . u = K2 & v3 . u = K3
|K1 v1y v1z| / |v1x v1y v1z|
ux = |K2 v2y v2z| / |v2x v2y v2z|
|K3 v3y v3z| / |v3x v3y v3z|
Similar equations apply for uy and uz
Polyhedra no longer partition all of space. The vertices of the
polyhedron around a particular atom are not EXACTLY shared by the polyhedron
around a neighboring atom.
- Vertices are replaced by tiny "error tetrahedrons"
- Typically these are very small, less than 1 part in 500, but
they nevertheless spoil the mathematical purity of the procedure
One needs apriori sizes for atoms -- an atom typing with radii
Radical-plane method positions plane according to atom radii and
has no error tetrahedrons. Unfortunately, for various reasons the
plane-positioning is not as physically reasonable as method B.
Still divide space between atoms according to natural relation R d
= D r. However, don't use planes but rather curved surfaces as
boundaries.
-CH2- 23.681 2.037
-CH3 36.673 3.021
-CH= 21.126 1.938
-O 15.972 1.909
-OH 17.239 1.792
=O 16.813 2.101
-NH- 15.643 1.911
-NH2 23.380 2.304
-NH3 19.976 1.562
-S- 19.349 8.296
-SH 37.801 4.105
>C= 10.275 0.729
>CH- 14.570 1.217
C 9.247 0.668
CA 13.412 1.051
N 13.885 1.133
O 15.839 1.394
Water ~30
Accessible Surface
- Lee, B. & Richards, F. M. (1971). The Interpretation of Protein Structures: Estimation of Static Accessibility. J. Mol. Biol. 55, 379-400.
- Shrake, A. & Rupley, J. A. (1973). J. Mol. Biol. 79, 351.
Molecular Surface
- Richards, F. M. (1977). Areas, Volumes, Packing, and Protein Structure. Ann. Rev. Biophys. Bioeng. 6, 151-76.
- Connolly, M. (1983). Solvent-accessible Surfaces of Proteins and Nucleic Acids. Science 221, 709-713.
Hydration Surface plus critique of molecular surface
- Gerstein, M. & Lynden-Bell, R. M. (1993). What is the natural boundary for a protein in solution? J. Mol. Biol. 230, 641-650.
General Information about Voronoi Polyhedra
- J O'Rourke. Computational Geometry in C. Cambridge U.P.
Application of Voronoi Procedure to Proteins (Method B)
- Richards, F. M. (1974). The Interpretation of Protein Structures: Total Volume, Group Volume Distributions and Packing Density. J. Mol. Biol. 82, 1-14.
- Richards, F. M. (1985). Calculation of Molecular Volumes and Areas for Structures ofKnown Geometry. Methods in Enzymology 115, 440-464.
Critique of Method B
-
Gellatly, B. J. & Finney, J. L. (1982). Calculation of Protein Volumes: An Alternative to the Voronoi Procedure. J. Mol. Biol. 161, 305-322.
- Gerstein, M., Tsai, J. & Levitt, M. (1995). The volume of atoms on
the protein surface: Calculated from simulation, using Voronoi
polyhedra. J. Mol. Biol. 249 (in press).