- Why calculate?
- Types: A.S., M.S., H.S.
- How calculate A.S.
- Why calculate?
- Voronoi construction for calculating volumes
- How to do this?
- Problems applying it to proteins
- Richards' solution
- Problems with his solution and ways of dealing with this
- Nitty gritty of how to do calculations in 3D. How to calculate with
planes, lines, and so forth. This is generally useful.
- Case study of issues in taking off-shelf mathematical methods and
applying them to proteins.
Various Types of Protein Surfaces
- Protein is solid object. Surface is where action takes place.
- Surface useful for docking and drug-design
- Hydrophobic energy proportional to surface area
Roll sphere (water) on surface and look at locus of sphere
Usually represented as a dot surface
Not smooth and continuously differentiable (relevant for energy
calculations). It has sharp cusps.
Lee & Richards algorithm (first method, 1970)
- Accessible Surface
- Molecular Surface
- Hydration Surface
Shrake & Rupley algorithm (easier)
- Pick an arbitrary direction from which to view the protein. Slice
it into many sections perpendicular to this direction.
- In each section, cycle over all the atoms. Each atom is
represented as a sphere with a radius that is the sum of its VDW
radius plus that of a probe solvent -- i.e. 1.4 for water.
- For each atom determine the circle corresponding to the
intersection of this sphere with the sectioning plane. Remove all
parts (i.e. arcs) of this circle occluded by the circles of other
- Multiply the total amount of non-occluded arc length by the
sectioning width to get the surface area for atom. Sum over all atoms
and all sections to get total area.
Solution: the smooth molecular surface.
M.S. = contact surface + re-entrant surface
- Surround each atom with sphere
of uniformly spaced dots (e.g. 92).
- Remove dots contained in other atoms spheres. Total number
of remaining dots is accessible surface.
First proposed by Richards, but hard to calculate. First numeric calc. by Connelly. Later
analytic calculation by Connelly.
Analytic version is continuously differentiable.
Problem with molecular surface:
Water does not really roll on protein surface like a sphere.
This is treating water chemically like liquid argon.
Solution: calculate actual locus of real water positions in a simulation
and fit a surface through the mean position of the second hydration shell.
- C.S. = points of tangency between probe sphere and protein when probe sphere is
only touching one atom
- R.S. = solid angle of probe sphere when tangent to two protein atoms
In 1908 Voronoi found a way of partitioning all space amongst a
collection of points using specially constructed polyhedra. Here we
refer to a collection of "atom centers" rather than "points."
In 3D, each atom is surrounded by a unique limiting polyhedron
such that all points within an atom's polyhedron are closer to this
atom than all other atoms.
Likewise, points equidistant from 2 atoms form planes (lines in
2D). Those equidistant from 3 atoms form lines, and those equidistant
form 4 centers form vertices.
If Voronoi polyhedra are constructed around atoms in a periodic
system, such as in a crystal, all the volume in the unit cell will be
apportioned to the atoms. There will be no gaps or cavities as there
would be if one, for instance, simply drew spheres around the atoms.
Voronoi volume of an atom is a weighted average of distances to
all its neighbors, where the weighting factor is the contact area with
Packing efficiency is ratio of volume of VDW envelope to volume of
space occupied (volume of voronoi cell).
Voronoi polyhedra are used a wide variety of disciplines
- Measuring packing efficiency is equivalent to quantifying
Lennard-Jones energy (U = A/r^12 - B/r^6). L-J interactions, which
are also known as Van der Waals (VDW) interactions, are the most basic
type of interaction in molecular systems.
- It is observed that protein interiors are tightly packed, fitting
together like a jig-saw puzzle. This essentially implies that VDW
interactions are very important in structuring the protein interior.
- Because of tight packing the various types of protein residues and
atoms occupy well-defined amounts of space.
- Tight packing is a driving force in ligand binding and is
essential in the specificity of various recognition processes
(e.g. antibodies and antigens)
- Protein packing is interesting because other substances have very
different types of packing. For instance, water structure is dominated
by H-bonding rather than tight packing.
Dual of a Voronoi diagram is a Delaunay triangulation
- Nearest neighbor problems. The nearest neighbor of a query point is
center of the Voronoi diagram in which it resides
- Largest empty circle in a collection of points
has center at a Voronoi vertex
- Voronoi volume of "something" often is a useful weighting factor. This fact
can be used, for instance, to weight sequences in alignment to correct for
over or under-representation
Put all points in system on grid.
Go to each grid-point (i.e. voxel) and add its volume to
the atom center to which it is closest.
Make this faster by
randomly sampling grid-points.
Useful approach for high-dimensional integration -- 20D
Find all the vertices associated with an atom.
- Connect all centers with lines (which are perpen. bisectors to edges)
- Border of D.T. is Convex Hull
- D.T. produces "fatest" possible triangles which makes it
convenient for things such as finite element analysis.
Label each vertex by the indices of the 4 atoms to which it is associated.
- Each polyhedra vertex sits at the center of sphere that goes
through 4 atom centers (with no other atom centers being closer). So
using 4 sets of atom center coordinates (x,y,z), solve for the four
(x-a)^2 + (y-b)^2 + (z-c)^2 = r^2 .
Want to traverse the vertices associated with a particular atom
center (atom 0) to find the volume of its Voronoi polyhedron.
Pick all the vertices that share two common atoms -- atom 0 and
another atom, atom 1. These vertices form the edges around a face.
Pick an arbitary vertex on the edge to start (e.g. vertex 012) and
walk around the perimeter of the face. You can tell which vertices are
neighboring on the perimeter because they will have a third atom in
common (in addition to atom 0 and atom 1). With reference to the
starting vertex the face can be divided into triangles, for which it
is trivial to calculate areas and volumes.
The total area of the face comes from summing all its triangular
areas. The volume of the pyramid from atom 0 to the face is calculated
from the usual formula Ad/3, where A is the area of the face and d is
distance to the face (half the distance between atom 0 and atom 1).
This sequential walking procedure also gives you a way to draw
polyhedra on a graphics device.
If some of the neighbors around an atom are missing, the
constructed polyhedron will be far too large to be physically
"reasonable" and will allocate "too much" space to the atom. This is
the problem with the protein surface, where one often does not have
enough neighboring water atoms to construct reasonable polyhedra.
Furthermore, when some neighbors are missing, even if an
apparently reasonable polyhedron can be constructed, it will often
have a very "pointy" or distended shape.
- Central atom is atom 0
- Each neighboring atom has an index number (i=1,2,3...)
- Planes are denoted by the indices of the 2
atoms that form them (01)
- Lines are denoted by the indices of 3 atoms (012)
- Vertices are denoted by 4 indices (0123)
In conventional Voronoi procedure, face is placed halfway between
However, as atoms are not the same size (N, C, O), bisection is not
really "fair." Volume is misallocated to the smaller atom.
This unfair division is evident in the calculation of
"average" atomic volumes over many crystal structures. An "unfair"
allocation will give a high S.D. about the mean.
Position the dividing plane between atoms in proportion to known atom radii.
Place dividing plane farther from
large atom according to the following formula:
- Create artificial shell of water positions around the protein.
- Use molecular simulation to realistically position a waters
around the protein.
S = ----- D which is the same as R s = S r
R + r
R is radius of the large atom and S is distance of the plane from the large atom,
r and s are the analogous quantities for the small atom, and D=s+S.
Vertices of polyhedra now are no longer at center of sphere
formed by 4 atoms.
Find the vectors from the atom for which we are calculating a
volume (the central atom) to each of its neigbours.
Position a plane perpendicular to each
vector according the
predefined proportion. (As above)
Create 4 "neighbors" far from the central atom, arrange them so
that the intersection of their planes forms a large tetrahedron. Use
this as the initial polyhedron for the atom.
Cycle through all the neighbors of the central atom.
For each neighbor, see whether any vertices of the "current
polyhedron" are on a different side of its plane than the central
atom. If so, discard these vertices and recompute the polyhedron using
the current neighbor's plane to chop it down. (See below)
When you are finished "chopping down the polyhedron" you have a
list of vertices, which you can use as in the original Voronoi
procedure to calculate a polyhedron volume. (As above)
Sort all neighboring atoms
by distance from central atom and go through them one-by-one chopping
down the polyhedron as you go (starting with the big tetrahedron).
Say vertex 0214 is outside of the plane formed by neighbor 6. Need
to delete 0214 from the list of vertices and recompute the new
vertices formed. (Remember labeling
conventions. Central atom is numbered 0.)
These new vertices are formed by the intersection of 3 lines (021,
024, and 014) with plane 06. So we add the new vertices 0216, 0246,
and 0146 to the new polyhedron.
However, there is a snag: need to check whether any of the 3 lines
are not also outside of the plane. To do this, when we delete a vertex
we push all the lines forming it (e.g. 021, 024, 014) onto a secondary
list. Then when we delete another vertex, we check whether any of its
lines have already been deleted. If so, we do not intersect this line
with the new plane.
Polyhedra no longer partition all of space. The vertices of the
polyhedron around a particular atom are not EXACTLY shared by the polyhedron
around a neighboring atom.
One needs apriori sizes for atoms -- an atom typing with radii
Radical-plane method positions plane according to atom radii and
has no error tetrahedrons. Unfortunately, for various reasons the
plane-positioning is not as physically reasonable as method B.
Still divide space between atoms according to natural relation R d
= D r. However, don't use planes but rather curved surfaces as
- Vertices are replaced by tiny "error tetrahedrons"
- Typically these are very small, less than 1 part in 500, but
they nevertheless spoil the mathematical purity of the procedure
-CH2- 23.681 2.037
-CH3 36.673 3.021
-CH= 21.126 1.938
-O 15.972 1.909
-OH 17.239 1.792
=O 16.813 2.101
-NH- 15.643 1.911
-NH2 23.380 2.304
-NH3 19.976 1.562
-S- 19.349 8.296
-SH 37.801 4.105
>C= 10.275 0.729
>CH- 14.570 1.217
C 9.247 0.668
CA 13.412 1.051
N 13.885 1.133
O 15.839 1.394
- Lee, B. & Richards, F. M. (1971). The Interpretation of Protein Structures: Estimation of Static Accessibility. J. Mol. Biol. 55, 379-400.
- Shrake, A. & Rupley, J. A. (1973). J. Mol. Biol. 79, 351.
Hydration Surface plus critique of molecular surface
- Richards, F. M. (1977). Areas, Volumes, Packing, and Protein Structure. Ann. Rev. Biophys. Bioeng. 6, 151-76.
- Connolly, M. (1983). Solvent-accessible Surfaces of Proteins and Nucleic Acids. Science 221, 709-713.
General Information about Voronoi Polyhedra
- Gerstein, M. & Lynden-Bell, R. M. (1993). What is the natural boundary for a protein in solution? J. Mol. Biol. 230, 641-650.
Application of Voronoi Procedure to Proteins (Method B)
- J O'Rourke. Computational Geometry in C. Cambridge U.P.
Critique of Method B
- Richards, F. M. (1974). The Interpretation of Protein Structures: Total Volume, Group Volume Distributions and Packing Density. J. Mol. Biol. 82, 1-14.
- Richards, F. M. (1985). Calculation of Molecular Volumes and Areas for Structures ofKnown Geometry. Methods in Enzymology 115, 440-464.
Standard volumes and some applications
Gellatly, B. J. & Finney, J. L. (1982). Calculation of Protein Volumes: An Alternative to the Voronoi Procedure. J. Mol. Biol. 161, 305-322.
- Gerstein, M., Tsai, J. & Levitt, M. (1995). The volume of atoms on
the protein surface: Calculated from simulation, using Voronoi
polyhedra. J. Mol. Biol. 249: 955-966.
A plane is defined by the vector from the central atom
to the neighboring atom v and a constant K so that for any
point u on the plane: v . u = K
If v . u > K, u is on the wrong side
of the plane, otherwise it is on the right side.
A vertex is at a point u where
- Y Harpaz, M Gerstein & C Chothia (1994). "Volume Changes on
Protein Folding," Structure 2: 641-649.
- J Tsai, M Gerstein & M Levitt (1996). "Keeping the Shape but
Changing the Charges: A Simulation Study of Urea and its
Iso-steric Analogues," J. Chem. Phys. (in press).
v1 . u = K1 & v2 . u = K2 & v3 . u = K3
|K1 v1y v1z| / |v1x v1y v1z|
ux = |K2 v2y v2z| / |v2x v2y v2z|
|K3 v3y v3z| / |v3x v3y v3z|
Similar equations apply for uy and uz
In 2D the triangular area of two points P & Q and the origin is:
A = 0.5 | Px Py |
| Qx Qy |
In 3D the (oriented) volume of a tetrahedron formed by three points P, Q, & R and
the origin is:
P . Q x R 1 | Px Py Pz |
V = --------- = - | Qx Qy Qz |
6 6 | Rx Ry Rz |
Talk on Surfaces and Volumes /
(C) Mark Gerstein /