Different Sets of Buried Atoms in Proteins

The Voronoi procedure can only be used to calculate the volume of atoms that are surrounded by other atoms. This means that in proteins, we can calculate only the volume of atomic groups surrounded by other protein atoms, ligands, and water molecules whose positions in the interior or on the surface have been determined in the crystallographic analyses.

To answer the question of whether different regions in a protein have different packing densities, mean atomic volumes were calculated for selected sets of protein atoms. The sets of atoms for which volumes were determined are outlined in the following list.

B:
This set contains protein atoms that are buried by other protein atoms and by ligands and/or cofactors. In selecting this set, the crystallographically determined water structure is ignored; i.e. the protein atoms used are those that have zero accessible surface area (Lee & Richards, 1971; Connolly, 1983) as calculated using just the atoms in the proteins, ligands, and cofactors.
BL:
This set contains atoms that are buried as defined by the B set less those whose volumes are affected by ligands and cofactors. The set was selected by removing from set B those atoms whose volumes are different when they are calculated in the presence and absence of ligands and cofactors. The L in the name of this set indicates this extra filtering of atoms.
BLW:
This set contains atoms that are buried by other protein atoms less those whose volumes are affected by ligands and cofactors and by water molecules. The set was selected by removing from set BL those atoms whose volumes are different when they are calculated in the presence and absence of ligands, cofactors and water molecules. So, the set of volumes calculated from this set of atoms is given the label BLW.
BD:
The atoms excluded from this set are (i) all those that have surface accessible to the solvent (as in set B) and (ii) all those in contact with these surface atoms. Thus both surface atoms and those that form the first layer below the surface are removed from the calculation to leave only those that are deeply buried. Therefore, the volumes produced by this set of atoms are named BD, where the D indicates that the atoms in the resulting set are buried deep in the protein.
In our calculations, we used one extra step that affected the protein volumes but did not affect the number of protein atoms in each set. For the B and BL sets, we calculated volumes with and without the water molecules whose positions had been determined in the crystallographic analysis. A "+" sign added to a set's name indicates that we included the waters, and a "-" indicates that we did not. The reasons for carrying out these two calculations are described in the next section. (Because of how the BLW and BD sets are defined, there are no differences between the volumes of BLW+ and BLW- or of BD+ and BD-.)
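
The selection rules for the four sets can be sketched as a sequence of filters. The following is a minimal illustration only, not the procedure actually used: the `Atom` record, its field names, the tolerance `TOL`, and the toy atoms are all assumptions introduced here, and the accessible surface areas, volumes, and contact lists are taken as already computed by some other program.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set

TOL = 1e-3  # assumed tolerance (in Å^3) for treating two volumes as equal


@dataclass
class Atom:
    asa: float                    # accessible surface area, waters ignored
    vol_protein: Optional[float]  # volume with protein atoms only
    vol_ligand: Optional[float]   # volume with ligands and cofactors added
    vol_all: Optional[float]      # volume with ligands, cofactors and waters
    contacts: Set[str] = field(default_factory=set)  # ids of touching atoms


def unchanged(v1: Optional[float], v2: Optional[float]) -> bool:
    """True if both volumes are defined and agree within TOL."""
    return v1 is not None and v2 is not None and abs(v1 - v2) <= TOL


def select_sets(atoms: Dict[str, Atom]) -> Dict[str, Set[str]]:
    # Surface atoms: any solvent-accessible area at all.
    surface = {k for k, a in atoms.items() if a.asa > 0.0}
    # B: zero accessible surface area.
    b = set(atoms) - surface
    # BL: B minus atoms whose volume changes when ligands are present.
    bl = {k for k in b if unchanged(atoms[k].vol_protein, atoms[k].vol_ligand)}
    # BLW: BL minus atoms whose volume changes when waters are also present.
    blw = {k for k in bl if unchanged(atoms[k].vol_protein, atoms[k].vol_all)}
    # BD: B minus atoms that touch any surface atom (first layer removed).
    bd = {k for k in b if not (atoms[k].contacts & surface)}
    return {"B": b, "BL": bl, "BLW": blw, "BD": bd}


# Hypothetical toy structure: one surface atom "s" and four buried atoms.
toy = {
    "s": Atom(4.2, None, None, None, {"a", "b", "c"}),
    "a": Atom(0.0, 20.0, 20.0, 20.0, {"s", "d"}),
    "b": Atom(0.0, 21.0, 19.5, 19.5, {"s", "d"}),  # volume changed by a ligand
    "c": Atom(0.0, 22.0, 22.0, 21.0, {"s", "d"}),  # volume changed by a water
    "d": Atom(0.0, 23.0, 23.0, 23.0, {"a", "b", "c"}),
}
sets = select_sets(toy)
```

In this toy case the sets shrink step by step, from four atoms in B to a single atom in BD, which mirrors the progressive filtering described in the definitions.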

In the order given above, atoms in each set represent a progressively more deeply buried portion of the protein. This also means, of course, that we are selecting smaller and smaller numbers of atoms. The actual number of atoms in each set is shown in the table below. The largest set of atoms contains more than three times as many atoms as the smallest.
 

 
Protein atoms in each set

Set                     Number      %
Total protein atoms    134,689    100
B                       61,786     46
BL                      59,368     44
BLW                     43,102     32
BD                      19,510     14

Set B consists of protein atoms that are buried within the protein by other protein atoms and by ligands. Comparing the number of atoms in this set with the total number in the structures shows that the proportion of the atoms that have some access to the solvent is 100% - 46% = 54%.

Set BL contains atoms that are inaccessible to the solvent and whose volumes are not affected by ligand atoms. The proportion of atoms whose volumes are affected by ligands is small in the structures used here: 46% - 44% = 2%.

Set BLW contains atoms that are inaccessible to the solvent and whose volumes are not affected by ligand atoms or by the water molecules detected in the crystallographic analysis. The proportion of atoms whose volumes are affected by water molecules is 44%-32%=12%. Given that this 12% of atoms is inaccessible to solvent, it is perhaps surprising that they have volumes affected by the water structure. How this occurs is discussed in the section below on the role of water molecules in packing density of protein interiors.

Set BD excludes both surface atoms and those that form the first layer below the surface. This means that it contains only those that are deeply buried in the protein. In the structures considered here the number of such atoms is small: about one-seventh (14%) of the total.
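
The proportions quoted in the preceding paragraphs follow directly from the atom counts. As a quick check, the short sketch below recomputes them; the counts are those given in the table, while the variable names are introduced here purely for illustration.

```python
# Atom counts taken from the table of protein atoms in each set.
counts = {"Total": 134689, "B": 61786, "BL": 59368, "BLW": 43102, "BD": 19510}

total = counts["Total"]
pct = {name: 100.0 * n / total for name, n in counts.items()}

exposed = 100.0 - pct["B"]               # atoms with some solvent access
ligand_affected = pct["B"] - pct["BL"]   # buried atoms affected by ligands
water_affected = pct["BL"] - pct["BLW"]  # buried atoms affected by waters
largest_to_smallest = counts["B"] / counts["BD"]

for name in ("B", "BL", "BLW", "BD"):
    print(f"{name:4s} {counts[name]:7d} {pct[name]:5.1f}%")
print(f"exposed {exposed:.1f}%, ligand-affected {ligand_affected:.1f}%, "
      f"water-affected {water_affected:.1f}%")
```

Differences taken from the unrounded percentages can differ slightly from those taken from the rounded values in the text, but they agree to the nearest percent.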

The Volumes of Atomic Groups and Residues in Proteins

As described above, we carried out six different calculations for atomic volumes: on the BD and BLW sets of atoms and on the BL and B sets with water molecules (BL+ and B+) and without water molecules (BL- and B-). The mean volumes of atomic groups and residues produced by these six calculations are listed in Table 6. Data for twenty-one types of residues are given because the oxidised and reduced forms of cysteine, Cys and Cyh, are treated separately. In all, these twenty-one residues have 173 atomic groups.

In each set of calculations, the standard deviations of the mean residue volumes are between 2.4 and 4.4%, with the exception of the following small residues: Gly where the range is 4.3 to 4.8%, Cyh where it is 4.4 to 6.0%, and Ser where it is 3.9 to 4.8%.

The large majority of the mean volumes for atomic groups have standard deviations in the range of 6 to 11%. Larger values are found for certain polar groups and a few of the adjacent carbon groups. These have standard deviations in the range of 12 to 17%. There are 17 such atomic groups in set B, 13 in set BL, 5 in set BLW, and 6 in set BD.
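
The standard deviations above are quoted as percentages of the mean volume. A minimal sketch of that quantity is given below. It assumes the sample standard deviation; whether the sample or the population form was used is not stated here. The function names and the example volumes are hypothetical, and the 12% threshold is simply the lower end of the "larger values" range mentioned above.

```python
from statistics import mean, stdev


def relative_sd(volumes):
    """Sample standard deviation expressed as a percentage of the mean."""
    return 100.0 * stdev(volumes) / mean(volumes)


def is_high_spread(volumes, threshold=12.0):
    """Flag an atomic group whose relative deviation reaches the threshold."""
    return relative_sd(volumes) >= threshold


# Hypothetical volumes (Å^3) for one atomic group in several structures.
example_volumes = [20.0, 22.0, 24.0, 22.0, 22.0]
rsd = relative_sd(example_volumes)
print(f"relative SD = {rsd:.1f}%, high spread: {is_high_spread(example_volumes)}")
```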

For aliphatic and aromatic residues, the number of each of their atomic groups is high in all six sets of atoms: up to about a thousand for some groups in set B, and 80 to 200 for most groups in set BD. For polar and charged residues, the situation is more complicated. The number of examples of their mainchain and aliphatic atomic groups is high in sets B, BL, and BLW: in most cases, one to several hundred. In set BD, however, it is usually between 50 and 100. The number of polar sidechain groups tends to be small and drops sharply on going from set B to set BD. This is especially the case for Lys Nz, which drops from 62 to 6; Cyh Sg, which drops from 84 to 25; and Arg Nh1/Nh2, which drop from 187/145 to 28/21.

Differences in the residue and atomic volumes produced by ligand interactions

The B- and BL- sets (and the B+ and BL+ sets) differ in that the latter of each pair does not contain atomic groups whose volumes are affected by ligand interactions. For 19 residues the volumes given by the two sets are the same to within 0.4 Å³ and, in most cases, within 0.2 Å³. The exceptions are Cyh and His. In these residues the differences are mainly due to direct ligand interactions of the Sg and Ne2 atoms. In the absence of water, the unliganded forms of these atoms have mean volumes of 37.0 and 16.4 Å³, respectively, and the liganded forms have mean volumes of 26.9 and 12.4 Å³, respectively.
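
To put the effect of ligation in relative terms, the short sketch below computes the absolute and fractional reduction in mean volume for the two atoms. The four volumes are those quoted above; the dictionary layout and variable names are introduced here for illustration only.

```python
# Mean volumes (Å^3) in the absence of water: (unliganded, liganded).
mean_volumes = {
    "Cyh Sg": (37.0, 26.9),
    "His Ne2": (16.4, 12.4),
}

reduction = {}
for atom, (unliganded, liganded) in mean_volumes.items():
    delta = unliganded - liganded                # absolute loss in Å^3
    reduction[atom] = 100.0 * delta / unliganded  # loss as a percentage
    print(f"{atom}: {delta:.1f} Å^3 smaller ({reduction[atom]:.1f}%)")
```

On these figures both atoms lose roughly a quarter of their volume on ligation.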

Residue and atomic volumes in the different sets

The six calculations produced six values for the volume of each residue and atomic group (Table 6). To a first approximation, the different values for a given residue are very similar. The values for aliphatic residues differ by no more than 0.5 to 1.6% and those for aromatic residues by no more than 0.5 to 1.1% (if His is excluded). Polar residues (together with His) and charged residues display differences that are a little larger: 1.3 to 2.5% in the first group and 2.0 to 3.5% in the second.
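
One way to express how much the six values for a residue differ is the spread between the largest and smallest value as a percentage of their mean. The sketch below assumes that definition, which is not spelled out in the text, and uses hypothetical placeholder volumes rather than values from Table 6.

```python
from statistics import mean


def percent_spread(values):
    """Largest minus smallest value, as a percentage of the mean."""
    vals = list(values)
    return 100.0 * (max(vals) - min(vals)) / mean(vals)


# Hypothetical residue volumes (Å^3) from the six calculations.
residue_volumes = {
    "BD": 164.0, "BLW": 164.5, "BL+": 164.2,
    "B+": 164.4, "BL-": 165.6, "B-": 165.8,
}
spread = percent_spread(residue_volumes.values())
print(f"spread across the six calculations: {spread:.2f}%")
```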

Inspection of the differences in volumes for individual residues shows that, though they are small, they do tend to be systematic:

BD ≈ BLW ≈ BL+ ≈ B+ < BL- ≈ B-

The volumes given for atomic groups and residues from the BD, BLW, BL+ and B+ calculations differ in most cases by less than 1%. Only the volumes for Asn, Gln, Asp and Lys given by the BD set of atoms differ from those given by the BLW, BL+ and B+ sets by larger amounts: 1.9 to 2.7%. However, the BD set possesses very few of these residues' sidechain atoms, and these differences cannot be considered significant.