Most researchers are familiar with sequence alignment. Structure alignment is an alternative means of producing a sequence alignment of a protein (or other macromolecule, though our current server supports only proteins) that uses solved structural information (e.g., from X-ray crystallography or NMR) to align residues.
Structure alignment has the advantage over sequence alignment that the
structures involved need not actually share a similar sequence --- only
a similar structure --- to produce a valid sequence alignment. In this
way, the technique is much more robust than simple sequence alignment,
which does not use structural information in its analysis. In addition,
the technique may be used to align structures with unrelated sequences
that are suspected of having evolved through convergent evolution.
You do not actually need to have the structures on your local hard disk
if the structures have been assigned an ID code through the Brookhaven
PDB; the server can use the ID code
to auto-magically fetch these from the Brookhaven database.
Once you have selected the structures, select the appropriate chain-ID for each structure from the pull-down menu on the left-hand side of the screen.
For correct behavior, it is important that you set the correct chain ID in the structure files.
If the chain ID is left blank for a structure, the "A" chain or, if the structure lacks an "A" chain, the blank (usually, only) chain is assumed.
Note that there are some anomalous PDB files that have only a single "E" chain or other irregularities; you will need to select the correct chain identifier in these cases.
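The chain-selection rule described above can be sketched in Python. This is only an illustration of the stated rule, not the server's actual code: the function names available_chains and resolve_chain are hypothetical, and the only outside fact relied on is that the chain identifier sits in column 22 of a PDB ATOM record.

```python
def available_chains(pdb_lines):
    """Collect chain IDs from ATOM records, in order of first appearance."""
    chains = []
    for line in pdb_lines:
        if line.startswith("ATOM") and len(line) > 21:
            chain = line[21]  # column 22 of a PDB ATOM record is the chain ID
            if chain not in chains:
                chains.append(chain)
    return chains


def resolve_chain(chains, requested=""):
    """Apply the default rule: explicit choice, else "A", else the blank chain.

    Hypothetical helper; it mirrors the rule in this help text only.
    """
    requested = requested.strip()
    if requested:
        if requested not in chains:
            raise ValueError("chain %r not found in structure" % requested)
        return requested
    if "A" in chains:
        return "A"
    if " " in chains:
        return " "
    # Anomalous files (for example, a lone "E" chain) need an explicit choice.
    raise ValueError("no 'A' or blank chain; select a chain ID explicitly")
```

For a file whose only chain is "E", resolve_chain(["E"]) raises an error, while resolve_chain(["E"], "E") returns "E"; this is the case in which the chain must be picked by hand from the menu.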
Within a minute or so, an alignment should appear below this.
The top of the output is an alignment in sequence-alignment format.
Below this is rawer output:
EQUIV records give an alignment slightly stricter than the one at the top.
If you selected the options to dump coordinates, these will be headed by an "IXYZ x_new" record for the aligned coordinates, or "IXYZ x_old" for rotated, pre-alignment coordinates. This will be followed by PDB-format ATOM records, finally terminated by an "IXYZ DONE" record. The format should be self-explanatory.
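If you save the output to a file, the coordinate dumps can be separated with a few lines of Python. The sketch below assumes each "IXYZ" record is a line of its own whose second whitespace-separated field is the label (x_new, x_old, or DONE); that layout is an assumption based on the description above, and split_ixyz_blocks is a hypothetical name.

```python
def split_ixyz_blocks(lines):
    """Group PDB ATOM records by the IXYZ header that precedes them.

    Returns a dict such as {"x_new": [...], "x_old": [...]}.
    Assumes "IXYZ <label>" headers and a closing "IXYZ DONE" line.
    """
    blocks = {}
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "IXYZ":
            if fields[1] == "DONE":
                current = None  # end of the coordinate dump
            else:
                current = fields[1]
                blocks.setdefault(current, [])
        elif current is not None and line.startswith("ATOM"):
            blocks[current].append(line)
    return blocks
```

The ATOM lines in each group can then be written out unchanged as an ordinary PDB file for viewing.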
At the bottom of the output are DUMP records giving the RMS of the final
alignment (not a least-squares alignment!), the p-value from the alignment
(see Dr. Gerstein's paper), the structural p-value ("sscore-p-value"), which
is comparable in concept to a sequence p-value, and finally the
"sequence-smith-waterman-score" and "sequence-pval" produced by running a
FASTA alignment on the structures' sequences.
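To illustrate the RMS figure, here is a minimal Python sketch of a root-mean-square distance taken over aligned atom pairs exactly as they sit, with no least-squares refit beforehand. Whether the server computes its value in precisely this way is an assumption; rms_in_place and its inputs (two equal-length lists of (x, y, z) tuples) are hypothetical.

```python
import math


def rms_in_place(coords_a, coords_b):
    """RMS distance between paired coordinates, with no superposition step."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (ax, ay, az), (bx, by, bz) in zip(coords_a, coords_b):
        total += (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
    return math.sqrt(total / len(coords_a))
```

A least-squares fit would first rotate and translate one set to minimize this quantity, so a value computed in place can only be equal to or larger than the least-squares value for the same atom pairs.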
Email Werner Krebs <werner.krebs@yale.edu>
for comments on either the server or this help file.