The Original Needleman-Wunsch Algorithm |
Which Can be Easily Adapted to Structural Alignment |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Mark Gerstein, 1998 |
http://bioinfo.mbb.yale.edu/Align/align-tutorial.html |
http://bioinfo.mbb.yale.edu/course |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
M Gerstein & M Levitt (1996). “Using Iterative Dynamic Programming to Obtain |
Accurate Pairwise and Multiple Alignments of Protein Structures,” |
in Proceedings of the Fourth International Conference on Intelligent Systems |
in Molecular Biology, 59-67 (Menlo Park, CA, AAAI Press, June 12-15). |
http://hyper.stanford.edu/~mbg/Align/ismb96 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Needleman, S. B. and Wunsch, C. D. (1971). "A general method applicable to the |
search for similarities in the amino acid sequence of two proteins." J. Mol. Biol. 48: 443-453 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Step 1 -- Make a Dot Plot (Similarity Matrix) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Put 1's where characters are identical. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A |
B |
C |
N |
Y |
R |
Q |
C |
L |
C |
R |
P |
M |
|
|
|
A |
1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Y |
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
Y |
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
N |
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
|
R |
|
|
|
|
|
1 |
|
|
|
|
1 |
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
K |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
R |
|
|
|
|
|
1 |
|
|
|
|
1 |
|
|
|
|
|
B |
|
1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
P |
|
|
|
|
|
|
|
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Step 2 -- Start Computing the Sum Matrix |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
new_value_cell(R,C) <= |
cell(R,C) { Old value, either 1 or 0 } |
+ Max[ |
cell (R+1, C+1), { Diagonally Down, no gaps } |
cells(R+1, C+2 to C_max),{ Down a row, making col. gap } |
cells(R+2 to R_max, C+2) { Down a col., making row gap } |
] |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Use this Excel Formula: |
"=IF(M28="",0,1)+MAX(N45,N46:N47,O45:Y45)" |
Good Excel Formulas Inserted in Step 4 (below) |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A |
B |
C |
N |
Y |
R |
Q |
C |
L |
C |
R |
P |
M |
|
|
|
A |
1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Y |
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
Y |
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
N |
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
|
R |
|
|
|
|
|
1 |
|
|
|
|
1 |
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
K |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
R |
|
|
|
|
|
1 |
|
|
|
|
2 |
0 |
0 |
|
|
|
B |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
|
|
|
P |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Step 3 -- Keep Going |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A |
B |
C |
N |
Y |
R |
Q |
C |
L |
C |
R |
P |
M |
|
|
|
A |
1 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Y |
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
C |
|
|
1 |
|
|
|
|
1 |
|
1 |
|
|
|
|
|
|
Y |
|
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
N |
|
|
|
1 |
|
|
|
|
|
|
|
|
|
|
|
|
R |
|
|
|
|
|
5 |
4 |
3 |
3 |
2 |
2 |
0 |
0 |
|
|
|
C |
3 |
3 |
4 |
3 |
3 |
3 |
3 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
K |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
2 |
2 |
3 |
2 |
2 |
2 |
2 |
3 |
2 |
3 |
1 |
0 |
0 |
|
|
|
R |
2 |
1 |
1 |
1 |
1 |
2 |
1 |
1 |
1 |
1 |
2 |
0 |
0 |
|
|
|
B |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
|
|
|
P |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Step 4 -- Sum Matrix All Done |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Alignment Score is 8 matches. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A |
B |
C |
N |
Y |
R |
Q |
C |
L |
C |
R |
P |
M |
|
|
|
A |
8 |
7 |
6 |
6 |
5 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
Y |
7 |
7 |
6 |
6 |
6 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
6 |
6 |
7 |
6 |
5 |
4 |
4 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
Y |
6 |
6 |
6 |
5 |
6 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
N |
5 |
5 |
5 |
6 |
5 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
R |
4 |
4 |
4 |
4 |
4 |
5 |
4 |
3 |
3 |
2 |
2 |
0 |
0 |
|
|
|
C |
3 |
3 |
4 |
3 |
3 |
3 |
3 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
K |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
2 |
2 |
3 |
2 |
2 |
2 |
2 |
3 |
2 |
3 |
1 |
0 |
0 |
|
|
|
R |
2 |
1 |
1 |
1 |
1 |
2 |
1 |
1 |
1 |
1 |
2 |
0 |
0 |
|
|
|
B |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
|
|
|
P |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Step 5 -- Traceback |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A B C N Y - R Q C L C R - P M |
A Y C - Y N R - C K C R B P |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A |
B |
C |
N |
Y |
R |
Q |
C |
L |
C |
R |
P |
M |
|
|
|
A |
8 |
7 |
6 |
6 |
5 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
Y |
7 |
7 |
6 |
6 |
6 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
6 |
6 |
7 |
6 |
5 |
4 |
4 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
Y |
6 |
6 |
6 |
5 |
6 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
N |
5 |
5 |
5 |
6 |
5 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
R |
4 |
4 |
4 |
4 |
4 |
5 |
4 |
3 |
3 |
2 |
2 |
0 |
0 |
|
|
|
C |
3 |
3 |
4 |
3 |
3 |
3 |
3 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
K |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
2 |
2 |
3 |
2 |
2 |
2 |
2 |
3 |
2 |
3 |
1 |
0 |
0 |
|
|
|
R |
2 |
1 |
1 |
1 |
1 |
2 |
1 |
1 |
1 |
1 |
2 |
0 |
0 |
|
|
|
B |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
|
|
|
P |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Step 6 -- Alternate Paths |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A B C - N Y R Q C L C R - P M |
A Y C Y N - R - C K C R B P |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
A |
B |
C |
N |
Y |
R |
Q |
C |
L |
C |
R |
P |
M |
|
|
|
A |
8 |
7 |
6 |
6 |
5 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
Y |
7 |
7 |
6 |
6 |
6 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
6 |
6 |
7 |
6 |
5 |
4 |
4 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
Y |
6 |
6 |
6 |
5 |
6 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
N |
5 |
5 |
5 |
6 |
5 |
4 |
4 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
R |
4 |
4 |
4 |
4 |
4 |
5 |
4 |
3 |
3 |
2 |
2 |
0 |
0 |
|
|
|
C |
3 |
3 |
4 |
3 |
3 |
3 |
3 |
4 |
3 |
3 |
1 |
0 |
0 |
|
|
|
K |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
3 |
2 |
1 |
0 |
0 |
|
|
|
C |
2 |
2 |
3 |
2 |
2 |
2 |
2 |
3 |
2 |
3 |
1 |
0 |
0 |
|
|
|
R |
2 |
1 |
1 |
1 |
1 |
2 |
1 |
1 |
1 |
1 |
2 |
0 |
0 |
|
|
|
B |
1 |
2 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
1 |
0 |
0 |
|
|
|
P |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
0 |
1 |
0 |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|