Profiles and Motifs

- Profile: the frequency of each amino acid is estimated at each position of a multiple alignment

- Motif: a short signature pattern identified in a conserved region of a multiple alignment

- HMM: Hidden Markov Model, a generalized profile in rigorous mathematical terms
 
 

1. Profile: a position-specific scoring matrix composed of 21 columns (the 20 amino acids plus, typically, a gap-penalty column) and N rows (N = number of positions in the multiple alignment)

Values of profile:

$M(p,a) = \sum_{b=1}^{20} W(p,b) \times Y(a,b)$

Y(a,b): Dayhoff matrix score for amino acids a and b
W(p,b): weight for amino acid b at position p.
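
The sum above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: W(p,b) is taken to be the plain relative frequency of residue b in column p, and `toy_similarity` is a hypothetical identity scoring that stands in for Y(a,b), not the real Dayhoff matrix. The gap-penalty column is left out, and all function names are invented for this example.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def toy_similarity(a, b):
    """Stand-in for Y(a,b). ASSUMPTION: identity scoring, not Dayhoff."""
    return 1.0 if a == b else 0.0


def column_weights(alignment, p):
    """W(p,b), assumed here to be the frequency of b in column p (gaps ignored)."""
    column = [seq[p] for seq in alignment if seq[p] in AMINO_ACIDS]
    counts = Counter(column)
    total = len(column)
    return {b: (counts[b] / total if total else 0.0) for b in AMINO_ACIDS}


def build_profile(alignment, similarity=toy_similarity):
    """Return one row per alignment position; row[a] is M(p,a)."""
    profile = []
    for p in range(len(alignment[0])):
        w = column_weights(alignment, p)
        # M(p,a) = sum over b of W(p,b) * Y(a,b)
        row = {a: sum(w[b] * similarity(a, b) for b in AMINO_ACIDS)
               for a in AMINO_ACIDS}
        profile.append(row)
    return profile


# A made-up three-sequence alignment, four columns long.
alignment = ["ACDE", "ACDF", "AGDE"]
profile = build_profile(alignment)
```

With identity scoring, M(p,a) reduces to the column frequency of a. Passing a real substitution matrix as `similarity` would spread the score over chemically similar residues, which is the point of the Dayhoff term.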

- A profile can be used to search a sequence database (e.g. SEARCHWISE)
 
 

2. Motif:
- several proteins are grouped together by similarity searches
- they share a conserved motif
- the motif is stringent enough to retrieve the family members from the complete protein database
- PROSITE: a collection of 1135 different motifs
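
As an illustration of how such a motif can retrieve matching sequences, the sketch below translates a pattern written in PROSITE-style syntax into a Python regular expression and scans a sequence with it. It covers only a common subset of the syntax (`x` for any residue, `[..]` for allowed residues, `{..}` for excluded residues, `(n)` or `(n,m)` for repeats, `<` and `>` for terminal anchors). The function names and the test sequences are invented for this example.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern (common subset only) to a regex."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repeat count such as (3) or (2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{"):       # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        # "[ST]" and single letters are already valid regex fragments.
        if repeat:
            core += "{" + repeat + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def find_motif(pattern, sequence):
    """Return (0-based start, matched text) for non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [(m.start(), m.group()) for m in regex.finditer(sequence)]


# Pattern in the style of the PROSITE N-glycosylation site entry.
hits = find_motif("N-{P}-[ST]-{P}", "AANGSAA")
no_hits = find_motif("N-{P}-[ST]-{P}", "AANPSAA")
```

Here `hits` contains one match starting at the `N`, while `no_hits` is empty because the excluded residue `P` follows the `N`.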
 
 
 
 

3. Hidden Markov Model:
- composed of a finite number of states,
- each corresponding to a column in a multiple alignment
- each state emits symbols, according to symbol-emission probabilities

Starting from an initial state, a sequence of symbols is generated by moving from state to state, emitting one symbol per state visited, until an end state is reached.
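
The generative process just described can be sketched as follows. The model is a made-up toy: three states `M1` to `M3`, each standing for one alignment column, with invented transition and emission probabilities. The names `sample_hmm`, `START`, and `END` are assumptions of this example; real profile HMMs also include insert and delete states, which are omitted here.

```python
import random


def sample_hmm(transitions, emissions, start="START", end="END",
               rng=None, max_steps=1000):
    """Generate one symbol sequence by walking from start to end."""
    rng = rng or random.Random()
    state, symbols, path = start, [], []
    for _ in range(max_steps):
        # Choose the next state according to the transition probabilities.
        next_states = list(transitions[state])
        weights = [transitions[state][s] for s in next_states]
        state = rng.choices(next_states, weights=weights)[0]
        if state == end:
            break
        path.append(state)
        # Emit one symbol according to this state's emission probabilities.
        emit = emissions[state]
        symbols.append(rng.choices(list(emit), weights=list(emit.values()))[0])
    return "".join(symbols), path


# Toy model: all probabilities are invented for illustration.
transitions = {
    "START": {"M1": 1.0},
    "M1": {"M2": 1.0},
    "M2": {"M3": 0.8, "END": 0.2},
    "M3": {"END": 1.0},
}
emissions = {
    "M1": {"A": 0.8, "G": 0.2},
    "M2": {"C": 1.0},
    "M3": {"D": 0.6, "E": 0.4},
}

sequence, path = sample_hmm(transitions, emissions, rng=random.Random(0))
```

Each call yields a sequence of two or three residues together with the state path that produced it. When the model is used for database searches, the path is not observed, which is what makes the model "hidden".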