General Concerns in Evaluating Secondary Structure Prediction:

The question of how to compare different secondary structure prediction methods is more complex than one might first imagine. Two primary issues arise: which statistic(s) should one report, and how should one's database of proteins with known structure be selected and used?

The question of statistics is largely historical. Customarily, procedures have been evaluated by the percentage of amino acids whose secondary structure class (usually helix, coil, or sheet) is correctly predicted. This statistic is often referred to as the Q3. Recently, authors have begun to favor a correlation coefficient between predicted and observed secondary structure. This measure avoids the problem of rewarding overprediction of the more common secondary structure classes in the database, but is much less intuitive for the reader to interpret. At this time, most authors report both correlation coefficients and Q3s. Many report additional statistics, such as the average overlap of predicted secondary structure elements. The relevance of such statistics is difficult to judge.
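
The difference between the two statistics can be illustrated with a minimal Python sketch. It assumes that structures are encoded as equal-length strings over the letters H (helix), E (sheet), and C (coil), and that the correlation coefficient is the per-class Matthews coefficient; the text above does not name a specific formula, so that choice, like the function names q3 and class_correlation, is an assumption made for illustration.

```python
from math import sqrt


def q3(predicted: str, observed: str) -> float:
    """Percentage of residues whose secondary structure class is correct."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * correct / len(observed)


def class_correlation(predicted: str, observed: str, cls: str) -> float:
    """Matthews correlation for one class (assumed form of the coefficient)."""
    pairs = list(zip(predicted, observed))
    tp = sum(p == cls and o == cls for p, o in pairs)  # correctly predicted
    tn = sum(p != cls and o != cls for p, o in pairs)  # correctly rejected
    fp = sum(p == cls and o != cls for p, o in pairs)  # overpredicted
    fn = sum(p != cls and o == cls for p, o in pairs)  # underpredicted
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when the coefficient is undefined.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# A toy case: the observed structure is mostly coil, and the "method"
# simply predicts coil everywhere.
observed = "CCCCCCCCHH"
predicted = "CCCCCCCCCC"
print(q3(predicted, observed))                      # 80.0
print(class_correlation(predicted, observed, "H"))  # 0.0
```

In this toy case, overpredicting the common class earns a Q3 of 80% while the helix correlation is zero, which is the behavior the correlation coefficient is meant to expose.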

The choice of database is more complex, although general standards have emerged. A typical database contains ~120 proteins, none of which are more than 25-30% homologous to one another. Evaluation of a given method is performed by some variation on "jack-knifing," i.e. removing one protein from the database, optimizing the parameters on the remainder of the set, and then predicting the removed protein's secondary structure. This is repeated for each protein in the set. Including the protein of interest in the parameter optimization can artificially inflate the performance statistics (e.g. an improvement in Q3 for the GOR method (Garnier et al., 1996) of 5.3 percentage points, from 64.4% to 69.7% accuracy (from Holley and Karplus, 1991)). Since the goal is to accurately model secondary structures of proteins with unknown tertiary structures, this inflation means the reported figure fails to reflect the method's true prediction accuracy.
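
The jack-knife procedure described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function jackknife_q3 and the toy "majority class" predictor are hypothetical stand-ins, not any published method, and each protein is assumed to be a pair of equal-length strings (amino-acid sequence, observed secondary structure).

```python
from collections import Counter
from typing import Callable, List, Tuple

Protein = Tuple[str, str]  # (amino-acid sequence, observed secondary structure)


def jackknife_q3(
    proteins: List[Protein],
    train: Callable[[List[Protein]], object],
    predict: Callable[[object, str], str],
) -> List[float]:
    """Leave-one-out evaluation: one Q3 score per held-out protein."""
    scores = []
    for i, (sequence, observed) in enumerate(proteins):
        # Optimize parameters on every protein except the one being tested.
        training_set = proteins[:i] + proteins[i + 1:]
        model = train(training_set)
        predicted = predict(model, sequence)
        correct = sum(p == o for p, o in zip(predicted, observed))
        scores.append(100.0 * correct / len(observed))
    return scores


# Hypothetical toy method: always predict the most common class
# seen in the training set.
def train_majority(training_set: List[Protein]) -> str:
    counts = Counter(s for _, observed in training_set for s in observed)
    return counts.most_common(1)[0][0]


def predict_majority(model: str, sequence: str) -> str:
    return model * len(sequence)


proteins = [("AAAA", "HHHH"), ("GGGGGG", "CCCCCC"), ("AAGG", "HHCC")]
print(jackknife_q3(proteins, train_majority, predict_majority))
```

Passing train and predict as arguments keeps the loop independent of the method under test; the essential point is that the held-out protein never contributes to the parameters used to predict it.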
