SPI - Structure predictability index for protein sequences

Michal Brylinski, Uniwersytet Jagielloński Collegium Medicum
Leszek Konieczny, Uniwersytet Jagielloński Collegium Medicum
Irena Roterman, Uniwersytet Jagielloński Collegium Medicum

Abstract

Estimating the structure predictability of a particular protein is difficult. Many methods estimate it a posteriori, by evaluating the final, native protein structure. The SPI scale is intended to estimate the structure predictability of a particular amino acid sequence a priori. A sequence-to-structure library was created from the complete Protein Data Bank. The tetrapeptide was selected as the unit representing a well-defined structural motif. The early-stage folding structure (a model of which was presented elsewhere) was taken as the object of protein structure classification, and seven structural forms were distinguished. The degree of determinability was estimated for the sequence-to-structure and structure-to-sequence relations, which are of particular interest for threading methods. A comparative analysis of the SPI and Q7 scales against the commonly used SOV and Q3 scales is presented. The complete contingency table, supplementary materials and all the programs used are available on request. © 2005 - IOS Press and Bioinformation Systems e.V. and the authors. All rights reserved.
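
The tetrapeptide-based sequence-to-structure library and the Q7 scale mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' program. It assumes that each residue carries a one-letter code for one of the seven structural forms (the letters used here are hypothetical placeholders), that the contingency table counts co-occurrences of sequence and structure tetrapeptides in a sliding window, and that Q7 is defined like Q3, as the percentage of residues assigned to the correct class. The function names `tetrapeptide_table` and `q_score` are introduced only for this illustration. The SPI formula itself is not given in the abstract and is not reproduced here.

```python
from collections import Counter, defaultdict


def tetrapeptide_table(sequence, structure):
    """Count co-occurrences of sequence and structure tetrapeptides.

    `sequence` holds one-letter amino acid codes; `structure` holds one
    hypothetical structural-form code per residue (assumed encoding).
    Returns a mapping: sequence tetrapeptide -> Counter of structure
    tetrapeptides observed at the same window position.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    table = defaultdict(Counter)
    # Slide a window of four residues along the chain.
    for i in range(len(sequence) - 3):
        table[sequence[i:i + 4]][structure[i:i + 4]] += 1
    return table


def q_score(predicted, observed):
    """Percentage of residues whose predicted class matches the observed one.

    With three classes this is the usual Q3; with seven classes it is
    assumed here to correspond to Q7.
    """
    if len(predicted) != len(observed) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy input with placeholder structural codes, not real PDB data.
table = tetrapeptide_table("ACDEACDE", "AABBAABC")
print(dict(table["ACDE"]))        # structure motifs seen for tetrapeptide ACDE
print(q_score("AABB", "AABC"))    # three of four positions agree
```

In this toy input the tetrapeptide ACDE occurs twice with two different structure motifs, which is the kind of ambiguity a sequence-to-structure determinability measure has to quantify.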