Further evidence for the likely completeness of the library of solved single domain protein structures

Document Type


Publication Date



Recent studies questioned whether the Protein Data Bank (PDB) contains all compact, single domain protein structures. Here, we show that all quasi-spherical, QS, random protein structures devoid of secondary structure are in the PDB and are excellent templates for all native PDB proteins up to 250 residues. Because QS templates have a similar global contour as native, TASSER can refine 98% (90%) of those whose TM-score is 0.4 (0.35) to structures greater than or equal to the 0.5 TM-score threshold (0.74 (0.64) mean TM-score) for CATH/SCOP assignment. On the basis of this and the fact that, at a TM-score of 0.4, 83% (90%) of all (internal) core secondary structure elements are recovered, a 0.40 TM-score is an appropriate fold similarity assignment threshold. Despite the claims of Taylor, Trovato, and Zhou that many of their structures lack a PDB counterpart, using fr-TM-align, at a 0.45 (0.5) TM-score threshold, essentially all (most) are found in the PDB. Thus, the conclusion that the PDB is likely complete is further supported. © 2012 American Chemical Society.

Publication Source (Journal or Book title)

Journal of Physical Chemistry B

First Page


Last Page


This document is currently not available here.