TY - JOUR
T1 - Protein NMR recall, precision, and F-measure scores (RPF scores)
T2 - Structure quality assessment measures based on information retrieval statistics
AU - Huang, Yuanpeng J.
AU - Powers, Robert
AU - Montelione, Gaetano T.
PY - 2005/2/16
Y1 - 2005/2/16
N2 - One of the most important challenges in modern protein NMR is the development of fast and sensitive structure quality assessment measures that can be used to evaluate the "goodness-of-fit" of the 3D structure with NOESY data, to indicate the correctness of the fold and accuracy of the resulting structure. Quality assessment is especially critical for automated NOESY interpretation and structure determination approaches. This paper describes new NMR quality assessment scores, including Recall, Precision, and F-measure scores (referred to here as "NMR RPF" scores), which quickly provide global measures of the goodness-of-fit of the 3D structures with NOESY peak lists using methods from information retrieval statistics. The sensitivity of the F-measure is improved using a scaled Fold Discriminating Power (DP) score. These statistical RPF scores are quite rapid to compute since NOE assignments and complete relaxation matrix calculations are not required. A graphical method for site-specific assessment of structure quality based on the Precision statistic is also described. These statistical measures are demonstrated to be valuable for assessing protein NMR structure accuracy. Their relationships to other proposed NMR "R-factors" and structure quality assessment scores are also discussed.
AB - One of the most important challenges in modern protein NMR is the development of fast and sensitive structure quality assessment measures that can be used to evaluate the "goodness-of-fit" of the 3D structure with NOESY data, to indicate the correctness of the fold and accuracy of the resulting structure. Quality assessment is especially critical for automated NOESY interpretation and structure determination approaches. This paper describes new NMR quality assessment scores, including Recall, Precision, and F-measure scores (referred to here as "NMR RPF" scores), which quickly provide global measures of the goodness-of-fit of the 3D structures with NOESY peak lists using methods from information retrieval statistics. The sensitivity of the F-measure is improved using a scaled Fold Discriminating Power (DP) score. These statistical RPF scores are quite rapid to compute since NOE assignments and complete relaxation matrix calculations are not required. A graphical method for site-specific assessment of structure quality based on the Precision statistic is also described. These statistical measures are demonstrated to be valuable for assessing protein NMR structure accuracy. Their relationships to other proposed NMR "R-factors" and structure quality assessment scores are also discussed.
UR - http://www.scopus.com/inward/record.url?scp=13644252170&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=13644252170&partnerID=8YFLogxK
U2 - 10.1021/ja047109h
DO - 10.1021/ja047109h
M3 - Article
C2 - 15701001
AN - SCOPUS:13644252170
SN - 0002-7863
VL - 127
SP - 1665
EP - 1674
JO - Journal of the American Chemical Society
JF - Journal of the American Chemical Society
IS - 6
ER -