Protein NMR recall, precision, and F-measure scores (RPF scores): Structure quality assessment measures based on information retrieval statistics

Yuanpeng J. Huang, Robert Powers, Gaetano T. Montelione

Research output: Contribution to journal > Article > peer-review

238 Scopus citations

Abstract

One of the most important challenges in modern protein NMR is the development of fast and sensitive structure quality assessment measures that can be used to evaluate the "goodness-of-fit" of the 3D structure with NOESY data, to indicate the correctness of the fold and accuracy of the resulting structure. Quality assessment is especially critical for automated NOESY interpretation and structure determination approaches. This paper describes new NMR quality assessment scores, including Recall, Precision, and F-measure scores (referred to here as "NMR RPF" scores), which quickly provide global measures of the goodness-of-fit of the 3D structures with NOESY peak lists using methods from information retrieval statistics. The sensitivity of the F-measure is improved using a scaled Fold Discriminating Power (DP) score. These statistical RPF scores are quite rapid to compute since NOE assignments and complete relaxation matrix calculations are not required. A graphical method for site-specific assessment of structure quality based on the Precision statistic is also described. These statistical measures are demonstrated to be valuable for assessing protein NMR structure accuracy. Their relationships to other proposed NMR "R-factors" and structure quality assessment scores are also discussed.
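The abstract does not give the scoring formulas, so the following is only a minimal sketch of the generic information-retrieval definitions of recall, precision, and F-measure that the scores are named after. It assumes, purely for illustration, that the NOESY peak list and the 3D structure have each been reduced to a set of unordered proton pairs; the function names `pair` and `rpf_scores` and the atom labels are hypothetical, and the scaled DP score is not attempted because its definition is not stated here.

```python
def pair(a, b):
    """Unordered proton pair, so (a, b) and (b, a) compare equal."""
    return frozenset((a, b))


def rpf_scores(observed, predicted):
    """Generic recall, precision, and F-measure between two sets of pairs.

    observed  -- pairs implied by the NOESY peak list (assumed input form)
    predicted -- pairs that are close in the 3D structure (assumed input form)
    """
    true_positives = len(observed & predicted)
    # Recall: fraction of observed pairs that the structure accounts for.
    recall = true_positives / len(observed) if observed else 0.0
    # Precision: fraction of structure-derived pairs supported by the data.
    precision = true_positives / len(predicted) if predicted else 0.0
    # F-measure: harmonic mean of recall and precision.
    total = recall + precision
    f_measure = 2 * recall * precision / total if total else 0.0
    return recall, precision, f_measure


# Hypothetical toy data: 4 observed pairs, 5 predicted pairs, 3 in common.
observed = {pair("A1.HA", "A5.HN"), pair("A2.HB", "A9.HN"),
            pair("A3.HA", "A7.HA"), pair("A4.HN", "A8.HB")}
predicted = {pair("A5.HN", "A1.HA"), pair("A2.HB", "A9.HN"),
             pair("A3.HA", "A7.HA"), pair("A6.HA", "A10.HN"),
             pair("A1.HN", "A2.HN")}

recall, precision, f_measure = rpf_scores(observed, predicted)
print(f"recall={recall:.3f} precision={precision:.3f} F={f_measure:.3f}")
```

Because only set intersections are needed, a comparison of this kind requires neither NOE assignments nor a relaxation matrix calculation, which is consistent with the speed the abstract attributes to the RPF scores.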

Original language: English (US)
Pages (from-to): 1665-1674
Number of pages: 10
Journal: Journal of the American Chemical Society
Volume: 127
Issue number: 6
State: Published - Feb 16 2005

ASJC Scopus subject areas

  • Catalysis
  • General Chemistry
  • Biochemistry
  • Colloid and Surface Chemistry
