Prediction of functional class of novel bacterial proteins without the use of sequence similarity by a statistical learning method

J. Cui, L. Y. Han, C. Z. Cai, C. J. Zheng, Z. L. Ji, Y. Z. Chen

Research output: Contribution to journal > Article > peer-review

11 Scopus citations


A substantial percentage of the putative protein-encoding open reading frames (ORFs) in bacterial genomes have no homolog of known function, so their function cannot be confidently assigned on the basis of sequence similarity. Methods that do not rely on sequence similarity are needed and are being developed. One such method, SVMProt, predicts protein functional family irrespective of sequence similarity (Nucleic Acids Res. 2003;31:3692-3697). Although it has been tested on a large number of proteins, its capability for non-homologous proteins has so far been evaluated on only a relatively small number of proteins, and additional tests are needed to assess SVMProt more fully. In this work, 90 novel bacterial proteins (non-homologous to known proteins) are used to evaluate the capability of SVMProt. These proteins were selected such that none of their homologs are in the Swiss-Prot database, their functions are not clearly described in the literature, and neither they nor their homologs are included in the training sets of SVMProt. They represent proteins whose function cannot at present be confidently predicted by sequence similarity methods. The predicted functional class of 76.7% of these proteins shows various levels of consistency with the literature-described function, compared to an overall accuracy of 87% for the SVMProt functional class assignment of 34,582 proteins that have at least one homolog of known function. Our study suggests that SVMProt is capable of assigning functional class to novel bacterial proteins at a level not much lower than that of sequence alignment methods for homologous proteins.
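To put the two reported percentages side by side, the sketch below recovers the protein count implied by 76.7% of 90 and attaches a 95% Wilson score interval to it. The count of 69 is inferred from the rounded percentage and is not stated in the abstract. The Wilson interval is an illustrative choice made here, not an analysis the authors report, and the function names are hypothetical.

```python
import math


def implied_count(fraction, n):
    """Nearest whole number of items corresponding to a reported fraction of n."""
    return round(fraction * n)


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Figures taken from the abstract: 90 novel proteins, 76.7% consistent.
n_novel = 90
consistent = implied_count(0.767, n_novel)  # assumption: inferred, not reported
low, high = wilson_interval(consistent, n_novel)

print(f"implied consistent predictions: {consistent} of {n_novel}")
print(f"approximate 95% interval: {low:.3f} to {high:.3f}")
print("reported accuracy for proteins with known homologs: 0.870")
```

With only 90 proteins the interval is wide, which is worth keeping in mind when comparing the 76.7% figure against the 87% obtained on 34,582 proteins.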

Original language: English (US)
Pages (from-to): 86-100
Number of pages: 15
Journal: Journal of Molecular Microbiology and Biotechnology
Issue number: 2
State: Published - Nov 2005
Externally published: Yes


Keywords

  • Open reading frames, protein
  • Proteins, non-homologous
  • SVMProt

ASJC Scopus subject areas

  • Biotechnology
  • Microbiology
  • Applied Microbiology and Biotechnology
  • Molecular Biology

