Defining parameters for homology-tolerant database searching

J. P. Kayser, J. L. Vallet, R. L. Cerny

Research output: Contribution to journal › Article › peer-review

26 Scopus citations


De novo interpretation of tandem mass spectrometry (MS/MS) spectra provides sequences for searching protein databases when limited sequence information is present in the database. Our objective was to define a strategy for this type of homology-tolerant database search. Homology searches, using MSHomology software, were conducted with the 20, 10, or 5 most abundant peptides from 9 proteins, ranked either by precursor trigger intensity or by total ion current, and allowing for 50%, 30%, or 10% mismatch in the search. Protein scores were corrected by subtracting a threshold score that was calculated from random peptides. The highest (p < 0.01) corrected protein scores (i.e., scores above the threshold) were obtained by submitting 20 peptides and allowing 30% mismatch. Using these criteria, protein identification based on ion mass searching of MS/MS data (i.e., Mascot) was compared with that obtained using homology search. The highest-ranking protein was the same using Mascot, homology search using the 20 most intense peptides, or homology search using all peptides, for 63.4% of 112 spots from two-dimensional polyacrylamide gel electrophoresis gels. For these proteins, percent coverage was greater with Mascot than with a homology search using all peptides or only the 20 most intense peptides (25.1%, 18.3%, and 10.6%, respectively). Finally, 35% of de novo sequences completely matched the corresponding known amino acid sequence of the matching peptide. This percentage increased when the search was limited to the 20 most intense peptides (44.0%). After identifying the protein using MSHomology, a peptide mass search may increase the percent coverage of the protein identified.
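The three quantities the abstract varies (how many peptides are submitted, how much mismatch is tolerated, and the threshold-corrected score) can be illustrated with a minimal Python sketch. It is not the MSHomology implementation. All function names are hypothetical, the mismatch test assumes a simple position-by-position comparison of equal-length sequences, and the threshold is passed in as a number because the abstract does not say how it was derived from the random peptides.

```python
def top_n_by_intensity(peptides, n):
    """Return the n (sequence, intensity) pairs with the highest intensity.

    Hypothetical helper: 'intensity' stands in for either precursor trigger
    intensity or total ion current.
    """
    return sorted(peptides, key=lambda p: p[1], reverse=True)[:n]


def within_mismatch(query, target, max_mismatch):
    """True if the fraction of differing positions is at most max_mismatch.

    Assumption: ungapped, equal-length comparison. A real homology search
    scores alignments with substitution matrices instead.
    """
    if not query or len(query) != len(target):
        return False
    mismatches = sum(a != b for a, b in zip(query, target))
    return mismatches / len(query) <= max_mismatch


def corrected_score(raw_score, threshold):
    """Subtract the random-peptide threshold from a raw protein score.

    A positive result means the score is above the threshold.
    """
    return raw_score - threshold


# Illustrative values only, not data from the study.
peptides = [("AAK", 5.0), ("LLR", 9.0), ("GGK", 1.0)]
selected = top_n_by_intensity(peptides, 2)
tolerated = within_mismatch("PEPTIDEK", "PEPTLDEK", 0.30)
score = corrected_score(120.0, 45.0)
```

In this sketch, the settings the study found best would correspond to n = 20 and max_mismatch = 0.30.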

Original language: English (US)
Pages (from-to): 285-295
Number of pages: 11
Journal: Journal of Biomolecular Techniques
Issue number: 4
State: Published - Dec 2004


Keywords

  • Bioinformatics
  • Homology search
  • Mass spectrometry

ASJC Scopus subject areas

  • Molecular Biology

