Deep vs. shallow learning-based filters of MS/MS spectra in support of protein search engines

Majdi Maabreh, Basheer Qolomany, James Springstead, Izzat Alsmadi, Ajay Gupta

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Although searching time grows only linearly with the number of observed spectra, current protein search engines, even the parallel versions, can take several hours to search a large set of MS/MS spectra, which can be generated in a short time. After this laborious search, some (and at times, the majority) of the observed spectra are labeled as non-identifiable. We evaluate the role of machine learning in building an efficient MS/MS filter that removes non-identifiable spectra. We compare a deep learning algorithm against 9 shallow learning algorithms under different configurations. Using 10 datasets that were generated by two search engines and that differ in instrument, size, and species, we show experimentally that deep learning models are powerful in filtering MS/MS spectra. We also show that our simple feature list is significant, as the shallow learning algorithms also showed encouraging results in filtering the MS/MS spectra. Our deep learning model can exclude around 50% of the non-identifiable spectra while losing, on average, only 9% of the identifiable ones. Among the shallow learning algorithms, Random Forest, Support Vector Machine, and Neural Networks showed encouraging results, eliminating, on average, 70% of the non-identifiable spectra while losing around 25% of the identifiable ones. The deep learning algorithm may be especially useful where the protein(s) of interest are present at lower cellular or tissue concentrations, while the other algorithms may be more useful for concentrated or more highly expressed proteins.
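
The trade-off quoted in the abstract can be made concrete with a little arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes that filtered spectra are simply not sent to the search engine and that searching time is strictly proportional to the number of spectra searched. The function name, the batch size of 100,000 spectra, and the 60% share of non-identifiable spectra are hypothetical values chosen for the example; only the removal and loss rates (50%/9% and 70%/25%) come from the abstract.

```python
def filter_tradeoff(n_spectra, frac_non_identifiable, non_id_removed_rate, id_lost_rate):
    """Estimate the effect of a spectrum filter applied before the search.

    All inputs are illustrative assumptions except the two rates, which
    mirror the averages quoted in the abstract.
    """
    n_non_id = n_spectra * frac_non_identifiable
    n_id = n_spectra - n_non_id
    # Spectra removed by the filter: true removals plus wrongly dropped identifiable ones.
    removed = n_non_id * non_id_removed_rate + n_id * id_lost_rate
    searched = n_spectra - removed
    return {
        "spectra_searched": searched,
        "identifiable_kept": n_id * (1 - id_lost_rate),
        # Under the linear-time assumption, this is also the relative searching time.
        "search_time_fraction": searched / n_spectra,
    }


# Hypothetical batch: 100,000 spectra, 60% of them non-identifiable.
deep = filter_tradeoff(100_000, 0.60, non_id_removed_rate=0.50, id_lost_rate=0.09)
shallow = filter_tradeoff(100_000, 0.60, non_id_removed_rate=0.70, id_lost_rate=0.25)

for name, result in (("deep", deep), ("shallow", shallow)):
    print(name, {key: round(value, 3) for key, value in result.items()})
```

Under these assumptions the shallow filters shorten the search more, but at the cost of more lost identifiable spectra, which matches the abstract's suggestion that the deep model suits low-concentration proteins where each identification matters.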

Original language: English (US)
Title of host publication: Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Editors: Illhoi Yoo, Jane Huiru Zheng, Yang Gong, Xiaohua Tony Hu, Chi-Ren Shyu, Yana Bromberg, Jean Gao, Dmitry Korkin
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1175-1182
Number of pages: 8
ISBN (Electronic): 9781509030491
DOIs
State: Published - Dec 15 2017
Externally published: Yes
Event: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017 - Kansas City, United States
Duration: Nov 13 2017 → Nov 16 2017

Publication series

Name: Proceedings - 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Volume: 2017-January

Other

Other: 2017 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2017
Country/Territory: United States
City: Kansas City
Period: 11/13/17 → 11/16/17

Keywords

  • Big Data
  • Deep Learning
  • Machine Learning
  • MS/MS Filters
  • Protein Search Engine
  • Searching Space Optimization
  • Shallow Learning

ASJC Scopus subject areas

  • Biomedical Engineering
  • Health Informatics
