Learning yeast gene functions from heterogeneous sources of data using hybrid Weighted Bayesian Networks

Xutao Deng, Huimin Geng, Hesham Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

3 Scopus citations

Abstract

We developed a machine learning system for determining gene functions from heterogeneous data sources using a Weighted Naive Bayesian Network (WNB). Knowledge of gene functions is crucial for understanding many fundamental biological mechanisms such as regulatory pathways, cell cycles and diseases. Our major goal is to accurately infer the functions of putative genes or ORFs (Open Reading Frames) from existing databases using computational methods. However, this task is intrinsically difficult because the underlying biological processes involve complex interactions among multiple entities. Therefore, many functional links are missed when only one or two sources of data are used in the prediction. Our hypothesis is that integrating evidence from multiple, complementary sources could significantly improve prediction accuracy. In this paper, our experimental results not only suggest that this hypothesis is valid, but also provide guidelines for using the WNB system for data collection, training and prediction. The combined training data sets contain information from gene annotations, gene expression, clustering outputs, keyword annotations and sequence homology drawn from public databases. The current system is trained and tested on the genes of the budding yeast Saccharomyces cerevisiae. Our WNB model can also be used to analyze the contribution of each source of information to prediction performance through the weight training process. This contribution analysis could potentially lead to significant scientific discovery by facilitating the interpretation and understanding of the complex relationships between biological entities.
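The abstract does not give the WNB scoring rule, so the following is only a minimal sketch of one common weighted naive Bayes formulation, in which each evidence source i contributes w_i · log P(x_i | c) to the score of class c. The function names (`fit`, `predict`), the toy sources, the labels `f1`/`f2` and the hand-set weights are all hypothetical illustrations, not the authors' implementation or data; the paper learns its weights in a training process that is not reproduced here.

```python
import math
from collections import Counter, defaultdict


def fit(rows, labels, alpha=1.0):
    """Estimate log-priors and smoothed per-source log-likelihood tables.

    rows:   list of tuples, one discrete value per evidence source.
    labels: list of class labels (hypothetical functional categories).
    alpha:  additive (Laplace) smoothing constant.
    """
    classes = sorted(set(labels))
    n_sources = len(rows[0])
    class_counts = Counter(labels)
    # Distinct values observed for each source.
    values = [sorted({r[i] for r in rows}) for i in range(n_sources)]
    counts = defaultdict(int)  # (source index, value, class) -> count
    for r, c in zip(rows, labels):
        for i, v in enumerate(r):
            counts[(i, v, c)] += 1
    log_prior = {c: math.log(class_counts[c] / len(labels)) for c in classes}
    log_like = {}
    for i in range(n_sources):
        for v in values[i]:
            for c in classes:
                num = counts[(i, v, c)] + alpha
                den = class_counts[c] + alpha * len(values[i])
                log_like[(i, v, c)] = math.log(num / den)
    return classes, log_prior, log_like


def predict(x, classes, log_prior, log_like, weights):
    """Return the class maximizing log P(c) + sum_i w_i * log P(x_i | c).

    Assumes every value in x was seen during training for its source.
    """
    def score(c):
        return log_prior[c] + sum(
            w * log_like[(i, v, c)] for i, (v, w) in enumerate(zip(x, weights))
        )
    return max(classes, key=score)


# Toy data: source 0 stands in for an expression cluster label,
# source 1 for a homology-derived label (both invented for illustration).
rows = [("A", "x"), ("A", "x"), ("A", "y"), ("B", "y"), ("B", "y"), ("B", "x")]
labels = ["f1", "f1", "f1", "f2", "f2", "f2"]
classes, log_prior, log_like = fit(rows, labels)

query = ("A", "y")  # the two sources disagree on this gene
equal = predict(query, classes, log_prior, log_like, weights=(1.0, 1.0))
only_second = predict(query, classes, log_prior, log_like, weights=(0.0, 1.0))
print(equal, only_second)
```

In this toy case the two sources disagree on the query gene, so the predicted class depends on the weights. That is the sense in which per-source weights can indicate how much each source contributes to a prediction.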

Original language: English (US)
Title of host publication: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Pages: 25-34
Number of pages: 10
DOI: https://doi.org/10.1109/CSB.2005.38
State: Published - 2005
Event: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005 - Stanford, CA, United States
Duration: Aug 8 2005 → Aug 11 2005

Publication series

Name: Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Volume: 2005

Conference

Conference: 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005
Country: United States
City: Stanford, CA
Period: 8/8/05 → 8/11/05

Keywords

  • Bayesian network
  • Gene function prediction
  • Machine learning
  • Yeast

ASJC Scopus subject areas

  • Engineering(all)
  • Medicine(all)

  • Cite this

    Deng, X., Geng, H., & Ali, H. (2005). Learning yeast gene functions from heterogeneous sources of data using hybrid Weighted Bayesian Networks. In Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005 (pp. 25-34). [1498003] (Proceedings - 2005 IEEE Computational Systems Bioinformatics Conference, CSB 2005; Vol. 2005). https://doi.org/10.1109/CSB.2005.38