High Throughput Biological Data (HTBD) requires detailed analysis methods and from a life science perspective, these analysis results make most sense when interpreted within the context of biological pathways. Bayesian Networks (BNs) capture both linear and nonlinear interactions and handle stochastic events in a probabilistic framework accounting for noise making them viable candidates for HTBD analysis. We have recently proposed an approach, called Bayesian Pathway Analysis (BPA), for analyzing HTBD using BNs in which known biological pathways are modeled as BNs and pathways that best explain the given HTBD are found. BPA uses the fold change information to obtain an input matrix to score each pathway modeled as a BN. Scoring is achieved using the Bayesian-Dirichlet Equivalent method and significance is assessed by randomization via bootstrapping of the columns of the input matrix. In this study, we improve on the BPA system by optimizing the steps involved in "Data Preprocessing and Discretization", "Scoring", "Significance Assessment", and "Software and Web Application". We tested the improved system on synthetic data sets and achieved over 98% accuracy in identifying the active pathways. The overall approach was applied on real cancer microarray data sets in order to investigate the pathways that are commonly active in different cancer types. We compared our findings on the real data sets with a relevant approach called the Signaling Pathway Impact Analysis (SPIA).
ASJC Scopus subject areas
- Biochemistry, Genetics and Molecular Biology(all)
- Agricultural and Biological Sciences(all)