TY - GEN
T1 - Evaluating the robustness of correlation network analysis in the aging mouse hypothalamus
AU - Cooper, Kathryn M.
AU - Bonasera, Stephen
AU - Ali, Hesham
N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.
PY - 2015
Y1 - 2015
N2 - Volumes of high-throughput assays have been made publicly available. These massive repositories of biological data provide a wealth of information that can be harnessed to investigate pressing questions regarding aging and disease. However, there is a distinct imbalance between the development of data generation techniques and the development of data analysis methodology. Like the four “V’s” of big data, biological data has volume, velocity, and heterogeneity, and is prone to error; as a result, methods for analysis of this “biomedical big data” have developed at a slower rate. One promising solution to this multi-dimensional issue is the network model, which has emerged as an effective tool for analysis because it can represent biological relationships en masse. Here we examine the need for standards and workflows in the usage of the correlation network model, where nodes represent genes and edges represent correlation between their expression patterns. One structure identified as biologically relevant in a correlation network, the gateway node, represents genes that change in co-expression between two different states. In this research, we manipulate the parameters used to identify the gateway nodes within a given dataset to determine the consistency of results among network building and clustering approaches. This proof-of-concept is important because a growing pool of methods is used for the various steps in our network analysis workflow, causing a lack of robustness, consistency, and reproducibility. This research compares the original gateway node analysis approach with variations in (1) network creation and (2) clustering analysis to test the consistency of structural results in the correlation network. Before these approaches can be trusted, it must be acknowledged that even minor changes in approach can have sweeping effects on results. The results of this study allow the authors to call for stronger studies in benchmarking and reproducibility in biomedical “big” data analyses.
KW - Aging
KW - Correlation networks
KW - Gateway nodes
KW - Robustness
KW - SPICi
UR - http://www.scopus.com/inward/record.url?scp=84955287582&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84955287582&partnerID=8YFLogxK
U2 - 10.1007/978-3-319-27707-3_14
DO - 10.1007/978-3-319-27707-3_14
M3 - Conference contribution
AN - SCOPUS:84955287582
SN - 9783319277066
T3 - Communications in Computer and Information Science
SP - 224
EP - 238
BT - Biomedical Engineering Systems and Technologies - 8th International Joint Conference, BIOSTEC 2015, Revised Selected Papers
A2 - Elias, Dirk
A2 - Fred, Ana
A2 - Gamboa, Hugo
PB - Springer Verlag
T2 - 8th International Joint Conference on Biomedical Engineering Systems and Technologies, BIOSTEC 2015
Y2 - 12 January 2015 through 15 January 2015
ER -