TY - GEN
T1 - The development of parallel adaptive sampling algorithms for analyzing biological networks
AU - Dempsey, Kathryn
AU - Duraisamy, Kanimathi
AU - Bhowmick, Sanjukta
AU - Ali, Hesham
PY - 2012
Y1 - 2012
N2 - The availability of biological data at massive scales continues to present unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques and efficient parallel computational methods to implement them will be crucial in extracting useful knowledge from this raw, unprocessed data, such as in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise in biological correlation networks, thereby making it possible to find biologically relevant clusters in the filtered network. We show how selecting the appropriate filter is crucial in maintaining the key structures from the original networks and uncovering new ones after removing noisy relationships. We also conduct one of the first comparisons of two important sensitivity criteria: the perturbation due to the vertex numbering of the network and the perturbation due to data distribution. We demonstrate that our chordal-graph-based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.
AB - The availability of biological data at massive scales continues to present unlimited opportunities as well as great challenges in bioinformatics research. Developing innovative data mining techniques and efficient parallel computational methods to implement them will be crucial in extracting useful knowledge from this raw, unprocessed data, such as in discovering significant cellular subsystems from gene correlation networks. In this paper, we present a scalable combinatorial sampling technique, based on identifying maximum chordal subgraphs, that reduces noise in biological correlation networks, thereby making it possible to find biologically relevant clusters in the filtered network. We show how selecting the appropriate filter is crucial in maintaining the key structures from the original networks and uncovering new ones after removing noisy relationships. We also conduct one of the first comparisons of two important sensitivity criteria: the perturbation due to the vertex numbering of the network and the perturbation due to data distribution. We demonstrate that our chordal-graph-based filter is effective across many different vertex permutations, as is our parallel implementation of the sampling algorithm.
KW - chordal graphs
KW - cluster overlap
KW - correlation networks
KW - edge enrichment
KW - ordering
UR - http://www.scopus.com/inward/record.url?scp=84867415020&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84867415020&partnerID=8YFLogxK
U2 - 10.1109/IPDPSW.2012.90
DO - 10.1109/IPDPSW.2012.90
M3 - Conference contribution
AN - SCOPUS:84867415020
SN - 9780769546766
T3 - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
SP - 725
EP - 734
BT - Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
T2 - 2012 IEEE 26th International Parallel and Distributed Processing Symposium Workshops, IPDPSW 2012
Y2 - 21 May 2012 through 25 May 2012
ER -