TY - GEN
T1 - A new approach to clustering biological data using message passing
AU - Geng, Huimin
AU - Bastola, Dhundy
AU - Ali, Hesham
PY - 2004
Y1 - 2004
N2 - Clustering algorithms are widely used in bioinformatics to classify data, as in the analysis of gene expression and the building of phylogenetic trees. Biological data often describe parallel and spontaneous processes. To capture these features, we propose a new clustering algorithm that employs the concept of message passing. Message Passing Clustering (MPC) allows data objects to communicate with each other and produces clusters in parallel, thereby making the clustering process intrinsic. We have proved that MPC is similar to Hierarchical Clustering (HC) but offers significantly improved performance because it takes into account both local and global structure. We analyzed 35 sets of simulated dynamic gene expression data, achieving a 95% hit rate: 639 of the 674 genes were correctly clustered. We have also applied MPC to a real data set to build a phylogenetic tree from aligned Mycobacterium sequences. The results show higher classification accuracy than traditional clustering methods such as HC.
AB - Clustering algorithms are widely used in bioinformatics to classify data, as in the analysis of gene expression and the building of phylogenetic trees. Biological data often describe parallel and spontaneous processes. To capture these features, we propose a new clustering algorithm that employs the concept of message passing. Message Passing Clustering (MPC) allows data objects to communicate with each other and produces clusters in parallel, thereby making the clustering process intrinsic. We have proved that MPC is similar to Hierarchical Clustering (HC) but offers significantly improved performance because it takes into account both local and global structure. We analyzed 35 sets of simulated dynamic gene expression data, achieving a 95% hit rate: 639 of the 674 genes were correctly clustered. We have also applied MPC to a real data set to build a phylogenetic tree from aligned Mycobacterium sequences. The results show higher classification accuracy than traditional clustering methods such as HC.
UR - http://www.scopus.com/inward/record.url?scp=14044270139&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14044270139&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:14044270139
SN - 0769521940
SN - 9780769521947
T3 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
SP - 493
EP - 494
BT - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
T2 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Y2 - 16 August 2004 through 19 August 2004
ER -