TY - GEN
T1 - A new clustering strategy with stochastic merging and removing based on kernel functions
AU - Geng, Huimin
AU - Ali, Hesham
PY - 2005
Y1 - 2005
N2 - With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, once a bottom-up algorithm assigns two elements to one cluster, they cannot subsequently be separated; likewise, once a top-down algorithm separates two elements, they cannot be rejoined. This greedy property may lead to premature convergence and, consequently, to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work [1]. SMPC, a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic one, which adds two major advantages. First, in deciding which pair of clusters to merge, the influences of all clusters are quantified by probabilities, estimated by kernel functions based on their relative distances. Second, clustering can be undone to improve clustering performance: when the algorithm detects elements that do not have good probabilities inside their cluster, it moves them outside. Test results on colon cancer gene-expression data show that SMPC performs better than deterministic MPC or hierarchical clustering.
AB - With hierarchical clustering methods, divisions or fusions, once made, are irrevocable. As a result, once a bottom-up algorithm assigns two elements to one cluster, they cannot subsequently be separated; likewise, once a top-down algorithm separates two elements, they cannot be rejoined. This greedy property may lead to premature convergence and, consequently, to a clustering that is far from optimal. To overcome this problem, we propose a new Stochastic Message Passing Clustering (SMPC) method based on the Message Passing Clustering (MPC) algorithm introduced in our earlier work [1]. SMPC, a generalized version of MPC, extends the clustering algorithm from a deterministic process to a stochastic one, which adds two major advantages. First, in deciding which pair of clusters to merge, the influences of all clusters are quantified by probabilities, estimated by kernel functions based on their relative distances. Second, clustering can be undone to improve clustering performance: when the algorithm detects elements that do not have good probabilities inside their cluster, it moves them outside. Test results on colon cancer gene-expression data show that SMPC performs better than deterministic MPC or hierarchical clustering.
UR - http://www.scopus.com/inward/record.url?scp=33749059189&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=33749059189&partnerID=8YFLogxK
U2 - 10.1109/CSBW.2005.10
DO - 10.1109/CSBW.2005.10
M3 - Conference contribution
AN - SCOPUS:33749059189
SN - 0769524427
SN - 9780769524429
T3 - 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
SP - 41
EP - 42
BT - 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
T2 - 2005 IEEE Computational Systems Bioinformatics Conference, Workshops and Poster Abstracts
Y2 - 8 August 2005 through 11 August 2005
ER -