Seed selection algorithm through K-means on optimal number of clusters

Kuntal Chowdhury, Debasis Chaudhuri, Arup Kumar Pal, Ashok Samal

Research output: Contribution to journal › Article › peer-review

34 Scopus citations

Abstract

Clustering is an important unsupervised learning technique in data mining for grouping similar features. The growing point of a cluster is known as a seed. Selecting an appropriate seed for each cluster is an important criterion of any seed-based clustering technique. The performance of seed-based algorithms depends on the initial cluster center selection and on the optimal number of clusters in an unknown data set. Cluster quality and the optimal number of clusters are important issues in cluster analysis. In this paper, the proposed seed point selection algorithm is applied to 3-band image data and 2D discrete data. The algorithm selects seed points by maximizing the joint probability of pixel intensities subject to a distance restriction criterion. The optimal number of clusters is decided on the basis of a combination of seven different cluster validity indices. We compare the results of the proposed seed selection algorithm, run through K-means clustering on the optimal number of clusters, with those of other classical seed selection algorithms applied through K-means clustering, in terms of seed generation time (SGT), cluster building time (CBT), segmentation entropy, and the number of iterations (NOT_{K-means}). We also analyze the CPU time and number of iterations of the proposed seed selection method against other clustering algorithms.
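The abstract does not spell out the selection procedure, so the following is only a minimal sketch of one plausible reading of "maximization of the joint probability of pixel intensities with the distance restriction criteria", not the authors' algorithm. It assumes the joint probability is the empirical frequency of each intensity tuple, that the distance restriction is a minimum Euclidean distance between seeds in intensity space, and that candidates are visited greedily from most to least probable. The function name `select_seeds` and the parameters `k` and `min_dist` are hypothetical.

```python
from collections import Counter
from math import dist


def select_seeds(pixels, k, min_dist):
    """Greedy sketch: pick up to k seeds among the most probable intensity
    tuples, keeping every pair of seeds at least min_dist apart.

    Assumptions (not taken from the paper): empirical joint probability,
    Euclidean distance in intensity space, greedy ordering.
    """
    total = len(pixels)
    # Empirical joint probability of each intensity tuple, e.g. (R, G, B).
    joint_prob = {value: count / total for value, count in Counter(pixels).items()}
    # Most probable first; ties are broken by tuple value so the result is deterministic.
    candidates = sorted(joint_prob, key=lambda v: (-joint_prob[v], v))

    seeds = []
    for candidate in candidates:
        # Distance restriction: reject candidates too close to an accepted seed.
        if all(dist(candidate, seed) >= min_dist for seed in seeds):
            seeds.append(candidate)
            if len(seeds) == k:
                break
    return seeds


# Toy 3-band "image" flattened to a list of intensity tuples.
pixels = [
    (10, 10, 10), (10, 10, 10), (10, 10, 10),
    (12, 10, 10), (12, 10, 10),
    (200, 200, 200), (200, 200, 200),
    (90, 20, 160),
]
seeds = select_seeds(pixels, k=3, min_dist=50.0)
print(seeds)  # [(10, 10, 10), (200, 200, 200), (90, 20, 160)]
```

In this toy input, `(12, 10, 10)` is the second most frequent tuple but is rejected because it lies within `min_dist` of the first seed. The resulting seeds could then serve as initial centers for a K-means run on the chosen number of clusters.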

Original language: English (US)
Pages (from-to): 18617-18651
Number of pages: 35
Journal: Multimedia Tools and Applications
Volume: 78
Issue number: 13
State: Published - Jul 15 2019

Keywords

  • Cluster building time
  • Cluster validity indices
  • Clustering
  • Joint probability
  • K-means
  • Seed generation time
  • Seed point
  • Segmentation entropy

ASJC Scopus subject areas

  • Software
  • Media Technology
  • Hardware and Architecture
  • Computer Networks and Communications
