Cluster-based boosting

L. Dee Miller, Leen Kiat Soh

Research output: Contribution to journalArticlepeer-review

7 Scopus citations


Boosting is an iterative process that improves the predictive accuracy for supervised (machine) learning algorithms. Boosting operates by learning multiple functions with subsequent functions focusing on incorrect instances where the previous functions predicted the wrong label. Despite considerable success, boosting still has difficulty on data sets with certain types of problematic training data (e.g., label noise) and when complex functions overfit the training data. We propose a novel cluster-based boosting (CBB) approach to address limitations in boosting for supervised learning systems. Our CBB approach partitions the training data into clusters containing highly similar member data and integrates these clusters directly into the boosting process. CBB boosts selectively (using a high learning rate, low learning rate, or not boosting) on each cluster based on both the additional structure provided by the cluster and previous function accuracy on the member data. Selective boosting allows CBB to improve predictive accuracy on problematic training data. In addition, boosting separately on clusters reduces function complexity to mitigate overfitting. We provide comprehensive experimental results on 20 UCI benchmark data sets with three different kinds of supervised learning systems. These results demonstrate the effectiveness of our CBB approach compared to a popular boosting algorithm, an algorithm that uses clusters to improve boosting, and two algorithms that use selective boosting without clustering.

Original languageEnglish (US)
Article number6990607
Pages (from-to)1491-1504
Number of pages14
JournalIEEE Transactions on Knowledge and Data Engineering
Issue number6
StatePublished - Jun 1 2015


  • Artificial Intelligence
  • Clustering Algorithms
  • Machine Learning

ASJC Scopus subject areas

  • Information Systems
  • Computer Science Applications
  • Computational Theory and Mathematics


Dive into the research topics of 'Cluster-based boosting'. Together they form a unique fingerprint.

Cite this