A Cross-Entropy Based Feature Selection Method for Binary Valued Data Classification

Zhipeng Wang, Qiuming Zhu

Research output: Contribution to journal › Article › peer-review

1 Scopus citations


Feature selection is the process of finding a meaningful subset of attributes from a given set of measurements in order to reveal a coherent relation or causality in an event. It is often indispensable for effective pattern classification, and in big data analytics it usually serves as a preprocessing step before a machine learning model is constructed, with the aim of improving the accuracy of predictive results. Selecting the most significant features can reduce training time and model complexity, avoid overfitting, and help users better understand the source data and the modeling outcomes. Although features are commonly treated as continuous values, many features in real-world machine learning applications are binary valued, i.e., either 1 or 0. Inspired by existing feature selection methods, this paper presents a new framework called FMC_SELECTOR, which specifically addresses the selection of significant binary-valued features from highly imbalanced large datasets. FMC_SELECTOR combines Fisher linear discriminant analysis with a cross-entropy mechanism to create an integrated mapping function for evaluating each individual feature in a given dataset. A new formulation called Mapping Based Cross-Entropy Evaluation (MCE) is derived for quantitative ranking of the features. A Positive Case Prediction Score (PPS) is explored to verify the significance of the selected features in a classification process.
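The abstract does not give the MCE formula, so the following is only a minimal sketch of the two ingredients it names, applied to binary features: a single-feature Fisher criterion and a cross-entropy between the class-conditional Bernoulli distributions of a feature. The function names (`class_rates`, `fisher_score`, `bernoulli_cross_entropy`, `rank_features`), the toy data, and the choice to rank by the Fisher score alone are all assumptions made for illustration; this is not the FMC_SELECTOR mapping function.

```python
import math

def class_rates(column, labels):
    """Return (p1, p0): the fraction of ones in a feature for positive and negative cases."""
    pos = [x for x, y in zip(column, labels) if y == 1]
    neg = [x for x, y in zip(column, labels) if y == 0]
    return sum(pos) / len(pos), sum(neg) / len(neg)

def fisher_score(p1, p0, eps=1e-9):
    """Single-feature Fisher criterion; a Bernoulli feature has variance p * (1 - p)."""
    return (p1 - p0) ** 2 / (p1 * (1 - p1) + p0 * (1 - p0) + eps)

def bernoulli_cross_entropy(p, q, eps=1e-9):
    """Cross-entropy H(p, q) in nats between two Bernoulli distributions."""
    q = min(max(q, eps), 1 - eps)  # clip to avoid log(0)
    return -(p * math.log(q) + (1 - p) * math.log(1 - q))

def rank_features(rows, labels):
    """Rank feature indices by Fisher score, highest first (illustrative choice only)."""
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        p1, p0 = class_rates(column, labels)
        scores.append((j, fisher_score(p1, p0), bernoulli_cross_entropy(p1, p0)))
    return sorted(scores, key=lambda s: s[1], reverse=True)

# Hypothetical toy data: feature 0 tracks the label, feature 1 is uninformative.
rows = [[1, 1], [1, 0], [1, 1], [0, 0], [0, 1], [0, 0], [0, 1], [1, 0]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
ranking = rank_features(rows, labels)
for index, fisher, cross_entropy in ranking:
    print(f"feature {index}: fisher={fisher:.4f} cross_entropy={cross_entropy:.4f}")
```

In this sketch the cross-entropy is only reported next to the Fisher score; how the paper integrates the two into a single mapping is not stated in the abstract and is left open here.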

Original language: English (US)
Pages (from-to): 226-238
Number of pages: 13
Journal: International Journal of Computer Information Systems and Industrial Management Applications
State: Published - 2022


Keywords

  • Binary features
  • Cross entropy
  • Feature selection
  • Model verification
  • Pattern classification

ASJC Scopus subject areas

  • Management Information Systems
  • Signal Processing
  • Information Systems
  • Computer Vision and Pattern Recognition
  • Strategy and Management
  • Artificial Intelligence

