Impact of Imputation Strategies on Fairness in Machine Learning

Simon Caton, Saiteja Malisetty, Christian Haas

Research output: Contribution to journal › Article › peer-review

Abstract

Research on Fairness and Bias Mitigation in Machine Learning often uses a set of reference datasets for the design and evaluation of novel approaches or definitions. While these datasets are well structured and useful for the comparison of various approaches, they do not reflect that datasets commonly used in real-world applications can have missing values. When such missing values are encountered, the use of imputation strategies is commonplace. However, as imputation strategies potentially alter the distribution of data they can also affect the performance, and potentially the fairness, of the resulting predictions, a topic not yet well understood in the fairness literature. In this article, we investigate the impact of different imputation strategies on classical performance and fairness in classification settings. We find that the selected imputation strategy, along with other factors including the type of classification algorithm, can significantly affect performance and fairness outcomes. The results of our experiments indicate that the choice of imputation strategy is an important factor when considering fairness in Machine Learning. We also provide some insights and guidance for researchers to help navigate imputation approaches for fairness.
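
As a minimal illustration of the effect described in the abstract, the following sketch imputes one feature with two different strategies and compares a simple group-fairness measure afterwards. It is not the authors' experimental setup: the toy records, the fixed decision threshold, and the helper names (`impute`, `positive_rate`, `parity_gap`) are all assumptions made for illustration, and the fairness measure used here is the demographic parity gap (the absolute difference in positive prediction rates between two groups).

```python
from statistics import mean, median

# Hypothetical toy records: (group, feature value or None when missing).
records = [
    ("A", 52.0), ("A", 58.0), ("A", 61.0), ("A", None),
    ("B", 20.0), ("B", 25.0), ("B", 30.0), ("B", 200.0),
    ("B", None), ("B", None),
]

# Assumed stand-in for a trained classifier: predict positive at or above this value.
THRESHOLD = 55.0


def impute(rows, strategy):
    """Fill missing values with a statistic of the observed values."""
    observed = [value for _, value in rows if value is not None]
    fill = strategy(observed)
    return [(group, fill if value is None else value) for group, value in rows]


def positive_rate(rows, group):
    """Share of a group's records that receive a positive prediction."""
    values = [value for g, value in rows if g == group]
    return sum(value >= THRESHOLD for value in values) / len(values)


def parity_gap(rows):
    """Demographic parity gap between groups A and B."""
    return abs(positive_rate(rows, "A") - positive_rate(rows, "B"))


gaps = {
    name: parity_gap(impute(records, strategy))
    for name, strategy in (("mean", mean), ("median", median))
}

for name, gap in gaps.items():
    print(f"{name} imputation: parity gap = {gap:.3f}")
```

In this toy data the observed values are skewed, so the mean and median fall on opposite sides of the threshold and the two strategies yield different parity gaps. Real studies would replace the threshold rule with trained classifiers and evaluate several fairness metrics.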

Original language: English (US)
Pages (from-to): 1011-1035
Number of pages: 25
Journal: Journal of Artificial Intelligence Research
Volume: 74
State: Published - 2022

ASJC Scopus subject areas

  • Artificial Intelligence
