A New Approach to Batch Effect Removal Based on Distribution Matching in Latent Space

Huaqing Li, Haluk Dogan, Juan Cui

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Advanced measurement techniques such as genomics can acquire high-throughput, high-dimensional data, enabling new scientific discoveries and offering unique insights in biomedical research. However, biological measurements are easily affected by systematic variation, especially when they are obtained from distinct batches involving different platforms and experimental conditions. Such batch effects are usually larger than the biological signal of interest and, if not properly handled, can invalidate downstream analysis and lead to false discoveries. Here we propose a new learning approach to batch effect removal, based on multivariate distribution matching in the latent space, that preserves the signals of greatest interest. This data-driven approach consists of three key components: an autoencoder trained to encode the data into a low-dimensional set of neurons that represent patterns in the data; a similarity measurement procedure that identifies the neurons associated with the batch effect; and a residual network-based matching framework that transforms the distribution of the affected neurons from one batch to another, after which the adjusted neurons are decoded to reconstruct new datasets with the batch effect removed. The effectiveness of the proposed approach has been validated in several ways using public genomic data on Alzheimer's disease. The new method provides a highly promising tool for complex batch-effect adjustment and outperforms other commonly used methods.
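The three components named in the abstract can be illustrated with a deliberately simplified sketch. It is not the authors' implementation: PCA stands in for the trained autoencoder, a two-sample Kolmogorov-Smirnov statistic stands in for the similarity measurement (the abstract does not say which measure is used), and per-dimension mean and variance matching stands in for the residual network-based matching. All function names, the threshold, and the synthetic data are hypothetical.

```python
import numpy as np

def fit_linear_autoencoder(X, k):
    """PCA used as a linear stand-in for a trained autoencoder (assumption)."""
    mu = X.mean(axis=0)
    _, _, vt = np.linalg.svd(X - mu, full_matrices=False)
    return mu, vt[:k].T  # mean vector and (d, k) encoder/decoder weights

def ks_statistic(a, b):
    """Two-sample KS statistic: largest gap between the empirical CDFs."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / a.size
    cdf_b = np.searchsorted(b, grid, side="right") / b.size
    return float(np.max(np.abs(cdf_a - cdf_b)))

def remove_batch_effect(X, batch, k=5, threshold=0.5):
    """Match batch 1 to batch 0 in latent space, then decode."""
    mu, W = fit_linear_autoencoder(X, k)
    Z = (X - mu) @ W                      # 1) encode into k latent dimensions
    src, ref = batch == 1, batch == 0
    # 2) flag latent dimensions whose distributions differ between batches
    flagged = [j for j in range(k)
               if ks_statistic(Z[src, j], Z[ref, j]) > threshold]
    # 3) match flagged dimensions (mean/variance matching, a stand-in
    #    for the residual-network matching described in the abstract)
    Z_adj = Z.copy()
    for j in flagged:
        s, r = Z[src, j], Z[ref, j]
        Z_adj[src, j] = (s - s.mean()) / (s.std() + 1e-12) * r.std() + r.mean()
    return Z_adj @ W.T + mu, flagged      # decode the adjusted latent codes

# Synthetic data (hypothetical): batch 1 is shifted in the first 5 features.
rng = np.random.default_rng(0)
n, d = 200, 20
batch = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, d))
X[batch == 1, :5] += 3.0

X_adj, flagged = remove_batch_effect(X, batch)

def batch_gap(data):
    """Distance between the two batch means."""
    return float(np.linalg.norm(data[batch == 1].mean(0) - data[batch == 0].mean(0)))

gap_before, gap_after = batch_gap(X), batch_gap(X_adj)
print("flagged latent dimensions:", flagged)
print("batch mean gap before/after:", gap_before, gap_after)
```

Because only the flagged latent dimensions are altered, variation carried by the remaining dimensions passes through the decoder unchanged, which is the intuition behind preserving the signal of interest. Note that decoding from a truncated linear latent space also discards variance outside the top `k` components, a limitation of this stand-in rather than of the published method.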

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Editors: Illhoi Yoo, Jinbo Bi, Xiaohua Tony Hu
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 423-430
Number of pages: 8
ISBN (Electronic): 9781728118673
State: Published - Nov 2019
Event: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019 - San Diego, United States
Duration: Nov 18 2019 to Nov 21 2019

Publication series

Name: Proceedings - 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019

Conference

Conference: 2019 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2019
Country/Territory: United States
City: San Diego
Period: 11/18/19 to 11/21/19

Keywords

  • autoencoder
  • batch effect removal
  • distribution matching
  • genomic data
  • machine learning
  • residual network

ASJC Scopus subject areas

  • Biochemistry
  • Biotechnology
  • Molecular Medicine
  • Modeling and Simulation
  • Health Informatics
  • Pharmacology (medical)
  • Public Health, Environmental and Occupational Health
