A compression-based technique for comparing biological sequences

Ramez Mina, Hesham H. Ali

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

Comparing biological sequences is one of the most important tools in computational biology. By comparing sequences, we identify similar subsequences, which may lead to the identification of structures as well as similar functions. Sequence alignment has been the method of choice for testing similarity and has gained considerable trust among researchers, though it suffers from some shortcomings. In particular, repetitions in the input sequences often lead to inaccurate results, especially when the repetitions are dispersed throughout the sequence. In this paper, we study alternative methods based on compression techniques, borrowed from information theory, for comparing sequences accurately. We test the proposed techniques on various datasets and show that they outperform alignment-based methods in several cases.
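To make the general idea concrete, the following is a minimal sketch of one well-known compression-based measure, the normalized compression distance (NCD), computed with Python's standard `zlib` module. It is an illustrative assumption only: the abstract does not state which compressor or distance the authors use, and the names `compressed_size`, `ncd`, and `random_dna` are hypothetical helpers introduced here.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance between two sequences.

    NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)),
    where C is the compressed size. Values near 0 suggest the sequences
    share much of their content; values near 1 suggest little shared content.
    This is an assumed stand-in, not the measure from the paper.
    """
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


def random_dna(length: int, seed: int) -> str:
    """Hypothetical helper: synthetic DNA-like string for demonstration."""
    rng = random.Random(seed)
    return "".join(rng.choice("ACGT") for _ in range(length))


if __name__ == "__main__":
    a = random_dna(2000, seed=1)
    b = random_dna(2000, seed=2)
    print("NCD(a, a):", ncd(a, a))  # a sequence compared with itself
    print("NCD(a, b):", ncd(a, b))  # two unrelated synthetic sequences
```

Because a compressor can reuse a repeated segment wherever it occurs within its search window, a measure of this kind does not depend on the repeats lining up in order, which is the situation the abstract identifies as difficult for alignment. A general-purpose compressor such as zlib is only a rough proxy for compressors designed for biological sequences.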

Original language: English (US)
Title of host publication: 2010 5th Cairo International Biomedical Engineering Conference, CIBEC 2010
Pages: 94-97
Number of pages: 4
State: Published - 2010
Event: 2010 5th Cairo International Biomedical Engineering Conference, CIBEC 2010 - Cairo, Egypt
Duration: Dec 16 2010 to Dec 18 2010

ASJC Scopus subject areas

  • Biomedical Engineering
