TY - GEN
T1 - A compression-based technique for comparing biological sequences
AU - Mina, Ramez
AU - Ali, Hesham H.
PY - 2010
Y1 - 2010
N2 - Comparing biological sequences is one of the most important tools in computational biology. By comparing sequences, we identify similar subsequences, which may lead to the identification of similar structures and functions. Sequence alignment has been the method of choice for testing similarity and has gained a lot of trust among researchers, though it suffers from some shortcomings. In particular, repetitions in the input sequences often lead to inaccurate results, especially if these repetitions are dispersed throughout the sequence. In this paper, we study alternative methods based on compression techniques, borrowed from information theory, for comparing sequences accurately. We test the proposed technique on various datasets and show that it outperforms alignment-based methods in several cases.
AB - Comparing biological sequences is one of the most important tools in computational biology. By comparing sequences, we identify similar subsequences, which may lead to the identification of similar structures and functions. Sequence alignment has been the method of choice for testing similarity and has gained a lot of trust among researchers, though it suffers from some shortcomings. In particular, repetitions in the input sequences often lead to inaccurate results, especially if these repetitions are dispersed throughout the sequence. In this paper, we study alternative methods based on compression techniques, borrowed from information theory, for comparing sequences accurately. We test the proposed technique on various datasets and show that it outperforms alignment-based methods in several cases.
UR - http://www.scopus.com/inward/record.url?scp=79952551940&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=79952551940&partnerID=8YFLogxK
U2 - 10.1109/CIBEC.2010.5716047
DO - 10.1109/CIBEC.2010.5716047
M3 - Conference contribution
AN - SCOPUS:79952551940
SN - 9781424471706
T3 - 2010 5th Cairo International Biomedical Engineering Conference, CIBEC 2010
SP - 94
EP - 97
BT - 2010 5th Cairo International Biomedical Engineering Conference, CIBEC 2010
T2 - 2010 5th Cairo International Biomedical Engineering Conference, CIBEC 2010
Y2 - 16 December 2010 through 18 December 2010
ER -