SWAT: A new spliced alignment tool tailored for handling more sequencing errors

Yifeng Li, Hesham H. Ali

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Several computer programs align mRNA with its genomic counterpart to determine exon boundaries. Although most of these programs perform such alignment efficiently and accurately, they tolerate only a relatively small number of sequencing errors. They also depend heavily on the GT/AG rule to find splice sites. Both properties make them less suitable for aligning EST-reconstructed transcripts with genomic DNA to identify splicing variants, where many sequencing errors and noncanonical splice sites are expected. Using a novel heuristic algorithm, we developed a tool that can handle many more sequencing errors. Results on a test dataset indicate that SWAT (Sequencing-error Well-handled Alignment Tool) has much stronger error-handling ability than Sim4 and Spidey, two other popular spliced alignment tools. In the presence of up to 10 percent randomly introduced sequencing errors, it still gives the precise number of exons and exon boundaries in most cases. This robustness makes SWAT a desirable tool in cases where sequencing error is a concern. A web service is freely available at http://appl.unmc.edu/swat/swat.html.
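
The two ideas the abstract relies on, the GT/AG rule and randomly introduced sequencing errors, can be illustrated with a minimal sketch. This is not SWAT's algorithm, which the abstract does not describe. The function names `is_canonical_intron` and `introduce_substitutions` are hypothetical, and the error model (substitutions only, applied independently per base) is an assumption; the paper's test data may also include insertions and deletions.

```python
import random


def is_canonical_intron(genomic: str, start: int, end: int) -> bool:
    """Return True if genomic[start:end] obeys the GT/AG rule.

    A canonical intron begins with the dinucleotide GT (donor site)
    and ends with AG (acceptor site).
    """
    intron = genomic[start:end].upper()
    return len(intron) >= 4 and intron.startswith("GT") and intron.endswith("AG")


def introduce_substitutions(seq: str, rate: float, seed: int = 0) -> str:
    """Replace each base with a different base with probability `rate`.

    Assumption: substitution-only errors, independent per position.
    """
    rng = random.Random(seed)  # seeded so a test set is reproducible
    out = []
    for base in seq.upper():
        if rng.random() < rate:
            # Pick one of the three other bases so the position really changes.
            out.append(rng.choice([b for b in "ACGT" if b != base]))
        else:
            out.append(base)
    return "".join(out)
```

For example, `introduce_substitutions(transcript, 0.10)` would produce a transcript with roughly 10 percent substituted bases, the error level mentioned above, while `is_canonical_intron` shows the kind of check that fails on noncanonical splice sites.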

Original language: English (US)
Title of host publication: Lecture Notes in Computer Science
Editors: V.S. Sunderam, G.D. Albada, P.M.A. Sloot, J.J. Dongarra
Pages: 927-935
Number of pages: 9
Volume: 3515
Edition: II
State: Published - 2005
Event: 5th International Conference on Computational Science - ICCS 2005 - Atlanta, GA, United States
Duration: May 22 2005 to May 25 2005

Other

Other: 5th International Conference on Computational Science - ICCS 2005
Country/Territory: United States
City: Atlanta, GA
Period: 5/22/05 to 5/25/05

ASJC Scopus subject areas

  • Computer Science (miscellaneous)
