TY - GEN
T1 - SRPVS
T2 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
AU - Huang, Xiaolu
AU - Ali, Hesham
AU - Sadanandam, Anguraj
AU - Singh, Rakesh
PY - 2004
Y1 - 2004
N2 - In some protein sequence regions, two sequences that share a similar amino acid composition also share the same biological structure, regardless of sequence order. Traditional protein analysis tools are sequence-order dependent and therefore cannot detect this kind of order-relaxed similarity. In this study, a more flexible protein comparison algorithm, the Similar enRiched Parikh Vector Searching (SRPVS) algorithm, is designed to detect sequence similarity in a local-sequence-order-flexible manner. In SRPVS, a peptide sequence is broken into a group of Parikh vectors of predefined word sizes; Similar enRiched Parikh Vectors (SRPV) are then searched between the two sequences, and an Order Score is assigned to each SRPV pair to reflect the order difference between the two sequences. A test has shown that SRPVS can detect shuffled protein sequence regions that share biological structure between two protein sequences.
AB - In some protein sequence regions, two sequences that share a similar amino acid composition also share the same biological structure, regardless of sequence order. Traditional protein analysis tools are sequence-order dependent and therefore cannot detect this kind of order-relaxed similarity. In this study, a more flexible protein comparison algorithm, the Similar enRiched Parikh Vector Searching (SRPVS) algorithm, is designed to detect sequence similarity in a local-sequence-order-flexible manner. In SRPVS, a peptide sequence is broken into a group of Parikh vectors of predefined word sizes; Similar enRiched Parikh Vectors (SRPV) are then searched between the two sequences, and an Order Score is assigned to each SRPV pair to reflect the order difference between the two sequences. A test has shown that SRPVS can detect shuffled protein sequence regions that share biological structure between two protein sequences.
UR - http://www.scopus.com/inward/record.url?scp=14044249499&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=14044249499&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:14044249499
SN - 0769521940
SN - 9780769521947
T3 - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
SP - 674
EP - 675
BT - Proceedings - 2004 IEEE Computational Systems Bioinformatics Conference, CSB 2004
Y2 - 16 August 2004 through 19 August 2004
ER -