A Fast Pattern Matching Algorithm for Biological Sequences

Abstract:

With the remarkable increase in the number of DNA and protein sequences, pattern matching has become increasingly important for querying sequence patterns in biological sequence databases. To further improve the performance of pattern matching, a fast exact algorithm called ZTBMH, a variation of the Zhu-Takaoka algorithm, is presented. It absorbs the idea of the Boyer-Moore-Horspool algorithm, which uses only the bad-character heuristic, and reduces the number of comparisons, thereby improving performance in practice. The best-case, worst-case, and average-case time complexities of the new algorithm are also discussed in the paper. The experimental results show that the proposed algorithm outperforms the other algorithms compared, especially for small alphabets such as nucleotide sequences, and is therefore well suited to exact pattern matching in biological sequences.
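
The abstract does not give the details of ZTBMH, so the following is only an illustrative sketch of the general idea it names: a Horspool-style search (bad-character shift only, no good-suffix rule) whose shift is looked up from the last two characters of the current text window, as in the Zhu-Takaoka algorithm. The function name `two_char_horspool_search` and the dictionary-based shift table are assumptions made for this example and are not taken from the paper.

```python
def two_char_horspool_search(pattern, text):
    """Return the start indices of all exact occurrences of pattern in text.

    Illustrative sketch only (not the paper's ZTBMH): a Horspool-style scan
    that shifts using the last TWO characters of the current window.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    if m == 1:
        # A two-character shift needs a pattern of length >= 2.
        return [i for i, c in enumerate(text) if c == pattern]

    # pair_shift[(a, b)] = distance from the rightmost occurrence of the pair
    # "ab" in pattern (excluding the final pair) to the end of the pattern.
    pair_shift = {}
    for i in range(1, m - 1):
        pair_shift[(pattern[i - 1], pattern[i])] = m - 1 - i

    matches = []
    j = 0
    while j <= n - m:
        if text[j:j + m] == pattern:
            matches.append(j)
        a, b = text[j + m - 2], text[j + m - 1]
        shift = pair_shift.get((a, b))
        if shift is None:
            # Pair not in pattern: align pattern[0] with b if possible,
            # otherwise skip the whole window.
            shift = m - 1 if b == pattern[0] else m
        j += shift
    return matches


# Example on a short nucleotide string (four-letter alphabet).
print(two_char_horspool_search("GCAGAGAG", "GCATCGCAGAGAGTATACAGTACG"))
```

Indexing the shift by a pair of characters rather than a single character tends to give longer shifts when the alphabet is small, which is the setting the abstract highlights for nucleotide sequences.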

Keywords: pattern matching algorithm biological sequences

Authors: Yong Huang, Xuezeng Pan, Yunjun Gao, Guoyong Cai

Affiliations: College of Computer Science and Technology, Zhejiang University, Hangzhou, China; College of Computer Science and Technology, Guilin University of Electronic Science and Technology, Gu

Conference type: International conference

Conference name: The 2nd International Conference on Bioinformatics and Biomedical Engineering (iCBBE 2008)

Conference location: Shanghai

Conference language: English

Pages: 608-611

Online publication date: 2008-05-16 (date first posted on the Wanfang platform; not the paper's publication date)
