A New Method for Finding Approximate Repetitions in DNA Sequences

Abstract:

Searching for approximate repetitions in a DNA sequence is an important topic in gene analysis. One problem is that, because patterns vary in length, the similarity between patterns cannot be judged accurately using edit distance (ED) alone. In this paper we define a new similarity function that takes into account both the differences and the commonalities between patterns. Because of the computational cost, we also propose new filter methods based on frequency vectors, with which the candidate set of approximate repetitions can be selected efficiently. We use SUA instead of a sliding window to extract fragments from a DNA sequence, so the patterns of an approximate repetition are not limited in length. The results show that this technique finds more approximate repetitions than Tandem Repeats Finder.
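
The abstract does not spell out the filter itself. As a rough illustration of how a frequency-vector filter can work in general, the sketch below uses a well-known bound: the frequency distance between two strings never exceeds their edit distance, so fragments whose frequency distance already exceeds a threshold can be discarded without running the more expensive comparison. The function names, the `ACGT` alphabet, and the `max_ed` threshold are assumptions made for this example and are not taken from the paper; the paper's own similarity function is not reproduced here.

```python
from collections import Counter


def frequency_vector(s, alphabet="ACGT"):
    """Count how often each alphabet symbol occurs in s (assumed alphabet)."""
    counts = Counter(s)
    return [counts[c] for c in alphabet]


def frequency_distance(u, v):
    """Lower bound on edit distance derived from symbol counts alone.

    One edit operation changes the surplus and the deficit between the two
    count vectors by at most 1 each, so max(surplus, deficit) <= ED(u, v).
    """
    fu, fv = frequency_vector(u), frequency_vector(v)
    surplus = sum(a - b for a, b in zip(fu, fv) if a > b)
    deficit = sum(b - a for a, b in zip(fu, fv) if b > a)
    return max(surplus, deficit)


def edit_distance(u, v):
    """Standard dynamic-programming edit distance (unit costs)."""
    prev = list(range(len(v) + 1))
    for i, cu in enumerate(u, 1):
        cur = [i]
        for j, cv in enumerate(v, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (cu != cv)))  # match or substitution
        prev = cur
    return prev[-1]


def filter_candidates(pattern, fragments, max_ed):
    """Keep only fragments that could still be within max_ed edits of pattern."""
    return [f for f in fragments if frequency_distance(pattern, f) <= max_ed]


if __name__ == "__main__":
    fragments = ["ACGTAC", "ACGAAC", "TTTTTT"]
    survivors = filter_candidates("ACGTAC", fragments, max_ed=1)
    print(survivors)
    print([edit_distance("ACGTAC", f) for f in survivors])
```

The filter is one-sided: it never rejects a true match, but it can let through fragments that have the same symbol counts in a different order, so survivors still need an exact comparison.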

Keywords: approximate repetitions; DNA sequences; similarity; SUA

Authors: Yajun Jiang, Zhenlun Yang, Zengrong Zhan

Affiliation: School of Information Engineering, Guangzhou Panyu Polytechnic, Shawan Qinshanhu, Panyu District, Guangzhou 511483, China

Conference type: International conference

Conference name: 2010 2nd International Conference on Signal Processing System (ICSPS 2010)

Conference location: Dalian

Conference language: English

Pages: 1643-1649

Online publication date: 2010-07-05 (date first posted on the Wanfang platform; not the paper's publication date)
