Conference topic

Algorithm for finding coding signal using homogeneous Markov chains independently for three codon positions

Many currently used algorithms for finding protein-coding sequences require large learning sets of true genes to estimate sensible parameter values, which are necessary for reasonable predictions. They also fail to recognize short genes, which usually carry a weak coding signal. To avoid these problems, we developed a new algorithm for detecting protein-coding potential in prokaryotic genomes. The algorithm uses homogeneous Markov chains to model nucleotide transitions between fixed positions in codons, which reduces the order of the Markov chain while retaining information on dependencies between nucleotides over relatively long distances in the sequence. We tested the algorithm's performance as a function of learning-set size, measuring true and false positive rates for different model orders. We also compared our algorithm with the commonly used GeneMark. The presented algorithm performs better, especially for smaller learning sets.
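The abstract does not give implementation details, so the following is only a minimal sketch of one possible reading of the idea, not the authors' implementation. It assumes that each codon position gets its own homogeneous Markov chain, trained on the nucleotides found at that position in consecutive codons. Under that reading, a chain of order m on one position track reaches 3m nucleotides back in the original sequence. All function names, the pseudocount smoothing, and the log-likelihood score are hypothetical choices made for illustration. Sequences are assumed to be in-frame, uppercase, and restricted to A, C, G, T.

```python
import math
from collections import defaultdict
from itertools import product

ALPHABET = "ACGT"


def position_tracks(seq):
    """Split an in-frame sequence into three tracks, one per codon position."""
    usable = len(seq) - len(seq) % 3  # drop a trailing partial codon
    return [seq[k:usable:3] for k in range(3)]


def train_chain(tracks, order=1, pseudocount=1.0):
    """Estimate log transition probabilities of one homogeneous Markov chain.

    The pseudocount is an assumed smoothing choice, not taken from the paper.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for track in tracks:
        for i in range(order, len(track)):
            counts[track[i - order:i]][track[i]] += 1.0
    model = {}
    for ctx in ("".join(p) for p in product(ALPHABET, repeat=order)):
        total = sum(counts[ctx][b] for b in ALPHABET) + pseudocount * len(ALPHABET)
        model[ctx] = {
            b: math.log((counts[ctx][b] + pseudocount) / total) for b in ALPHABET
        }
    return model


def train_codon_models(genes, order=1):
    """Train three independent chains, one for each codon position."""
    per_position = [[], [], []]
    for gene in genes:
        for k, track in enumerate(position_tracks(gene)):
            per_position[k].append(track)
    return [train_chain(tracks, order) for tracks in per_position]


def log_likelihood(seq, models, order=1):
    """Sum the log transition probabilities over the three position tracks."""
    total = 0.0
    for track, model in zip(position_tracks(seq), models):
        for i in range(order, len(track)):
            total += model[track[i - order:i]][track[i]]
    return total
```

In practice such a score would be compared against a background (non-coding) model or a threshold to decide whether an ORF is coding. That step is omitted here because the abstract does not describe it.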

ORF; gene finding; Markov chains

Pawel Blazej, Pawel Mackiewicz, Stanislaw Cebrat

Department of Genomics, Faculty of Biotechnology, University of Wroclaw, ul. Przybyszewskiego 63/77, 51-148 Wroclaw, Poland

International conference

2011 International Conference on Bioinformatics and Computational Biology (ICBCB 2011)

Haikou

English

20-24

2011-02-22 (date first posted on the Wanfang platform; not the paper's publication date)