Conference topic

Gene Prediction in Metagenomic Fragments Based on the SVM Algorithm

Metagenomic sequencing is becoming a powerful method for exploring environmental organisms without isolation and cultivation. The genomic sequence data generated by this technology are growing explosively, while computational methods for analyzing them are still urgently needed. One of the first and most important steps is exhaustive gene prediction. Because metagenomic sequences are short, anonymous DNA fragments, their assembly usually has no fixed end point at which complete genomes are obtained and, moreover, is often not feasible. This makes annotation more complicated than for complete genomes. Here, we present a newly developed SVM-based algorithm that comprises a supervised universal model and a data-specific novel model. It uses entropy density profiles of codon usage, translation initiation signal scores and open reading frame length for model training. Tests on fixed-length artificial shotgun sequences of 700 bp showed an average sensitivity of 94.7% and an average specificity of 94.9%, indicating that our method performs better overall than the best current gene prediction methods. When applied to two metagenomic samples from the human gut community, the method predicts thousands of additional genes. Furthermore, compared with other gene predictors, our algorithm predicts the largest number of potential novel genes.
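The abstract names three kinds of training features but does not define them. As a rough illustration only, the sketch below builds a feature vector for one candidate open reading frame from a codon-usage entropy density profile plus the ORF length. The profile formula used here, s_i = -(p_i log p_i) / H with H the Shannon entropy of the codon frequencies, is an assumption and not taken from this record. The function names are hypothetical, and the translation initiation signal score and the SVM training step are left out.

```python
import math
from collections import Counter
from itertools import product

# All 64 codons in a fixed order, so every feature vector has the same layout.
CODONS = ["".join(c) for c in product("ACGT", repeat=3)]
CODON_SET = set(CODONS)


def codon_frequencies(orf):
    """Relative frequency of each of the 64 codons in an in-frame sequence."""
    orf = orf.upper()
    usable = len(orf) - len(orf) % 3  # ignore a trailing partial codon
    codons = [orf[i:i + 3] for i in range(0, usable, 3)]
    # Codons containing ambiguous bases (e.g. N) are skipped.
    counts = Counter(c for c in codons if c in CODON_SET)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no unambiguous codon")
    return [counts[c] / total for c in CODONS]


def entropy_density_profile(freqs):
    """Assumed definition: s_i = -(p_i * log p_i) / H, H = Shannon entropy."""
    terms = [-p * math.log(p) if p > 0 else 0.0 for p in freqs]
    entropy = sum(terms)
    if entropy == 0:
        # Only one distinct codon occurs, so the profile is undefined;
        # return all zeros as a placeholder.
        return [0.0] * len(freqs)
    return [t / entropy for t in terms]


def orf_features(orf):
    """64 profile values followed by the ORF length in base pairs."""
    profile = entropy_density_profile(codon_frequencies(orf))
    return profile + [float(len(orf))]


features = orf_features("ATGGCTGCTAAATAA")
```

A vector of this kind could be passed to any SVM implementation as one training or test example. The kernel, the feature scaling and the split between the universal and novel models are not described in the abstract, so they are not sketched here.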

Yongchu Liu, Jiangtao Guo, Huaiqiu Zhu

Department of Biomedical Engineering, College of Engineering, Center for Theoretical Biology, Peking University, Beijing 100871, China

International conference

2011 4th International Conference on Biomedical Engineering and Informatics (BMEI 2011)

Shanghai

English

1750-1754

2011-10-15 (date first posted online on the Wanfang platform; not the paper's publication date)