Machine Learning Approach for ab initio Prediction of microRNA Precursors
Although comparative-genomics-based methods provide important techniques for predicting new miRNAs, they cannot identify novel miRNAs that have no known close homologs. Almost all pre-miRNAs share a characteristic stem-loop hairpin structure, so these hairpins give key clues for the ab initio prediction of pre-miRNAs. However, a large number of pre-miRNA-like hairpins can be folded from many genomes, which makes it challenging to distinguish real pre-miRNAs from other hairpin sequences with similar stem-loops (pseudo pre-miRNAs). In this paper, we propose a novel machine learning method based on random forest to make this distinction. Coupled with a hybrid feature set consisting of local contiguous structure-sequence composition, the minimum free energy (MFE) of the secondary structure, and the p-value of a randomization test, the prediction model achieves 98.21% specificity and 95.09% sensitivity.
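The feature and evaluation terms in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "local contiguous structure-sequence composition" means the frequency of each middle nucleotide combined with the paired/unpaired state of three adjacent positions (32 combinations), and that the secondary structure is already available in dot-bracket notation from an external folding tool. The names `triplet_features` and `sensitivity_specificity` are hypothetical. The MFE and the randomization-test p-value would be appended to this vector before training a random forest; they are not computed here.

```python
from itertools import product

NUCLEOTIDES = "ACGU"
# 8 local structure patterns over 3 adjacent positions: "(" = paired, "." = unpaired
STRUCT_TRIPLETS = ["".join(p) for p in product("(.", repeat=3)]
# 4 nucleotides x 8 patterns = 32 structure-sequence features (assumed encoding)
FEATURE_NAMES = [n + s for n in NUCLEOTIDES for s in STRUCT_TRIPLETS]


def triplet_features(seq, dot_bracket):
    """Return 32 normalized frequencies of (middle nucleotide, local structure)."""
    seq = seq.upper().replace("T", "U")
    if len(seq) != len(dot_bracket):
        raise ValueError("sequence and structure must have the same length")
    # Both bracket directions mean "paired", so collapse ")" into "("
    struct = dot_bracket.replace(")", "(")
    counts = dict.fromkeys(FEATURE_NAMES, 0)
    for i in range(1, len(seq) - 1):
        key = seq[i] + struct[i - 1:i + 2]
        if key in counts:  # skip windows with ambiguous bases such as N
            counts[key] += 1
    total = sum(counts.values())
    return [counts[name] / total if total else 0.0 for name in FEATURE_NAMES]


def sensitivity_specificity(y_true, y_pred):
    """Labels: 1 = real pre-miRNA, 0 = pseudo pre-miRNA.

    Assumes both classes occur in y_true (otherwise a division by zero occurs).
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)
```

Under these assumptions, sensitivity is the fraction of real pre-miRNAs that are recovered and specificity is the fraction of pseudo hairpins that are correctly rejected, which is how the 95.09% and 98.21% figures would be read.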
real/pseudo pre-miRNAs; classification; random forest
Peng Jiang, Wenkai Wang, Fei Sang, Jing Tong, Zuhong Lu
State Key Laboratory of Bioelectronics, Department of Biological Science and Medical Engineering, Sout
International conference
The 5th International Forum on Post-genome Technologies (5IFPT)
Suzhou
English
2007-09-10 (date first posted online on the Wanfang platform; not the paper's publication date)