Artificial Neural Networks and Support Vector Machine Identify Alu Elements as Being Associated with Human Housekeeping Genes
The human genome contains 7SL RNA- and tRNA-derived short interspersed nuclear elements (SINEs), the most common of which are the Alu elements. Alu elements, other SINEs, and processed pseudogenes are all processed by the same retrotransposition machinery. Most housekeeping genes contain multiple copies of processed pseudogenes. The present study showed that the mean percentage of SINEs in the sequences of housekeeping genes was significantly higher than in neuron-specific (p < 0.001) and myocyte-specific genes (p < 0.01). Consistently, GEP, RBF, MLP, PNN, and SVM showed that SINEs were the most important factor associated with housekeeping genes, with a SINE content above 19.54% being most predictive. Based on the area under the receiver operating characteristic curves, performance did not differ significantly among these classifiers. Detailed analysis of the components of SINEs showed that housekeeping genes contained more Alus than neuron- and myocyte-specific genes (p < 0.001), a finding supported by all neural networks and the SVM.
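As a rough illustration of the evaluation described in the abstract, the sketch below applies the reported 19.54% SINE cutoff as a one-feature rule and computes the area under the ROC curve for SINE percentage alone. It is not the study's method: the function names are hypothetical, the gene values are invented for illustration and are not data from the paper, and the AUC is computed with the plain pairwise (Mann-Whitney) definition rather than with any of the classifiers named above.

```python
from typing import Sequence

# Cutoff reported in the abstract (percent of gene sequence made up of SINEs).
SINE_THRESHOLD = 19.54


def predict_housekeeping(sine_percent: float,
                         threshold: float = SINE_THRESHOLD) -> bool:
    """Hypothetical one-feature rule: call a gene 'housekeeping' above the cutoff."""
    return sine_percent > threshold


def auc_single_feature(pos: Sequence[float], neg: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive has the higher score; ties count as half a win."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented SINE percentages, NOT values from the study.
housekeeping = [25.0, 21.3, 18.0, 30.2]
tissue_specific = [10.5, 19.0, 22.0, 8.7]

auc = auc_single_feature(housekeeping, tissue_specific)
calls = [predict_housekeeping(x) for x in housekeeping]
print(f"AUC for SINE percentage alone: {auc:.4f}")
print(f"Housekeeping genes above cutoff: {sum(calls)} of {len(calls)}")
```

An AUC of 0.5 would mean SINE percentage carries no information about gene class, and 1.0 would mean perfect separation. Comparing several classifiers, as the abstract describes, would require computing this quantity for each model's scores on the same genes.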
genome; neuron; myocyte; interspersed element; GEP; RBF; MLP; PNN; SVM; decision tree
Permphan Dharmasaroja
Department of Anatomy, Faculty of Science, Mahidol University, Bangkok, Thailand
International conference
Shanghai
English
1676-1680
2011-10-15 (date first posted online on the Wanfang platform; not the publication date of the paper)