Characteristic Genes Selection Research in Colon Cancer Gene Expression Profile
In recent years, tumor gene expression profiling has offered an entirely new, systematic research method for oncology and has attracted wide attention in both basic research and clinical applications. With the large-scale growth of gene expression data, how to analyze expression profiles effectively and how to mine the information hidden in them has become a key research topic in bioinformatics. This paper focuses on selecting, from the whole gene expression profile, the characteristic genes that can determine whether a sample is colon cancer. We propose an algorithm named SVM-RFE-k to select characteristic genes from thousands of genes. Drawing on biological knowledge and gene selection methods, we use the Bhattacharyya distance of each gene to measure the amount of information it carries, which lets us remove irrelevant genes. On this basis, we apply the SVM-RFE-k gene selection algorithm to find the characteristic genes among those that remain. In the end we obtain 4 genes that identify colon cancer with an accuracy of 92% on the test set. Compared with the existing Relief algorithm, which has an accuracy of 83%, the SVM-RFE-k method has an advantage in identifying colon cancer.
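The filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the Bhattacharyya distance is estimated, so this sketch assumes each gene's expression in the tumor and normal classes is modeled as a univariate Gaussian. The function names (`bhattacharyya_distance`, `top_genes`) and the toy data are hypothetical, and the subsequent SVM-RFE-k step is not shown.

```python
import math
from statistics import mean, pvariance


def bhattacharyya_distance(tumor, normal, eps=1e-12):
    """Bhattacharyya distance between two samples, each modeled
    as a univariate Gaussian (an assumption of this sketch)."""
    m1, m2 = mean(tumor), mean(normal)
    # eps keeps the variances strictly positive for constant genes
    v1 = pvariance(tumor) + eps
    v2 = pvariance(normal) + eps
    mean_term = 0.25 * (m1 - m2) ** 2 / (v1 + v2)
    var_term = 0.5 * math.log((v1 + v2) / (2.0 * math.sqrt(v1 * v2)))
    return mean_term + var_term


def top_genes(expression, labels, n_keep):
    """Rank genes by class separation and keep the n_keep best.

    expression: dict mapping gene name -> list of values, one per sample
    labels:     list with 1 for tumor samples and 0 for normal samples
    """
    scores = {}
    for gene, values in expression.items():
        tumor = [v for v, y in zip(values, labels) if y == 1]
        normal = [v for v, y in zip(values, labels) if y == 0]
        scores[gene] = bhattacharyya_distance(tumor, normal)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_keep], scores


# Toy data (hypothetical): one gene separates the classes, one does not.
expression = {
    "g_informative": [5.0, 5.2, 4.8, 1.0, 1.1, 0.9],
    "g_noise": [2.0, 3.0, 1.0, 2.0, 3.0, 1.0],
}
labels = [1, 1, 1, 0, 0, 0]
kept, scores = top_genes(expression, labels, n_keep=1)
```

Genes with a distance near zero have nearly identical distributions in both classes and are candidates for removal before a wrapper method such as SVM-RFE-k is applied to the remainder.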
DNA microarray technology; SVM-RFE-k; Bhattacharyya distance; characteristic genome
Yisheng Jin, Bei Jin, Jinkui Xie, Zongyuan Yang
Department of Computer Science & Technology, East China Normal University, Shanghai, P.R. China
International conference
2011 International Conference on Database and Data Mining (ICDDM 2011)
Sanya
English
186-190
2011-03-25 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)