Feature Genes Selection for Colon Tumor Based on Association Analysis
Gene expression profile data are characterized by small sample sizes and high dimensionality, and they contain a large number of redundant genes, which reduce the accuracy of cancer diagnosis and increase computational complexity. Hence, it is necessary to identify the feature genes. In this paper, we propose the FGSAM algorithm (Feature Genes Selection Algorithm based on pointwise Mutual information) for selecting the feature genes of colon tumor. First, we use information gain to obtain a set of candidate feature genes. Then, based on pointwise mutual information, we calculate the association grade of gene pairs and use it to filter the candidate set. Finally, we validate the selected feature genes with three classification methods: Bayesian Network, the C4.5 algorithm, and Decision Table. In our experiments, the algorithm filters 21 redundant genes out of the candidate set, which greatly reduces the number of feature genes for colon tumor. Meanwhile, the final classification accuracy reaches 85.48%, which indicates that the proposed method is feasible and effective.
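The two scoring quantities named in the abstract can be sketched as follows. This is a minimal illustration, not the FGSAM implementation: it assumes expression values have already been discretized into labels such as "high" and "low", and the function names, the toy data, and the choice of base-2 logarithms are all assumptions made for the example. The record does not state how the paper turns pointwise mutual information into an association grade or which thresholds it applies, so the sketch only computes the raw quantities.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Entropy of the class labels minus their entropy conditioned on the feature."""
    n = len(labels)
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def pmi(xs, ys, x, y):
    """Pointwise mutual information log2(p(x, y) / (p(x) * p(y))) from sample counts."""
    n = len(xs)
    p_x = sum(1 for v in xs if v == x) / n
    p_y = sum(1 for v in ys if v == y) / n
    p_xy = sum(1 for a, b in zip(xs, ys) if a == x and b == y) / n
    if p_xy == 0:
        return float("-inf")  # the two events never co-occur in the sample
    return math.log2(p_xy / (p_x * p_y))


# Hypothetical toy data: four samples, three discretized genes.
labels = ["tumor", "tumor", "normal", "normal"]
gene_a = ["high", "high", "low", "low"]
gene_b = ["high", "high", "low", "low"]   # duplicates gene_a
gene_c = ["high", "low", "high", "low"]   # unrelated to the class
```

In this toy data, gene_a separates the classes perfectly and gene_c carries no class information, so information gain would keep gene_a as a candidate and drop gene_c. The "high" states of gene_a and gene_b always co-occur, so their pointwise mutual information is positive, which is the kind of signal that could mark one gene of the pair as redundant.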
Feature Genes; Feature Selection; Association Analysis; Pointwise Mutual Information
Qin Xiangqing; Hong Chun; Huang Cuili
College of Computer Science, Sichuan University, Chengdu, China; Department of Mathematics, Sichuan University, Chengdu, China
International conference
2010 International Conference on Future Information Technology (ICFIT 2010)
Changsha
English
704-708
2010-12-14 (date first made available online on the Wanfang platform; not the publication date of the paper)