The Research of Missing Value Estimation of Gene Sequence Based on Improved KNN
Gene-based data mining has received growing attention because genes carry the genetic information of living organisms. When mining gene data, one task is to estimate missing values reasonably and effectively, so that the result reflects the original information in the gene sequence. Based on an analysis of the KNN (k-nearest neighbor) algorithm, an improved KNN for gene sequences is proposed to address the problem of missing values in gene data mining. Experiments using data from GenBank show that the algorithm is feasible.
Gene sequence; Missing values; KNN
Cai Qing; Wu Qingfeng; Dong Huailin; Liu Han
Software School, Xiamen University, Xiamen, Fujian Province 361005, China
International conference
2009 4th International Conference on Computer Science & Education
Nanjing
English
1235-1238
2009-07-25 (date first posted online on the Wanfang platform; not the paper's publication date)