Conference special topic

MISSING VALUE ESTIMATION FOR GENE EXPRESSION DATA USING GRNN

Gene expression datasets often contain missing values. Effective estimation methods are required because missing values affect downstream analyses such as hierarchical clustering and K-means clustering. Traditionally, two simple methods have been used: replacing all missing values with zeros, or replacing them with row averages. Both are deficient because they do not take advantage of the rich information provided by the expression patterns of other genes. More recently, methods based on K-nearest neighbors (KNN) and singular value decomposition (SVD) have been proposed, but few papers have addressed the problem with neural networks. In this paper, we propose a method that uses a general regression neural network (GRNN) for missing value estimation. The method consists of two steps: (1) selection of genes for estimation, where the L2-norm is used to choose the genes used for estimation; (2) recovery of the missing data using the GRNN, which is trained on the chosen genes. Once the network is constructed, the existing data in the genes to be recovered are used as inputs and the missing values are obtained as outputs. In all comparative studies against the traditional methods, the proposed method achieved the lowest Normalized Root Mean Squared Error (NRMSE), which demonstrates the feasibility of using GRNN for missing value estimation.
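The two steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`grnn_predict`, `impute_gene`, `nrmse`), the number of selected genes `k`, and the smoothing parameter `sigma` are hypothetical, candidate genes are assumed to be complete (no missing values), the GRNN is written in its standard kernel-weighted-average form, and the NRMSE is normalized by the standard deviation of the true values, which is one common convention that the abstract does not specify.

```python
import numpy as np


def grnn_predict(train_x, train_y, query, sigma=1.0):
    """GRNN output: Gaussian-kernel weighted average of the training targets."""
    d2 = np.sum((train_x - query) ** 2, axis=1)
    # Subtracting the minimum distance avoids underflow; it cancels in the ratio.
    w = np.exp(-(d2 - d2.min()) / (2.0 * sigma ** 2))
    return float(w @ train_y / w.sum())


def impute_gene(data, target, k=3, sigma=1.0):
    """Fill the NaN entries in row `target` of `data` (genes x experiments)."""
    row = data[target]
    missing = np.isnan(row)
    observed = ~missing

    # Step 1 (assumed form): keep the k complete genes closest to the target
    # gene in L2-norm, measured over the target's observed experiments.
    cand = np.array([i for i in range(data.shape[0])
                     if i != target and not np.isnan(data[i]).any()])
    dist = np.linalg.norm(data[cand][:, observed] - row[observed], axis=1)
    chosen = cand[np.argsort(dist)[:k]]

    # Step 2: each chosen gene is one training pattern; its values at the
    # observed experiments are the input, its value at the missing
    # experiment is the target output.
    train_x = data[chosen][:, observed]
    filled = row.copy()
    for j in np.flatnonzero(missing):
        filled[j] = grnn_predict(train_x, data[chosen, j], row[observed], sigma)
    return filled


def nrmse(estimated, true):
    """RMSE divided by the standard deviation of the true values (assumed)."""
    return float(np.sqrt(np.mean((estimated - true) ** 2)) / np.std(true))
```

Calling `impute_gene(data, target=i)` on a NumPy array with `np.nan` marking the missing entries returns a copy of row `i` with those entries estimated. A smaller `sigma` makes the estimate follow the nearest selected gene more closely, while a larger `sigma` moves it toward the plain average of the selected genes.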

Missing data estimation; GRNN; gene expression data

Hui Yi, Xiaofeng Song

Department of Biomedical Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China

International conference

The 4th International Forum on Post-genome Technologies (4IFPT) (第四届国际后基因组生命科学技术学术论坛)

Hangzhou

English

375-378

2006-09-25 (date first posted online on the Wanfang platform; not the paper's publication date)