Conference topic

Clustering of SNP Data based on SCLIQUE

SNP clustering is an indispensable exploratory tool for biology researchers: it can identify co-expressed or co-regulated genes and predict the functions of unknown genes from the known genes that fall in the same cluster. The CLIQUE clustering algorithm is an effective way to solve high-dimensional clustering problems, but it is not applicable to categorical data. Single nucleotide polymorphisms (SNPs) are single base pair positions in genomic DNA at which different sequence alternatives (alleles) exist in normal individuals in some population(s). SNP data consist of genotype values, which are categorical. In this paper, we adapt the CLIQUE algorithm to SNP clustering in three respects: redefining the grid division, redefining the common face between two units, and redefining the rules for generating high-dimensional candidate dense units. Experiments show that the proposed algorithm, SCLIQUE, not only retains the advantages of the CLIQUE algorithm but also extends CLIQUE clustering from numerical space to categorical space.
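The abstract does not give the details of SCLIQUE, so the following is only a minimal sketch of how a CLIQUE-style bottom-up search for dense units might look on categorical genotype data. It assumes that each distinct genotype value in a dimension forms its own grid cell and that candidates are generated Apriori-style; the function name `dense_units`, the threshold `tau`, and the data layout are hypothetical and are not taken from the paper. The paper's redefinition of the common face between units is not modelled here.

```python
from collections import Counter
from itertools import combinations


def dense_units(rows, tau):
    """Return dense units level by level (1-dim, 2-dim, ...).

    rows: list of tuples of genotype labels, one tuple per sample.
    tau:  minimum fraction of samples a unit must contain to be dense.
    A unit is a frozenset of (dimension_index, value) cells with at
    most one cell per dimension. (Assumed representation, not the paper's.)
    """
    min_count = tau * len(rows)

    # Level 1: every (dimension, value) pair is a cell of the grid.
    counts = Counter((d, v) for row in rows for d, v in enumerate(row))
    current = {frozenset([cell]) for cell, c in counts.items() if c >= min_count}

    levels = []
    k = 1
    while current:
        levels.append(current)
        candidates = set()
        # Join two k-dim dense units into a (k+1)-dim candidate.
        for a, b in combinations(current, 2):
            unit = a | b
            distinct_dims = {d for d, _ in unit}
            if len(unit) == k + 1 and len(distinct_dims) == k + 1:
                # Prune: every k-dim projection must itself be dense.
                if all(unit - {cell} in current for cell in unit):
                    candidates.add(unit)
        # Keep only candidates that contain enough samples.
        current = {
            unit
            for unit in candidates
            if sum(all(row[d] == v for d, v in unit) for row in rows) >= min_count
        }
        k += 1
    return levels


# Toy genotype matrix: 4 samples, 3 SNP loci (illustrative values only).
samples = [
    ("AA", "AG", "GG"),
    ("AA", "AG", "GG"),
    ("AA", "AG", "CC"),
    ("AG", "GG", "CC"),
]
for level, units in enumerate(dense_units(samples, 0.5), start=1):
    print(level, sorted(sorted(u) for u in units))
```

Treating each genotype value as a cell avoids any interval partitioning, which has no meaning for unordered categories; how SCLIQUE then connects dense units into clusters is left to the paper itself.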

SNP clustering; high-dimensional clustering; SCLIQUE algorithm; categorical data

Min Jia, Yue Wu, Zhou Lei, Zongtian Liu

Computer Engineering and Science Shanghai, China

International conference

2011 International Conference on Computer Science and Network Technology (ICCSNT 2011)

Harbin

English

2359-2363

2011-12-24 (date first made available online on the Wanfang platform; not the paper's publication date)