Hierarchical Clustering of Lung Cancer Related Genes
The use of data mining techniques to study lung cancer genes is still at an early stage. In this paper, hierarchical clustering was applied to lung cancer-related genes. A total of 367 lung cancer-associated gene sequences were downloaded from GenBank and Ensembl. The nucleotide content distribution of each gene sequence was first calculated in MATLAB. Each gene sequence was then represented as a point in an 84-dimensional vector space, based on its nucleotide content distribution. A similarity matrix between every pair of genes was calculated using the Pearson correlation. Finally, hierarchical clustering of the 367 gene sequences was performed using the agglomerative method. The data were divided into nine clusters by cutting the tree at a height of 0.01. Comparison with the Gene Ontology (GO) annotation (http://www.geneontology.org) shows some correlation between the clusters and the GO functional classification, which indicates a certain correlation between the base composition of a gene and its function.
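The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' MATLAB code, and several details are assumptions because the abstract does not state them: the 84 dimensions are taken to be the 4 mono-, 16 di-, and 64 trinucleotide frequencies (4 + 16 + 64 = 84), the distance is taken to be 1 minus the Pearson correlation, and average linkage is used for merging. The function names `composition_vector`, `pearson`, and `cluster` are hypothetical.

```python
from itertools import product
from math import sqrt

BASES = "ACGT"
# Assumed feature set: 4 mono- + 16 di- + 64 trinucleotide frequencies = 84.
KMERS = ["".join(p) for k in (1, 2, 3) for p in product(BASES, repeat=k)]


def composition_vector(seq):
    """Return the k-mer frequencies (k = 1, 2, 3) of a DNA sequence."""
    seq = seq.upper()
    vec = []
    for k in (1, 2, 3):
        windows = [seq[i:i + k] for i in range(len(seq) - k + 1)]
        # Skip windows containing ambiguous symbols such as N.
        valid = [w for w in windows if set(w) <= set(BASES)]
        total = len(valid) or 1
        counts = {}
        for w in valid:
            counts[w] = counts.get(w, 0) + 1
        vec.extend(counts.get("".join(p), 0) / total
                   for p in product(BASES, repeat=k))
    return vec


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def cluster(vectors, height):
    """Average-linkage agglomerative clustering, stopped at a cut height."""
    n = len(vectors)
    # Assumed distance: 1 - Pearson correlation.
    dist = [[1.0 - pearson(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]
    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        d, i, j = min(
            (linkage(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if d > height:
            break  # every remaining merge lies above the cut height
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy sequences standing in for the 367 downloaded genes.
seqs = ["ATATATATATATATAT", "TATATATATATATATA",
        "GCGCGCGCGCGCGCGC", "CGCGCGCGCGCGCGCG"]
groups = cluster([composition_vector(s) for s in seqs], height=0.01)
print(groups)
```

Whether the reported height of 0.01 refers to this 1 minus correlation scale is also an assumption. With a different linkage rule or distance definition, the same cut height would generally yield a different number of clusters.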
hierarchical cluster; lung cancer gene; gene ontology; function
Yu Wei, Zhang Huajia, Wu Kuanheng, Lin Qiangqian, He Miao
Life Science School, Sun Yat-sen University, Guangzhou 510275, P.R. China
International conference
Shanghai
English
63-65
2008-05-16 (date first posted online on the Wanfang platform; not the publication date of the paper)