Subcellular Locations Prediction of Proteins Based on Chaos Game Representation
Knowing the subcellular location of a protein helps in understanding its function. With newly determined protein sequences accumulating rapidly in databanks, a fast computational method for predicting the subcellular location of proteins is worth developing. In this paper, we consider four subcellular locations of rice proteins: chloroplast, cytoplasm, integral membrane, and nucleus. Our data set consists of proteins with known locations from the SWISS-PROT and TrEMBL databases. We use the Chaos Game Representation (CGR) of a protein, instead of the quasi-amino acid composition, to transform the protein sequence into a numerical vector. We then append two further dimensions derived from the physicochemical properties of the amino acids. The results show that the Chaos Game Representation performs better than the amino acid composition, and that the new features clearly improve prediction accuracy.
subcellular locations; chaos game representation; SVM; instability index; active residue
Li Nana, Niu Xiaohui, Shi Feng, Hu Xuehai
College of Science, Huazhong Agricultural University, Wuhan, PR China
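International conference
Beijing
English
1-4
2009-06-11 (date first made available online on the Wanfang platform; not the publication date of the paper)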
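The abstract describes turning a protein sequence into a numerical vector via the Chaos Game Representation but does not spell out the construction. The following is a minimal sketch of one common way to do this, not the authors' method. It assumes one vertex per amino acid placed evenly on the unit circle, a contraction ratio of 0.5, and a 4 x 4 grid of point frequencies as the feature vector. The names `cgr_points` and `cgr_feature_vector` and all parameter values are hypothetical, chosen only for illustration.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Assumption: one vertex per amino acid, evenly spaced on the unit circle.
# The paper may use a different vertex layout.
VERTICES = {
    aa: (math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20))
    for i, aa in enumerate(AMINO_ACIDS)
}


def cgr_points(sequence, ratio=0.5):
    """Return the CGR trajectory of a protein sequence.

    Starting from the origin, each step moves `ratio` of the way from the
    current point toward the vertex of the next residue.
    """
    x, y = 0.0, 0.0
    points = []
    for aa in sequence.upper():
        if aa not in VERTICES:
            continue  # skip non-standard or ambiguous residues such as X
        vx, vy = VERTICES[aa]
        x += ratio * (vx - x)
        y += ratio * (vy - y)
        points.append((x, y))
    return points


def cgr_feature_vector(sequence, grid=4, ratio=0.5):
    """Count trajectory points in a grid x grid partition of [-1, 1]^2
    and normalise the counts to frequencies."""
    counts = [0] * (grid * grid)
    points = cgr_points(sequence, ratio)
    for x, y in points:
        # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
        col = min(int((x + 1) / 2 * grid), grid - 1)
        row = min(int((y + 1) / 2 * grid), grid - 1)
        counts[row * grid + col] += 1
    total = len(points)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]
```

A vector produced this way has fixed length regardless of sequence length, so it could be extended with extra dimensions (the abstract mentions two, based on physicochemical properties) and passed to a classifier such as the SVM listed in the keywords.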