Semi-supervised protein subcellular localization
Background: Protein subcellular localization is concerned with predicting the location of a protein within a cell using computational methods. Location information can indicate key functions of proteins. Accurate prediction of protein subcellular localization can aid the prediction of protein function and genome annotation, as well as the identification of drug targets. Computational methods based on machine learning, such as support vector machine (SVM) approaches, have already been widely used to predict protein subcellular localization. A major drawback of these approaches, however, is that a large amount of data must be labeled before the prediction system can learn a classifier with good generalization ability. In real-world cases, it is laborious, expensive and time-consuming to determine the subcellular localization of a protein experimentally and to prepare labeled instances. Results: In this paper, we present an approach based on a new learning framework, semi-supervised learning, which can use far fewer labeled instances to construct a high-quality prediction model. We first construct an initial classifier using a small set of labeled examples, and then use unlabeled instances to refine the classifier for future predictions. Conclusions: Experimental results show that our methods can effectively reduce the labeling workload by exploiting unlabeled data. Our method is shown to improve on the state-of-the-art prediction results of SVM classifiers by more than 10%.
Qian XU Derek Hao HU Hong XUE Weichuan YU Qiang YANG
Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon; Department of Computer Science and Engineering, Hong Kong University of Science and Technology, Clea; Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon; Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon; Program of Bioengineering, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon
International conference
The 7th Asia-Pacific Bioinformatics Conference
Beijing
English
537-547
2009-01-01 (date first made available online on the Wanfang platform; not necessarily the paper's publication date)
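
The two-step procedure described in the abstract (train an initial classifier on a few labeled examples, then refine it with unlabeled instances) can be illustrated with a minimal self-training sketch. This is not the authors' method: the abstract does not specify the semi-supervised algorithm, and a nearest-centroid classifier is swapped in here for the SVM so that the example needs only the Python standard library. The function names, the two-dimensional toy features, and the `min_margin` threshold are all hypothetical.

```python
import math

def centroids(X, y):
    """Mean feature vector per class label."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[label] = counts.get(label, 0) + 1
    return {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict_with_margin(cents, x):
    """Return (nearest label, gap between the two smallest centroid distances)."""
    dists = sorted((math.dist(x, c), label) for label, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else float("inf")
    return dists[0][1], margin

def self_train(X_lab, y_lab, X_unlab, min_margin=1.0, max_rounds=10):
    """Self-training: repeatedly pseudo-label confident unlabeled points and refit."""
    X, y = list(X_lab), list(y_lab)
    pool = list(X_unlab)
    cents = centroids(X, y)          # initial classifier from labeled data only
    for _ in range(max_rounds):
        confident, rest = [], []
        for x in pool:
            label, margin = predict_with_margin(cents, x)
            (confident if margin >= min_margin else rest).append((x, label))
        if not confident:            # nothing left that the model is sure about
            break
        for x, label in confident:   # add pseudo-labeled points to the training set
            X.append(x)
            y.append(label)
        pool = [x for x, _ in rest]
        cents = centroids(X, y)      # refine the classifier
    return cents

# Toy data: made-up 2-D features standing in for real sequence-derived features.
X_lab = [[0.0, 0.0], [4.0, 4.0]]
y_lab = ["cytoplasm", "nucleus"]
X_unlab = [[0.5, 0.2], [3.8, 4.1], [1.0, 0.5], [3.5, 3.9], [2.0, 2.0]]

model = self_train(X_lab, y_lab, X_unlab)
pred_a, _ = predict_with_margin(model, [0.3, 0.3])
pred_b, _ = predict_with_margin(model, [3.9, 3.7])
```

In this sketch, the ambiguous point `[2.0, 2.0]` is never pseudo-labeled because its margin stays below the threshold; only confidently classified unlabeled points influence the refined centroids.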