Long sequence feature extraction based on deep learning neural network for protein secondary structure prediction
In this paper, a long sequence feature extraction (LSFE) method is proposed for protein secondary structure prediction. The proposed method is based on a deep learning architecture composed mainly of three layers: a sparse auto-encoder, a convolutional feature extraction layer, and a softmax classifier. The position-specific scoring matrix (PSSM) is used as the raw sequence representation. Two groups of self-taught feature filters are learned from 5-polypeptides and 13-polypeptides by the sparse auto-encoder layer. Finally, the new representations of 35-polypeptides produced by the convolution layer are fed into the softmax classifier, which serves as the top shallow classifier, for fast prediction. The experimental results indicate that an overall accuracy (Q3) of around 74% on 25PDB is obtained with a very short waiting time. Hence, this deep learning architecture overcomes the upper bound on window size in the state-of-the-art SVM+PSSM classifier and shows potential for future work on larger datasets.
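The data flow described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The filter weights below are random stand-ins for filters that a trained sparse auto-encoder would supply. The filter count (8 per group), the sigmoid activation, the mean pooling over positions, and the function names `sliding_windows` and `convolve_features` are all assumptions made for illustration; the abstract specifies only the window sizes (5, 13, 35), the PSSM input, and the three-class softmax output.

```python
import numpy as np

def sliding_windows(pssm, width):
    """Return every contiguous window of `width` residues, flattened to width*20 values."""
    n = pssm.shape[0]
    return np.stack([pssm[i:i + width].ravel() for i in range(n - width + 1)])

def convolve_features(window, filters, bias, k):
    """Apply k-residue filters at each position of a longer window, then mean-pool.

    Mean pooling and the sigmoid activation are assumptions, not taken from the abstract.
    """
    patches = sliding_windows(window, k)                   # (L - k + 1, k * 20)
    activations = 1.0 / (1.0 + np.exp(-(patches @ filters.T + bias)))
    return activations.mean(axis=0)                        # one value per filter

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
pssm = rng.normal(size=(100, 20))      # placeholder PSSM: 100 residues x 20 scores

# Random stand-ins for the two filter groups learned by the sparse auto-encoder.
W5, b5 = rng.normal(scale=0.1, size=(8, 5 * 20)), np.zeros(8)
W13, b13 = rng.normal(scale=0.1, size=(8, 13 * 20)), np.zeros(8)

window35 = pssm[0:35]                  # one 35-residue window
features = np.concatenate([
    convolve_features(window35, W5, b5, 5),
    convolve_features(window35, W13, b13, 13),
])

# Untrained softmax layer over the three secondary structure classes (Q3).
Wc = rng.normal(scale=0.1, size=(3, features.size))
probs = softmax(Wc @ features)
```

In this sketch each 35-residue window yields a 16-value feature vector (8 from each filter group), which the softmax layer maps to three class probabilities.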
Sparse auto-encoder; Convolutional neural network; Self-taught learning; Feature extraction; Protein secondary structure prediction; Softmax classifier
Yehong Chen
School of Printing & Packaging, Qilu University of Technology, Jinan, China
International conference
Chongqing
English
843-847
2017-10-03 (date first posted online on the Wanfang platform; not the paper's publication date)