Conference Topic

Prediction of Mucin-type O-glycosylation by Support Vector Machines

Mucin-type O-glycosylation is one of the main types of mammalian protein glycosylation. It is specific to serine (Ser) or threonine (Thr) residues, though no consensus sequence is yet known. In this report, support vector machines (SVMs) are used to predict O-glycosylation at each Ser or Thr site in a protein sequence. Ninety-nine mammalian protein sequences were selected from UniProt 8.0. A protein subsequence of a given length, with a Ser or Thr site at its center, is used as input to the SVM after being encoded in one of three ways: sparse encoding, 5-letter encoding, or multiple encoding, which combines the sparse and 5-letter encodings. The prediction experiments show that multiple encoding is the most effective. Effective prediction requires detailed information on the amino acid residues nearest the target site, together with coarser information on the biochemical characteristics of the residues within approximately the 15th nearest neighbors of the target site. In addition, the ratio of positive to negative data used for training is observed to affect performance.
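As an illustration of the input preparation described in the abstract, the following minimal Python sketch extracts a window centered on each Ser/Thr site and applies a sparse (one-hot) encoding. It is not the authors' implementation: the function names, the padding symbol, the alphabet ordering, and the window half-width are all assumptions made for this example, and the 5-letter grouping is omitted because the abstract does not specify it.

```python
# Assumption: the 20 standard amino acids in alphabetical one-letter order.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
# Assumption: hypothetical padding symbol for positions beyond the sequence ends.
PAD = "-"


def site_windows(sequence, half_width):
    """Yield (position, window) for every Ser (S) or Thr (T) site.

    Each window has length 2 * half_width + 1 with the site at its center;
    the sequence is padded so that sites near the ends still get full windows.
    """
    padded = PAD * half_width + sequence + PAD * half_width
    for i, residue in enumerate(sequence):
        if residue in "ST":
            yield i, padded[i:i + 2 * half_width + 1]


def sparse_encode(window):
    """One-hot encode each residue as a 20-element block.

    Padding and any non-standard symbol map to an all-zero block.
    """
    vector = []
    for residue in window:
        block = [0] * len(AMINO_ACIDS)
        index = AMINO_ACIDS.find(residue)
        if index >= 0:
            block[index] = 1
        vector.extend(block)
    return vector


if __name__ == "__main__":
    # Toy sequence, not taken from the paper's data set.
    for position, window in site_windows("AASTA", half_width=2):
        print(position, window, sum(sparse_encode(window)))
```

Each resulting vector, paired with a glycosylated/non-glycosylated label, could then be passed to any SVM implementation. The half-width here is kept small for readability; the abstract suggests that information out to roughly the 15th neighbor is useful.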

Ikuko Nishikawa, Hirotaka Sakamoto, Ikue Nouno, Kazutoshi Sakakibara, Masahiro Ito

Ritsumeikan University, 1-1-1 Noji-higashi, Kusatsu, 525-8577 Japan

International conference

2007 IEEE/ICME International Conference on Complex Medical Engineering - CME2007 (CME2007, the 2nd International Academic Conference on Complex Medical Engineering)

Beijing

English

1901-1905

2007-05-23 (date first made available online on the Wanfang platform; this is not the paper's publication date)