Intron Identification Approaches Based on Weighted Features and Fuzzy Decision Trees
Current computational predictions of splice sites depend largely on the sequence patterns of known intronic sequence features (ISFs) described in the classical intron definition model (IDM). The computation-oriented IDM (COIDM) provides more specific and concrete information for describing the intron flanks of splice sites (IFSSs). In this paper, we propose a novel fuzzy decision tree (FDT) approach that uses 1) weighted ISFs drawn from twelve uni-frame patterns (UFPs) and forty-five multi-frame patterns (MFPs) and 2) gain ratios to improve intron identification performance. First, we fuzzify the features extracted from genomic sequences using membership functions derived with an unsupervised self-organizing map (SOM) technique. Then, we apply two complementary strategies, global weighting and cross-referencing, when generating fuzzy rules; the resulting rules are interpretable and help biologists verify whether a sequence is an intron. Finally, the experimental results show that the proposed method improves identification accuracy. In addition, we implemented an online intron identifier for analyzing unknown genomic sequences.
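As a rough illustration of how a gain ratio can be computed over fuzzified features, the following minimal sketch may help. It is not the authors' implementation: the triangular membership function stands in for the SOM-derived membership functions mentioned in the abstract, and the names `triangular`, `entropy`, and `fuzzy_gain_ratio` are hypothetical. Membership degrees are used as fractional example counts, which is one common way to extend the gain ratio to fuzzy partitions.

```python
import math


def triangular(x, a, b, c):
    """Triangular membership: 0 at a and c, rising to 1 at b (assumed shape)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def entropy(weights):
    """Shannon entropy (bits) of a distribution given as non-negative masses."""
    total = sum(weights)
    if total == 0:
        return 0.0
    return -sum((w / total) * math.log2(w / total) for w in weights if w > 0)


def fuzzy_gain_ratio(memberships, labels):
    """Gain ratio of one fuzzified feature.

    memberships: one row per example, one membership degree per fuzzy term.
    labels: class label of each example (e.g. "intron" / "other").
    """
    classes = sorted(set(labels))
    n_terms = len(memberships[0])

    # Class masses before the split: each example counts by its total membership.
    class_mass = [
        sum(sum(row) for row, y in zip(memberships, labels) if y == c)
        for c in classes
    ]
    grand = sum(class_mass)
    if grand == 0:
        return 0.0

    term_mass = []
    conditional = 0.0
    for t in range(n_terms):
        # Class masses inside fuzzy term t.
        by_class = [
            sum(row[t] for row, y in zip(memberships, labels) if y == c)
            for c in classes
        ]
        mass = sum(by_class)
        term_mass.append(mass)
        conditional += (mass / grand) * entropy(by_class)

    gain = entropy(class_mass) - conditional
    split_info = entropy(term_mass)
    return gain / split_info if split_info > 0 else 0.0
```

In a tree builder, the feature with the highest gain ratio would typically be chosen for the next split; how the paper combines this with feature weights is not specified in the abstract.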
Yin-Fu Huang, Ching Ping Liang, Sing-Wu Liou
Graduate School of Computer Science and Information Engineering, National Yunlin University of Scienc; Graduate School of Engineering Science and Technology, National Yunlin University of Science and Tech
International conference
Chengdu
English
1-4
2010-06-18 (date first made available online on the Wanfang platform; not the paper's publication date)