An Efficient Subsequences Mining Algorithm
As a step forward to analyzing patterns in sequences, we introduce the problem of mining closed repetitive gapped subsequences and propose efficient solutions. Given a database of sequences where each sequence is an ordered list of events, the pattern we would like to mine is called repetitive gapped subsequence. Different from the sequential pattern mining problem, repetitive support captures not only repetitions of a pattern in different sequences but also the repetitions within a sequence. Given a users-specified support threshold min_sup, we study finding the set of all patterns with repetitive support no less than min _sup. To obtain a compact yet complete result set and improve the efficiency, we also study finding closed patterns. Efficient mining algorithms to find the complete set of desired patterns are proposed based on the idea of instance growth. Our performance study on various datasets shows the efficiency of our approach. A case study is also performed to show the utility of our approach.
Sequence Database Closed Data Mining Sequence Pattern
Hongyan Pan
Department of Computer Science Zhejiang Business Technology Institute Ningbo 315012,PRC
国际会议
北京
英文
1-4
2009-06-11(万方平台首次上网日期,不代表论文的发表时间)