Mining Order-Preserving Submatrices Based on Frequent Sequential Pattern Mining
Order-Preserving Submatrices (OPSMs) are widely accepted as a pattern-based biclustering model and are used in gene expression data analysis. The OPSM problem aims to find groups of genes that exhibit similar rises and falls under certain conditions. However, most existing methods are heuristic algorithms that cannot reveal all OPSMs. In this paper, we propose an exact method that discovers all OPSMs based on frequent sequential pattern mining. First, an existing algorithm is adapted to find all common subsequences (ACS) between every two sequences. Then an improved prefix tree data structure is used to store and traverse all common subsequences, and the Apriori Principle is employed to mine the frequent sequential patterns efficiently. Finally, experiments were carried out on a real data set, and GO analysis was applied to determine whether the discovered patterns are biologically significant. The results demonstrate the effectiveness and efficiency of the method.
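The idea of treating OPSM discovery as frequent sequential pattern mining can be illustrated with a minimal sketch. This is not the authors' algorithm: it leaves out the ACS computation and the prefix tree and uses a plain level-wise Apriori search instead. The function names (`row_positions`, `supporting_rows`, `mine_opsms`), the toy matrix, and the tie-breaking rule (equal values are ordered by column index) are assumptions made for illustration only.

```python
from itertools import permutations


def row_positions(row):
    """Map each column index to its rank when the row is sorted ascending.

    Assumption: ties are broken by column index (Python's sort is stable).
    """
    order = sorted(range(len(row)), key=lambda c: row[c])
    return {col: rank for rank, col in enumerate(order)}


def supporting_rows(pattern, positions):
    """Return indices of rows whose column ranks rise along `pattern`."""
    return [
        i for i, pos in enumerate(positions)
        if all(pos[a] < pos[b] for a, b in zip(pattern, pattern[1:]))
    ]


def mine_opsms(matrix, min_support):
    """Level-wise (Apriori-style) search for frequent column sequences.

    Returns a dict mapping each frequent column sequence (a tuple) to the
    list of supporting row indices; each entry describes one OPSM.
    """
    positions = [row_positions(r) for r in matrix]
    n_cols = len(matrix[0])
    frequent = {}
    # Level 2: every ordered pair of distinct columns with enough support.
    level = [
        p for p in permutations(range(n_cols), 2)
        if len(supporting_rows(p, positions)) >= min_support
    ]
    while level:
        for p in level:
            frequent[p] = supporting_rows(p, positions)
        # Apriori join: a longer sequence can only be frequent if both its
        # prefix and its suffix (each one column shorter) are frequent.
        candidates = {
            p + (q[-1],)
            for p in level for q in level
            if p[1:] == q[:-1] and q[-1] not in p
        }
        level = [
            c for c in candidates
            if len(supporting_rows(c, positions)) >= min_support
        ]
    return frequent


# Toy expression matrix: rows are genes, columns are conditions.
matrix = [
    [1, 2, 3, 4],
    [2, 3, 5, 1],
    [10, 20, 30, 5],
    [4, 3, 2, 1],
]
patterns = mine_opsms(matrix, min_support=3)
longest = max(patterns, key=len)
print(longest, patterns[longest])
```

Because support is anti-monotone (a row that supports a sequence also supports each of its subsequences), the join step does not miss any frequent sequence. The ACS and prefix tree steps described in the abstract are aimed at doing this counting more efficiently than the brute-force scan used here.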
OPSM; biclustering; all common subsequences; Apriori Principle; frequent sequence; the prefix tree
Yun Xue, Yuting Li, Weijun Deng, Jiejin Li, Jianxiong Tang, Zhengling Liao, Tiechen Li
School of Physics and Telecommunication Engineering, South China Normal University, Guangzhou, China, 510006
International conference
The Third International Conference on Health Information Science (HIS2014)
Shenzhen
English
184-193
2014-04-22 (date first made available online on the Wanfang platform; not necessarily the paper's publication date)