An Efficient Algorithm for Mining Embedded Frequent Subtree on Biological Data

Abstract:

Data mining, which draws on databases, statistics, and AI, gives biological research a useful tool for analyzing information. The key factors that influence the performance of biological data mining approaches are the large scale of biological data and the high similarity among the mined patterns. In this paper, we present an efficient algorithm named IRTM for mining frequent subtrees embedded in biological data. We also propose a string encoding method for representing the trees, and a scope-list for extending all substrings for the frequency test. The IRTM algorithm adopts a vertical mining approach and uses pruning techniques to further reduce computation time and space cost. Experimental results show that the IRTM algorithm achieves a significant performance improvement over previous work.
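The abstract names a string encoding for trees and a scope-list but does not give their details. The following is only a minimal sketch of what such structures commonly look like, assuming a TreeMiner-style convention: a preorder label sequence with "-1" marking each backtrack to the parent, and a scope [l, r] per node, where l is the node's preorder index and r is the index of its rightmost descendant. The names `Node`, `encode`, `scopes`, and `is_descendant` are hypothetical and are not taken from the IRTM paper.

```python
from typing import List, Tuple


class Node:
    """A labeled, ordered tree node (hypothetical helper for this sketch)."""

    def __init__(self, label: str, children: "List[Node]" = None):
        self.label = label
        self.children = children or []


def encode(root: Node) -> List[str]:
    """Preorder string encoding: emit a label on entry, '-1' when backtracking to the parent."""
    out: List[str] = []

    def visit(node: Node) -> None:
        out.append(node.label)
        for child in node.children:
            visit(child)
            out.append("-1")  # finished this child's subtree, step back up

    visit(root)
    return out


def scopes(root: Node) -> List[Tuple[str, Tuple[int, int]]]:
    """Return (label, (l, r)) for each node in preorder (assumed scope convention)."""
    result: list = []

    def visit(node: Node) -> None:
        idx = len(result)      # preorder index of this node
        result.append(None)    # placeholder until the subtree is fully visited
        for child in node.children:
            visit(child)
        result[idx] = (node.label, (idx, len(result) - 1))

    visit(root)
    return result


def is_descendant(anc: Tuple[int, int], desc: Tuple[int, int]) -> bool:
    """True if the node with scope `desc` lies strictly inside the subtree with scope `anc`."""
    return anc[0] < desc[0] and desc[1] <= anc[1]


# Example tree: A has children B and C; B has child D.
tree = Node("A", [Node("B", [Node("D")]), Node("C")])
encoding = encode(tree)
scope_list = scopes(tree)
```

Under this assumed convention, an embedded (ancestor-descendant) relationship reduces to a scope containment check, which is what allows a vertical approach to test candidate subtrees by joining scope-lists rather than rescanning the trees.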

Keywords: Embedded Frequent Subtree; Scope-List; Biological Data

Authors: Wei Liu, Ling Chen

Affiliations: Department of Computer Science, Nanjing University of Aeronautics and Astronautics; Department of Comp; Department of Computer Science, Yangzhou University; State Key Lab of Novel Software Tech, Nanjing Un

Conference type: International conference

Conference name: The 10th International Conference on Intelligent Technologies (InTech09)

Conference location: Guilin

Conference language: English

Pages: 613-619

Online publication date: 2009-12-12 (date first posted on the Wanfang platform; not the paper's publication date)
