A Fast Algorithm of Mining Induced Subtrees
Induced subtree mining has important research value in fields such as XML documents, bioinformatics, and web logs. In this paper, two concepts, the subtree vector and the pruning threshold, are proposed, and an algorithm, ITMSV (induced subtrees mining based on subtree vector), is presented that discovers frequent induced subtrees quickly by taking full advantage of the features of the subtree vector in combination with a hash table. By constructing a multi-layered data structure, the algorithm lessens the time spent distinguishing isomorphic subtrees during mining and needs to scan the database only once, which reduces the number of scans and improves efficiency. The experimental results show that ITMSV is more efficient and effective than TreeMiner.
data mining; frequent subtree; induced subtree
Yun Li, Xin Guo, Yunhao Yuan, Jia Wu, Ling Chen
Institute of Information Engineering, Yangzhou University, Jiangsu 225009, China
International conference
2008 IEEE International Conference on Information and Automation
Zhangjiajie
English
195-199
2008-06-20 (date first made available online on the Wanfang platform; not the paper's publication date)