Apriori Parallel Improved Algorithm Based on MapReduce Distributed Architecture
In big data environments, the traditional serial Apriori algorithm is inefficient and generates many candidate itemsets when processing massive data sets. This paper proposes an improved parallel algorithm based on the MapReduce distributed architecture. Building on the basic Apriori algorithm on MapReduce, it restructures the original transaction database and parallelizes processing across data set partitions. The algorithm optimizes the transaction database, candidate itemset counting, and the pruning strategy. Experimental results show that the proposed improved algorithm reduces the number of candidate itemsets and improves efficiency.
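For orientation, the baseline that the abstract builds on (Apriori with candidate counting split into map and reduce phases over data partitions) can be sketched as follows. This is a minimal single-process illustration in plain Python, not the improved algorithm proposed in the paper: the database restructuring and the optimized counting and pruning are not described in the abstract and are not reproduced here. All function names (`map_count`, `reduce_count`, `generate_candidates`, `apriori`) and the sample data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def map_count(split, candidates):
    """Map phase (simulated): emit (candidate, 1) for every candidate
    itemset contained in a transaction of this data partition."""
    for transaction in split:
        items = set(transaction)
        for cand in candidates:
            if cand <= items:
                yield cand, 1


def reduce_count(pairs, min_count):
    """Reduce phase (simulated): sum the emitted counts per candidate
    and keep only those meeting the minimum support count."""
    totals = Counter()
    for cand, n in pairs:
        totals[cand] += n
    return {cand: n for cand, n in totals.items() if n >= min_count}


def generate_candidates(frequent, k):
    """Join frequent (k-1)-itemsets into k-itemsets, then prune any
    candidate that has an infrequent (k-1)-subset (Apriori property)."""
    out = set()
    for a, b in combinations(frequent, 2):
        union = a | b
        if len(union) == k and all(
            frozenset(sub) in frequent for sub in combinations(union, k - 1)
        ):
            out.add(union)
    return out


def apriori(splits, min_count):
    """Level-wise Apriori; each level is one simulated map/reduce pass
    over all partitions. A real job would run the map calls in parallel."""
    items = {i for split in splits for t in split for i in t}
    candidates = {frozenset([i]) for i in items}
    result = {}
    k = 1
    while candidates:
        pairs = (p for split in splits for p in map_count(split, candidates))
        frequent = reduce_count(pairs, min_count)
        result.update(frequent)
        k += 1
        candidates = generate_candidates(set(frequent), k)
    return result


# Hypothetical sample data: two partitions of a small transaction database.
splits = [
    [["a", "b", "c"], ["a", "b"]],
    [["a", "c"], ["b", "c"], ["a", "b", "c"]],
]
result = apriori(splits, min_count=3)
```

With a minimum support count of 3, every single item and every pair in this sample is frequent, while the triple `{a, b, c}` appears in only two transactions and is dropped after counting. Each level requires a full pass over every partition for every candidate, which is the cost the abstract says the improved algorithm aims to reduce.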
Apriori algorithm; MapReduce; distributed
She Xiangyang; Zhang Ling
College of Computer Science and Technology, Xian University of Science and Technology, Xian, China
International conference
Harbin
English
517-521
2016-07-21 (date first posted online on the Wanfang platform; does not represent the paper's publication date)