会议专题

A practical parallel algorithm for computing minimum spanning tree/forest with MapReduce

Originated from traveling salesman problem, minimum spanning tree (MST) is one of the most studied combinatorial problems and keeps on expanding its application areas. However, traditional MST algorithms face the challenge of costing too much time and space when the graph becomes huge. Thus a variety of algorithms have been developed to optimize the process of finding MST, including many distributed and parallel algorithms. In this paper, we present a simple and practical parallel algorithm for computing minimum spanning tree with MapReduce. MapReduce is a distributed programming framework put forward by google with the ability to process large amount of data in a parallel way. Communication and synchronization cost of parallel algorithms implemented by traditional techniques are eliminated by using it In addition, the framework works well on cluster composed of commodity machines, no special need for particular hardware or architecture. Actually this is one of most significant features of cloud computing, and MapReduce is originally designed as an efficient and powerful tool for cloud. Our algorithm, implemented with MapReduce, is practical and suitable for the more and more prevailing cloud platform. We have done experiments on a cluster of ten node machines, installed with Hadoop software, which is an open source implementation of MapReduce. And the result shows that our algorithm shows good scalability and is quite efficient

minimum spanning tree parallel algorithm MapReduce cloud computing Hadoop

Jin Chang Zhiyong Zhong Shengzhong Feng Zhexue Huang Jun Luo Le Zhou

Shenzhen Institutes of Advanced Technology Chinese Academy of Sciences Shenzhen, China

国际会议

2010 International Conference on Future Information Technology(2010年未来信息技术国际会议 ICFIT 2010)

长沙

英文

35-41

2010-12-14(万方平台首次上网日期,不代表论文的发表时间)