Pipelined-MapReduce: An improved MapReduce Parallel programing model

Abstract:

MapReduce is a parallel programming model used to process large datasets. MapReduce programs can be executed concurrently and automatically on large clusters of commodity machines. We propose an improved MapReduce programming model, Pipelined-MapReduce, to address data-intensive information retrieval problems. Pipelined-MapReduce extends the batch-oriented MapReduce programming model by allowing data to be pipelined between operations, which can reduce completion time and improve system utilization. The experimental results demonstrate that the implementation of Pipelined-MapReduce scales well and efficiently processes large datasets on commodity machines.

Keywords: MapReduce, Hadoop, Pipelined-MapReduce, Parallel processing

Authors: Li Wang, Zhiwei Ni, Yiwen Zhang, Zhang jun Wu, Liyang Tang

Affiliations: School of Management, Hefei University of Technology, Hefei, Anhui, 230009, China; School of Management, Hefei University of Technology, Hefei, Anhui, 230009, China; Anhui University

Conference type: International conference

Conference name: 2011 Fourth International Conference on Intelligent Computation Technology and Automation (ICICTA 2011)

Conference location: Shenzhen

Conference language: English

Pages: 871-874

Online publication date: 2011-03-28 (the date the record first appeared on the Wanfang platform; this is not the paper's publication date)
