Robinia: Exploiting Data Parallelism for Scientific Data Processing on Wide Area Network
Because growing volumes of scientific data are often distributed among collaborating research institutes, exploiting data parallelism over a wide area network (WAN) such as the Internet is one of the best ways to achieve scalable performance when processing big scientific data. This paper proposes Robinia, a data-intensive processing framework for flexible, extensible scientific computing on WANs. It adopts a centerless symmetric architecture, distributes scientific data across storage clusters, each composed of a head node and a set of data nodes, and schedules a master executor and worker executors to process the data simultaneously. Experience with a prototype for global drought detection shows that applications based on Robinia can reach almost linear speedup and achieve scalable performance by adding processing nodes as data size grows.
data parallelism wide area network scientific parallel computing
Zhenchun Huang, Dong Zhao, Yang Gu, Guoqing Li, Quan Zou
Department of Computer Science and Technology, Tsinghua University, Beijing, China; Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences (CAS), Beijing, China
International conference
Shanghai
English
130-139
2013-08-01 (date first made available online on the Wanfang platform; not the paper's publication date)