Conference Papers

Robinia: Exploiting Data Parallelism for Scientific Data Processing on Wide Area Network

  Because growing volumes of scientific data are often distributed among collaborating research institutes, exploiting data parallelism on a wide area network (WAN) such as the Internet is one of the best ways to achieve scalable performance when processing big scientific data. In this paper, a data-intensive processing framework named Robinia is proposed for flexible, extensible, data-intensive scientific computing on a WAN. It adopts a centerless, symmetric architecture, distributes scientific data across storage clusters, each composed of a head node and a set of data nodes, and schedules a master executor and worker executors to process the scientific data in parallel. Experience with a prototype for global drought detection shows that applications based on Robinia can reach almost linear speedup and achieve scalable performance by adding processing nodes as the data size grows.

data parallelism wide area network scientific parallel computing

Zhenchun Huang, Dong Zhao, Yang Gu, Guoqing Li, Quan Zou

Department of Computer Science and Technology, Tsinghua University, Beijing, China; Institute of Remote Sensing and Digital Earth, Chinese Academy of Sciences (CAS), Beijing, China

International conference

International Workshop on Data-Intensive Scientific Discovery and Applications 2013

Shanghai

English

130-139

2013-08-01 (date first made available online on the Wanfang platform; not the paper's publication date)