Conference Topic

Impact of Process Mapping on MPI Allgather Performance in Multi-core Cluster

With the rapid development of multi-core architectures, cluster computing has entered the multi-core era. MPI [1] is one of the most important parallel programming environments in cluster computing. When MPI applications run on multi-core clusters, their processes are mapped onto intra-node and inter-node cores. The MPI library provides many collective communication functions, among which MPI Allgather is one of the most frequently used. It is therefore worthwhile to study in depth how multi-core architecture affects the behavior of MPI collective functions, and to give guidance on obtaining optimal performance on multi-core clusters by mapping processes onto intra-node or inter-node cores according to application behavior. In this paper, a new concept called Hierarchical Communication Latency and Bandwidth (HCLB) is introduced. HCLB indicates that the processes that communicate most frequently should be mapped onto intra-node cores in a multi-core cluster. We take the MPI Allgather function as an example to illustrate how performance is affected by different process mapping methods. The four algorithms studied comprise three classic algorithms and a new algorithm, the Neighbor Exchange Algorithm, proposed in our previous work [2]. We analyze the communication behaviors of the four algorithms in detail. The experimental results reveal that, to obtain optimal performance on multi-core clusters, it is important to choose process mapping strategies according to the communication behavior of the message passing algorithm.
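As a rough illustration of why process mapping matters for an allgather, the sketch below counts intra-node and inter-node messages for a ring-style allgather under two mappings. It is not taken from the paper: the ring pattern is chosen here only as a familiar allgather scheme (the abstract does not name the three classic algorithms), and the function names, the "block" and "cyclic" mapping labels, and the 8-process, 4-cores-per-node configuration are all assumptions made for the example.

```python
def node_of(rank, num_procs, cores_per_node, mapping):
    """Return the node index hosting `rank` (hypothetical helper).

    "block"  places consecutive ranks on the same node.
    "cyclic" deals ranks out round-robin across nodes.
    """
    num_nodes = num_procs // cores_per_node
    if mapping == "block":
        return rank // cores_per_node
    if mapping == "cyclic":
        return rank % num_nodes
    raise ValueError(f"unknown mapping: {mapping}")


def ring_allgather_traffic(num_procs, cores_per_node, mapping):
    """Count (intra_node, inter_node) messages in a ring allgather.

    Assumed pattern: in each of the num_procs - 1 steps, every rank
    sends one block to its right-hand neighbour (rank + 1) mod num_procs.
    """
    intra = inter = 0
    for _step in range(num_procs - 1):
        for rank in range(num_procs):
            dest = (rank + 1) % num_procs
            src_node = node_of(rank, num_procs, cores_per_node, mapping)
            dst_node = node_of(dest, num_procs, cores_per_node, mapping)
            if src_node == dst_node:
                intra += 1
            else:
                inter += 1
    return intra, inter


if __name__ == "__main__":
    for mapping in ("block", "cyclic"):
        intra, inter = ring_allgather_traffic(8, 4, mapping)
        print(f"{mapping}: intra-node={intra}, inter-node={inter}")
```

Under these assumptions the block mapping keeps most neighbour exchanges inside a node, while the cyclic mapping pushes every exchange across the network, which is the kind of difference the HCLB idea is meant to capture. The sketch counts messages only and says nothing about actual latency or bandwidth.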

Yuxin Tang, Yunquan Zhang

Laboratory of Parallel Computing, Institute of Software, Chinese Academy of Sciences, 100190 Beijing, Ch

International conference

The Inaugural Symposium on Parallel Algorithms, Architectures and Programming

Guangzhou

English

66-79

2008-09-16 (date first posted online on the Wanfang platform; not the paper's publication date)