A GPU Oriented Data Partitioning Method to Overlap Communication and Computation
Currently, the GPU is used as a coprocessor to accelerate computation in heterogeneous systems composed of a CPU and a GPU. However, this main-core-plus-coprocessor architecture incurs extra communication overhead. An effective way to reduce this overhead is to process data in batches so that communication and computation overlap. Without a concrete partitioning method, however, users have no guidance on how to batch the data, which limits the improvement in overall application performance. In this paper, taking the system's communication bandwidth and the GPU's computing power into account, we propose a novel data partitioning method that partitions the data into blocks whose sizes are set in proportion to these factors. We implement our method with CUDA, and the experimental results show that it is feasible and effectively overlaps communication with computation, improving the overall performance of the application.
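As a rough illustration of the batching idea described in the abstract (not the authors' code), the following CUDA sketch splits an array into chunks and issues each chunk's host-to-device copy, kernel launch, and device-to-host copy on its own stream, so the copy of one chunk can overlap with the computation of another. The kernel, chunk count, and equal chunk sizes are assumptions for illustration; the paper's method would instead size the blocks in proportion to bandwidth and compute power.

```cuda
// Sketch: overlapping transfers and computation with CUDA streams.
#include <cuda_runtime.h>
#include <stdio.h>

__global__ void scale(float *d, int n, float k) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) d[i] *= k;                      // placeholder computation
}

int main(void) {
    const int N = 1 << 22, CHUNKS = 4;         // assumed sizes, equal chunks
    const int chunk = N / CHUNKS;
    float *h, *d;
    cudaMallocHost((void **)&h, N * sizeof(float)); // pinned memory, needed for async copies
    cudaMalloc((void **)&d, N * sizeof(float));
    for (int i = 0; i < N; ++i) h[i] = 1.0f;

    cudaStream_t s[CHUNKS];
    for (int c = 0; c < CHUNKS; ++c) cudaStreamCreate(&s[c]);

    for (int c = 0; c < CHUNKS; ++c) {
        int off = c * chunk;
        // The copy of chunk c can overlap with the kernel of another chunk
        // running on a different stream.
        cudaMemcpyAsync(d + off, h + off, chunk * sizeof(float),
                        cudaMemcpyHostToDevice, s[c]);
        scale<<<(chunk + 255) / 256, 256, 0, s[c]>>>(d + off, chunk, 2.0f);
        cudaMemcpyAsync(h + off, d + off, chunk * sizeof(float),
                        cudaMemcpyDeviceToHost, s[c]);
    }
    cudaDeviceSynchronize();
    printf("h[0] = %f\n", h[0]);               // expect 2.0

    for (int c = 0; c < CHUNKS; ++c) cudaStreamDestroy(s[c]);
    cudaFreeHost(h);
    cudaFree(d);
    return 0;
}
```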
GPU; overlap of communication and computation; data partition
Bao Zhang, Haijun Cao, Xiaoshe Dong, Dan Li, Leijun Hu
Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China; State Key Laboratory of High-end Server & Storage Technology, Jinan 250013, China
International conference
2010 International Conference on Future Information Technology (ICFIT 2010)
Changsha
English
103-107
2010-12-14 (date first posted on the Wanfang platform, not necessarily the paper's publication date)