Implementing Sparse Matrix-Vector Multiplication using CUDA based on a Hybrid Sparse Matrix Format

Abstract:

Sparse matrix-vector multiplication (SpMV) is a key operation in engineering and scientific computing, and efficient parallel implementations of it are critical to the performance of many applications. Modern Graphics Processing Units (GPUs), coupled with general-purpose programming environments such as NVIDIA's CUDA, have attracted interest as a viable architecture for data-parallel general-purpose computation. CUDA implementations of SpMV based on common sparse matrix formats have already appeared; among them, the implementation based on the ELLPACK-R format performs best. However, in this implementation, when the maximum number of nonzeros per row differs substantially from the average, threads suffer from load imbalance. This paper proposes a new matrix storage format called ELLPACK-RP, which combines the ELLPACK-R format with the JAD format, and implements SpMV using CUDA based on it. The results show that it reduces load imbalance and effectively improves SpMV performance.
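To make the load-imbalance point concrete, the following is a minimal CPU reference sketch of SpMV over an ELLPACK-R style layout, written in Python rather than CUDA so it can be run anywhere. It is not the paper's code and does not implement ELLPACK-RP, whose details the abstract does not give. The function names `to_ellpack_r` and `spmv_ellpack_r`, the column-major padded layout, and the small test matrix are assumptions chosen for illustration.

```python
def to_ellpack_r(dense):
    """Convert a dense matrix (list of rows) to ELLPACK-R style arrays.

    Assumed layout: values and column indices are padded to the longest row
    and stored column-major, plus an array holding each row's true length.
    """
    n_rows = len(dense)
    rows = [[(j, v) for j, v in enumerate(row) if v != 0] for row in dense]
    row_len = [len(r) for r in rows]
    width = max(row_len, default=0)
    values = [0.0] * (width * n_rows)
    cols = [0] * (width * n_rows)
    for i, r in enumerate(rows):
        for k, (j, v) in enumerate(r):
            # Entry k of row i lives at k * n_rows + i, so that neighbouring
            # rows (neighbouring GPU threads) touch neighbouring addresses.
            values[k * n_rows + i] = float(v)
            cols[k * n_rows + i] = j
    return values, cols, row_len, width


def spmv_ellpack_r(values, cols, row_len, x):
    """Compute y = A * x; each outer iteration mimics one GPU thread."""
    n_rows = len(row_len)
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        # The row-length array lets the loop stop early and skip padding.
        for k in range(row_len[i]):
            idx = k * n_rows + i
            acc += values[idx] * x[cols[idx]]
        y[i] = acc
    return y


# Hypothetical 4x4 matrix with one long row to show the imbalance.
A = [
    [1, 0, 0, 0],
    [0, 2, 0, 0],
    [3, 4, 5, 6],
    [0, 0, 7, 0],
]
x = [1.0, 2.0, 3.0, 4.0]

values, cols, row_len, width = to_ellpack_r(A)
y = spmv_ellpack_r(values, cols, row_len, x)
avg_len = sum(row_len) / len(row_len)

print("y =", y)
print("row lengths =", row_len, "max =", width, "average =", avg_len)
```

In this toy matrix the third row has four nonzeros while the others have one, so a thread assigned to that row would do four times the work of its neighbours. This is the situation the abstract describes, where the maximum row length differs substantially from the average.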

Keywords: SpMV; GPU; CUDA; matrix format; ELLPACK-RP

Authors: Wei Cao, Lu Yao, Zongzhe Li, Yongxian Wang, Zhenghua Wang

Affiliation: National Key Lab for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

Conference type: International conference

Conference: The 2010 International Conference on Computer Application and System Modeling (ICCASM 2010)

Conference location: Taiyuan

Conference language: English

Pages: 161-165

Online publication date: 2010-10-22 (date first posted on the Wanfang platform; not the paper's publication date)
