A Static Analytical Performance Model for GPU Kernel
Graphics processing units (GPUs) have grown in popularity and play an important role as coprocessors in heterogeneous co-processing environments. On a GPU architecture, tens of thousands of threads work collaboratively in parallel to solve heavily data-parallel problems efficiently. The achieved performance therefore depends on how well multiple threads collaborate in parallel when executing an algorithm, the effectiveness of latency hiding, and the utilization of the multiprocessors. In this paper, we propose a static analytical kernel performance (SAKP) model for GPU kernels. The model considers three important factors that affect GPU kernel performance: the costs of compute instructions, memory accesses, and synchronization. In the proposed model, a set of kernel and device features is generated for the target GPU. Combining the kernel and device features, we determine the performance-limiting factor and generate an estimate of the kernel's execution time. We performed experiments on matrix multiplication (MM) and histogram generation (HG) on an NVIDIA GTX680 GPU card and showed an absolute prediction error of less than 6.8%. By comparison with other current kernel models, we also validated that the proposed model is more accurate and simpler.
GPUs; co-processing; collaborative work; static analytical kernel performance (SAKP) model; memory accessing; synchronization; GPU kernel
Jinjing Li, Qingkui Chen, Baoping Liu, Bocheng Liu
University of Shanghai for Science and Technology
Domestic conference
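The 10th National Conference on Computer Supported Cooperative Work and Annual Meeting of the China Computer Federation Technical Committee on Cooperative Computing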
Taiyuan
English
445-452
2015-08-28 (date first made available online on the Wanfang platform; not the paper's publication date)