Customized MMRF: Efficient Matrix Operations on SIMD Processors
Wireless communication and multimedia applications involve a large number of matrix operations on matrices of different sizes. These operations require accessing matrices in column order. This paper implements a Multi-Grained Matrix Register File (MMRF) that supports multi-grained parallel row-wise and column-wise access. We implement 4×4 MIMO decoding with the help of the MMRF to illustrate efficient matrix operations on SIMD processors. Experimental results show that, compared with the TMS320C64x+, our SIMD processor achieves about 5.65x to 7.71x performance improvement by employing the MMRF. Using customized design techniques, we reduce the area and critical-path delay of the MMRF by 17.9% and 39.1%, respectively.
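The row-wise and column-wise access described in the abstract can be pictured with a small software model. The sketch below is only an illustration, not the hardware design from the paper: the class name `MatrixRegisterFile`, the methods `read_row` and `read_col`, and the `grain` and `block` parameters are hypothetical, and "multi-grained" is assumed here to mean reading sub-vectors of a chosen length.

```python
class MatrixRegisterFile:
    """Toy software model of a matrix register file (hypothetical, not the paper's design)."""

    def __init__(self, size=4):
        self.size = size
        self.cells = [[0] * size for _ in range(size)]

    def write_row(self, r, values):
        """Store one full row of the matrix."""
        if len(values) != self.size:
            raise ValueError("row length must equal the register file size")
        self.cells[r] = list(values)

    def read_row(self, r, grain=None, block=0):
        """Return `grain` consecutive elements of row r, starting at block * grain."""
        grain = self.size if grain is None else grain
        start = block * grain
        return self.cells[r][start:start + grain]

    def read_col(self, c, grain=None, block=0):
        """Return `grain` consecutive elements of column c, starting at block * grain."""
        grain = self.size if grain is None else grain
        start = block * grain
        return [self.cells[r][c] for r in range(start, start + grain)]


# Fill a 4x4 register file with the values 0..15 in row-major order.
mrf = MatrixRegisterFile(4)
for r in range(4):
    mrf.write_row(r, [4 * r + c for c in range(4)])

print(mrf.read_row(1))                    # full row 1
print(mrf.read_col(1))                    # full column 1, no transpose needed
print(mrf.read_col(1, grain=2, block=1))  # lower half of column 1
```

In this model a column read costs the same single call as a row read, which is the property that lets column-order matrix kernels avoid an explicit transpose step in software.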
SIMD; Matrix Operations; Customize
Zhang Kai, Wang Yaohua, Chen Shuming, Li Zhentao, Wen Liang
School of Computer, National University of Defense Technology, Changsha 410073, China
International conference
Taiyuan
English
761-764
2013-04-06 (date first made available online on the Wanfang platform; not the publication date of the paper)