Conference topic

Parallelizing and Optimizing H.264 on Synchronous Data Triggered Architecture

Synchronous data triggered architecture (SDTA) offers high performance, flexible scalability, and low-cost communication between processor elements. With multimedia now prevalent, it is important to parallelize and optimize the implementation of the OpenMAX DL video component with SDTA support. H.264 is the core of the video part. In this paper, we propose several optimization techniques for H.264: loop unrolling, primitives and vectorization, dependence elimination, batch processing and revision, computing direction transformation, and load/store acceleration. With these techniques, we have parallelized all the H.264 APIs, and two representative APIs gain speedups of 5.6 and 6.5 over the original sequential versions. We then propose additional hardware for clip operations. With this hardware, the two test benches reduce execution time by 18% and 38%, respectively, and the speedups reach 6.7 and 10.3. We believe that these techniques are especially beneficial to multimedia applications, and also to other applications on synchronous data triggered architecture.
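The abstract does not describe the proposed clip hardware, so the following is only a minimal C sketch of the software operation such hardware would presumably replace. `clip3` follows the Clip3 function defined in the H.264 specification; `reconstruct_row` and its parameters are hypothetical names introduced here for illustration and do not come from the paper or from OpenMAX DL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Clip3 as defined in the H.264 specification: clamp x to [lo, hi]. */
static inline int32_t clip3(int32_t lo, int32_t hi, int32_t x)
{
    assert(lo <= hi); /* the range must be well formed */
    if (x < lo) return lo;
    if (x > hi) return hi;
    return x;
}

/*
 * Hypothetical helper (not from the paper): add a row of residuals to a
 * row of predicted samples and clip each result to the 8-bit range.
 * Every sample is independent of the others, which is why loops of this
 * shape are natural candidates for unrolling and vectorization.
 */
static void reconstruct_row(uint8_t *dst, const uint8_t *pred,
                            const int16_t *residual, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        dst[i] = (uint8_t)clip3(0, 255, (int32_t)pred[i] + residual[i]);
    }
}
```

In scalar code each clip costs two comparisons and branches per sample, which suggests why a dedicated clip unit could shorten execution time for kernels dominated by this pattern.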

H.264; optimization; parallelization; SDTA; SIMD

Liu Cong, Wang Zhiying, Lai Xin, Gan Xinbiao, Chen Fangyuan

School of Computer, National University of Defense Technology, Changsha, China

International conference

2012 IEEE International Conference on Computer Science and Electronic Engineering (ICCSEE 2012)

Hangzhou

English

185-190

2012-03-23 (date first posted online on the Wanfang platform; not necessarily the paper's publication date)