Performance Optimization of a CFD Application on Intel Multicore and Manycore Architectures
This paper reports our experience optimizing the performance of a high-order, high-accuracy Computational Fluid Dynamics (CFD) application (HOSTA) on a state-of-the-art multicore processor and the emerging Intel Many Integrated Core (MIC) coprocessor. We focus on effective loop vectorization and memory access optimization. A series of techniques, including data structure transformations, procedure inlining, compiler SIMDization, OpenMP loop collapsing, and the use of Huge Pages, are explored. Detailed execution times and event counts from Performance Monitoring Units are measured. The results show that our optimizations improve the performance of HOSTA by 1.61× on a compute node with two Intel Sandy Bridge processors and by 1.97× on an Intel Knights Corner coprocessor, the publicly available MIC product. The microarchitecture-level effects of these optimizations are also discussed.
Computational Fluid Dynamics; multicore; manycore; performance optimization; performance analysis
Yonggang Che, Lilun Zhang, Yongxian Wang, Chuanfu Xu, Wei Liu, Xinghua Cheng
Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China
International conference
ACA, Advanced Computer Architecture (2014 National Conference on Computer Architecture)
Shenyang
English
84-97
2014-08-23 (date first posted online on the Wanfang platform; not the paper's publication date)