Process-level and Thread-level Parallel Programming Mechanism and Performance Optimization Techniques on Multi-core Clusters
This paper studies how multi-core architectures, multi-threading techniques, and shared L2 caches affect parallel programming and its performance optimization. By reducing how often the processing cores access main memory and by using the shared L2 cache efficiently, it presents a hybrid programming mechanism that combines process-level and thread-level parallelism for compute-intensive applications and adapts to the number of processing cores, the size of main memory, and the usable space in the shared L2 cache. It then proposes several performance optimization techniques for multi-process, multi-threaded parallel programs on multi-core clusters. Experiments that solve the steady-state heat distribution problem in parallel on a cluster of multi-core computers show that the programming mechanism and optimization methods are usable and efficient.
parallel programming; multi-core clusters; multi-thread; multi-process; performance optimization; multi-level caches
Hualin Huang, Cheng Zhong, Zhonglong Lu
School of Computer and Electronics and Information, Guangxi University, Nanning, China
International conference
Nanning
English
163-181
2009-12-04 (date first posted online on the Wanfang platform; this is not necessarily the paper's publication date)