Fusion Coherence: Scalable Cache Coherence for Heterogeneous Kilo-Core System
Future heterogeneous systems will integrate CPUs and GPUs on a single chip to achieve both high computing performance and high throughput. In general, such systems will abandon the current discrete design and build a unified shared memory system, avoiding explicit data movement among CPUs and GPUs connected by a high-throughput NoC. We propose Fusion Coherence, a scalable cache coherence solution for a Heterogeneous Kilo-core System Architecture that integrates CPUs and GPUs on a single chip, to mitigate the coherence bandwidth side effects of GPU memory requests as well as the overhead of copying data between CPU and GPU memories. Fusion Coherence coalesces the L3 data caches of CPUs and GPUs on top of a unified physical memory, and further integrates a region directory and a cuckoo directory into a two-level cache coherence directory without modifying the cache coherence protocol. Experimental results on a subset of the Rodinia benchmarks show that it effectively reduces data transfer overhead and achieves an average execution speedup of 2.4x. The highest speedup approaches 4x for data-intensive applications.
Fusion Coherence; Fusion Directory; Two-level Cache Directories; Heterogeneous Kilo-core System; Cache Coherence
Songwen Pei; Myoung-Seo Kim; Jean-Luc Gaudiot; Naixue Xiong
Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Sha; Department of Electrical Engineering and Computer Science, University of California, Irvine, California; School of Computer Science, Colorado Technical University, Springs, Colorado 80907, USA
International conference
ACA, Advanced Computer Architecture (2014 National Conference on Computer Architecture)
Shenyang
English
1-15
2014-08-23 (date first posted online on the Wanfang platform; not the paper's publication date)