Conference Topics

A Dedicated Adaptive Loop Pre-fetch Mechanism for Stream-like Application

For stream-like applications that require high bandwidth and low latency, reducing memory latency can effectively improve QoS. In this paper, we propose a dedicated adaptive loop pre-fetch mechanism that reduces memory latency and improves pre-fetch accuracy. When a loop sequence is detected, the stream pre-fetch engine adaptively initiates pre-fetch operations and stores the returned data in on-chip stream buffers. The pre-fetch engine consists of a loop sequence recognition unit, stream buffer FIFOs, and an address calculation ALU. A hardware engine is implemented and integrated into a processor to verify the mechanism. When the processor with the pre-fetch engine runs a regular loop sequence, it saves between 1/2 and 2/3 of the time spent on memory latency. The mechanism also alleviates cache pollution and cache thrashing.
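The abstract names three components (loop sequence recognition, stream buffer FIFOs, and an address calculation ALU) without giving their details. The following is only a minimal software sketch of how such parts could interact, not the authors' hardware design. It assumes that a "loop sequence" means a run of accesses with a constant non-zero stride; the class name `StreamPrefetcher`, the parameters `depth` and `confirm`, and the helper `run` are all hypothetical.

```python
from collections import deque


class StreamPrefetcher:
    """Toy model: stride recognition + one stream-buffer FIFO + address calculation."""

    def __init__(self, depth=4, confirm=2):
        self.depth = depth        # FIFO capacity (assumed value)
        self.confirm = confirm    # equal strides needed before pre-fetching (assumed)
        self.last_addr = None     # previous demand address
        self.stride = None        # most recently observed stride
        self.matches = 0          # consecutive accesses that repeated the stride
        self.fifo = deque()       # pre-fetched addresses, oldest first
        self.next_addr = None     # next address the engine would fetch

    def access(self, addr):
        """Handle one demand access; return True if the stream buffer supplies it."""
        hit = bool(self.fifo) and self.fifo[0] == addr
        if hit:
            self.fifo.popleft()
        elif self.fifo:
            # The stream was broken: discard stale pre-fetched entries.
            self.fifo.clear()
            self.next_addr = None
        self._train(addr)
        self._refill(addr)
        return hit

    def _train(self, addr):
        """Recognition step: count how often the same non-zero stride repeats."""
        if self.last_addr is not None:
            stride = addr - self.last_addr
            if stride != 0 and stride == self.stride:
                self.matches += 1
            else:
                self.stride = stride
                self.matches = 1 if stride != 0 else 0
        self.last_addr = addr

    def _refill(self, addr):
        """Address calculation step: top up the FIFO once the stride is confirmed."""
        if self.matches < self.confirm:
            return
        if self.next_addr is None:
            self.next_addr = addr + self.stride
        while len(self.fifo) < self.depth:
            self.fifo.append(self.next_addr)
            self.next_addr += self.stride


def run(trace):
    """Replay an address trace and report, per access, whether the buffer hit."""
    engine = StreamPrefetcher()
    return [engine.access(a) for a in trace]
```

In this sketch, a regular stride-4 trace starts hitting in the stream buffer once the stride has been seen twice, and a jump to a new region flushes the FIFO and retrains. Because pre-fetched data lives in the FIFO rather than in the cache, a mispredicted stream is simply discarded instead of evicting useful cache lines, which is one plausible reading of the abstract's claim about cache pollution.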

Xiao-Ping Huang, Xiao-Ya Fan, Yu-Hui Chen, Xiang-Dong He

Computer School, Northwestern Polytechnical University, Xi'an 710072, China

International conference

2010 10th IEEE International Conference on Solid-State and Integrated Circuit Technology (ICSICT-2010)

Shanghai

English

575-577

2010-11-01 (date first posted online on the Wanfang platform; not the paper's publication date)