Conference paper

NV-TS: A Fault Tolerance Transaction System Based on Persistent Memory

The scalability of future high performance computing (HPC) systems is challenged by high failure rates, so fault tolerance techniques will play a more important role in the HPC field. Currently, checkpoint-restart is the main fault tolerance technique. However, the checkpoint-restart approach incurs a very high overhead, which seriously reduces the efficiency of HPC systems. In this paper, we leverage emerging NVRAM technology and propose combining transactions with NVRAM to design a new fault tolerance technique. We present NV-TS, a fault tolerance transaction system based on NVRAM. NV-TS guarantees that updates to application state are atomic and durable. If the system crashes suddenly during application execution, the atomicity of transactions ensures that the application state remains consistent. After the system restarts, the application can continue to run. Our experiments show that NV-TS can improve fault tolerance performance with a small memory overhead.
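The abstract does not describe how NV-TS is implemented, so the following is only a minimal sketch of the general idea it names: atomic, durable updates to application state that can be rolled back after a crash. It assumes an undo-log scheme, which is one common way to get atomicity and is not confirmed to be the paper's method. Every name here (`SimulatedNVRAM`, `begin`, `write`, `commit`, `recover`, `demo`) is hypothetical, and NVRAM is simulated by an ordinary Python object that is assumed to survive the crash.

```python
"""Toy undo-log transactions over a simulated persistent store (illustrative only)."""

_MISSING = object()  # marks "key did not exist before the transaction"


class SimulatedNVRAM:
    """Hypothetical stand-in for NVRAM: its fields survive a simulated crash."""

    def __init__(self):
        self.data = {}        # durable application state
        self.undo_log = []    # (key, old_value) pairs for the open transaction
        self.tx_open = False  # True while a transaction has not committed


def begin(nv):
    """Start a transaction with an empty undo log."""
    nv.undo_log.clear()
    nv.tx_open = True


def write(nv, key, value):
    """Update state in place, logging the old value first so it can be undone."""
    nv.undo_log.append((key, nv.data.get(key, _MISSING)))
    nv.data[key] = value


def commit(nv):
    """Commit point: clear the open flag, then discard the log.

    Real persistent memory would also need cache flushes and ordering
    barriers here; this sketch omits them.
    """
    nv.tx_open = False
    nv.undo_log.clear()


def recover(nv):
    """Run after restart: roll back any transaction that never committed."""
    if nv.tx_open:
        for key, old in reversed(nv.undo_log):
            if old is _MISSING:
                nv.data.pop(key, None)
            else:
                nv.data[key] = old
        nv.undo_log.clear()
        nv.tx_open = False


def demo():
    """Commit one update, 'crash' in the middle of a second, then recover."""
    nv = SimulatedNVRAM()

    begin(nv)
    write(nv, "step", 1)
    commit(nv)

    begin(nv)
    write(nv, "step", 2)
    write(nv, "tmp", 9)
    # Simulated crash: commit() is never reached.

    recover(nv)
    return nv.data
```

In this sketch, calling `demo()` leaves only the committed update in place, which illustrates the property the abstract describes: a crash in the middle of a transaction does not expose a partially updated application state, so the application can resume from the last committed state.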

Transaction; Fault tolerance; Performance; Persistent memory

Xu Li, Kai Lu, Xu Zhou

National University of Defense Technology, China

International conference

2012 International Conference on Computer Science and Electronic Engineering (2012 IEEE International Conference on Computer Science and Electronic Engineering, ICCSEE 2012)

Hangzhou

English

221-224

2012-03-23 (date first posted online on the Wanfang platform; this does not represent the paper's publication date)