Optimizing Adaptive Checkpointing Schemes for Grid Workflow Systems
One of the major challenges to the wide use of Grid workflow systems is fault tolerance and avoidance. Checkpointing schemes provide a means of fault detection and recovery. In our research, we focus on optimizing the performance of checkpointing schemes for Grid workflow systems. We propose a set of adaptive checkpointing schemes that dynamically adjust the checkpointing intervals online by using store-checkpoints (SCPs) and compare-checkpoints (CCPs). These schemes make efficient use of comparison and storage operations and significantly improve performance. Further, these schemes can calculate the optimal numbers of checkpoints that minimize the mean execution time. We also extend the schemes from single-task execution scenarios to multitask execution scenarios. Simulation results show that these schemes substantially increase the likelihood of timely task completion when faults occur.
Yang Xiang, Zhongwen Li, Hong Chen
School of Engineering and Information Technology, Deakin University, Melbourne Campus, Burwood 3125, A; Information Science and Technology College, Xiamen University, Xiamen 361005, China
International conference
The Fifth International Conference on Grid and Cooperative Computing (GCC 2006)
Changsha
English
181-188
2006-10-21 (date first posted online on the Wanfang platform; not the paper's publication date)