A Method of Deduplication for Data Remote Backup
This paper describes a Remote Data Disaster Recovery System that uses hashes to identify duplicate data blocks and avoid sending them between the Primary Node and the Secondary Node, thereby reducing the network bandwidth consumed by data replication, decreasing overhead, and improving network efficiency. On both nodes, extra storage space (the Hash Repository) is set aside alongside the data disks to record the hash of each data block on those disks. We extend the data replication protocol between the Primary Node and the Secondary Node: when a block's hash already exists in the Hash Repository, the block is a duplicate, and its block address is transferred instead of the data. This reduces the network bandwidth requirement, shortens synchronization time, and improves network efficiency.
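The replication idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class names, the `("REF", ...)` / `("DATA", ...)` message shapes, the 4096-byte block size, the use of SHA-256, and the in-memory dictionaries standing in for the data disks and Hash Repositories are all assumptions made for illustration.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed block size; the abstract does not specify one


class Node:
    """A data disk plus a Hash Repository, both modelled as in-memory dicts."""

    def __init__(self):
        self.disk = {}          # block address -> block data
        self.addr_to_hash = {}  # block address -> hash of the block stored there
        self.hash_to_addr = {}  # hash -> one address currently holding that content

    def _record(self, addr, data, digest):
        # Drop the stale repository entry if this address was its only reference.
        old = self.addr_to_hash.get(addr)
        if old is not None and self.hash_to_addr.get(old) == addr:
            del self.hash_to_addr[old]
        self.disk[addr] = data
        self.addr_to_hash[addr] = digest
        self.hash_to_addr.setdefault(digest, addr)


class Primary(Node):
    def write(self, addr, data):
        """Store a block locally and return the message to send to the Secondary."""
        digest = hashlib.sha256(data).digest()  # SHA-256 is an assumed choice
        src = self.hash_to_addr.get(digest)     # look up before updating
        self._record(addr, data, digest)
        if src is not None:
            return ("REF", addr, src)    # duplicate: send only block addresses
        return ("DATA", addr, data)      # new content: send the full block


class Secondary(Node):
    def apply(self, message):
        """Apply one replication message, in the order the Primary produced it."""
        kind, addr, payload = message
        data = self.disk[payload] if kind == "REF" else payload
        self._record(addr, data, hashlib.sha256(data).digest())


primary, secondary = Primary(), Secondary()
block = b"x" * BLOCK_SIZE
first = primary.write(0, block)   # first occurrence: full data is sent
second = primary.write(7, block)  # duplicate: only addresses are sent
for message in (first, second):
    secondary.apply(message)
```

In this sketch a duplicate block costs two addresses on the wire instead of a full block. It treats matching hashes as matching data, so a real system would need to decide how to handle hash collisions.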
Disaster Recovery; Deduplication; Hash; Duplicate Data
Jingyu Liu, Yu-an Tan, Yuanzhang Li, Xuelan Zhang, Zexiang Zhou
School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, P.R. China; Toyou Feiji Electronics CO, LTD, Beijing, 100081, P.R. China
International conference
Nanchang
English
68-75
2010-10-22 (date first made available online on the Wanfang platform; not the paper's publication date)