Filtering and Matching of Data Blocks to Avoid Disk Bottleneck in De-duplication File System

Abstract:

As the growing scale of data generates massive redundancy, de-duplication, which eliminates redundant data and improves the space utilization of storage devices, has been widely adopted. A de-duplication filesystem provides a unified interface to upper-layer applications and performs inline de-duplication. In this paper, we design and implement FmdFS, a kernel-space de-duplication filesystem. Because of memory limitations, FmdFS stores its metadata on disk in groups. Meanwhile, a scale-adaptive binary tree filter is constructed in memory; it not only avoids accessing on-disk metadata when searching for the fingerprints of most new data, but also records the groups in which duplicate data is stored. In addition, FmdFS uses an LRU hash cache, which holds recently accessed metadata groups, to exploit locality when matching duplicate data and thereby avoid accessing the metadata on disk. Compared with traditional de-duplication filesystems, FmdFS achieves higher write performance.

Keywords: De-duplication Filesystem; Scale-adaptive Binary Tree Filter; LRU Hash Cache

Authors: Jiajia Zhang, Xingjun Zhang, Runting Zhao, Xiaoshe Dong

Affiliation: Department of Computer Science and Technology, Xi'an Jiaotong University, Xi'an 710049, China

Conference type: International conference

Conference name: ACA, Advanced Computer Architecture (2014 National Conference on Computer Architecture)

Conference location: Shenyang

Conference language: English

Pages: 68-82

Online publication date: 2014-08-23 (date first posted on the Wanfang platform; not the paper's publication date)
