A Novel Method to Improve Hit Rate for Big Data Quick Reading
In big data mining analysis, the data records in a dataset are retrieved randomly. Distributed storage systems such as BigTable and HBase provide a cache policy for file blocks in retrieval operations. Because these records are scattered across different file blocks, the block cache does not achieve a high hit rate. To address this problem, an LRU-based double-queue K-frequency cache method (DLK) is proposed. The method uses a double-queue storage structure that applies different storage and eviction rules to data with different access frequencies (i.e., high and low access frequency). In addition, the method divides memory into a data area and a list area and adopts different data structures to reduce the time spent on data retrieval and data processing. Experimental results show that the proposed method reduces retrieval time by 30% with the cache mechanism. Compared with existing methods, DLK improves the hit rate by 60.1% and reduces retrieval time by 43.5%. When the cache capacity is small, our method still outperforms the other algorithms.
distributed cache replacement frequency double-queue
Xiaobo Zhang, Zhaohui Zhang, Lizhi Wang, Xinxin Zhou, Pengwei Wang
Computer Science and Technology, Donghua University, Shanghai, China
International conference
Xi'an
English
105-113
2018-09-21 (date first posted online on the Wanfang platform; not the paper's publication date)
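
The abstract describes a double-queue structure with separate storage and eviction rules for high- and low-frequency data. The following is a minimal Python sketch of that general idea, not the authors' DLK implementation: the class name `DoubleQueueCache`, the promotion threshold `k`, the shared capacity, and the rule of evicting from the low-frequency queue first are all assumptions made for illustration. The split of memory into a data area and a list area is not modeled here.

```python
from collections import OrderedDict


class DoubleQueueCache:
    """Illustrative two-queue, K-frequency LRU cache (hypothetical, not DLK itself)."""

    def __init__(self, capacity, k=2):
        if capacity < 1 or k < 1:
            raise ValueError("capacity and k must be positive")
        self.capacity = capacity   # total entries across both queues (assumption)
        self.k = k                 # accesses needed before promotion (assumption)
        self.cold = OrderedDict()  # low-frequency queue: key -> [value, access_count]
        self.hot = OrderedDict()   # high-frequency queue: key -> value
        self.hits = 0
        self.misses = 0

    def _touch_cold(self, key):
        """Count one access to a cold entry; promote it once it reaches k accesses."""
        entry = self.cold[key]
        entry[1] += 1
        if entry[1] >= self.k:
            del self.cold[key]
            self.hot[key] = entry[0]     # becomes most recently used in the hot queue
        else:
            self.cold.move_to_end(key)   # refresh recency inside the cold queue

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            self.hits += 1
            return self.hot[key]
        if key in self.cold:
            value = self.cold[key][0]
            self._touch_cold(key)
            self.hits += 1
            return value
        self.misses += 1
        return None

    def put(self, key, value):
        if key in self.hot:
            self.hot[key] = value
            self.hot.move_to_end(key)
        elif key in self.cold:
            self.cold[key][0] = value
            self._touch_cold(key)        # an update is counted as an access here
        else:
            # Make room first: evict the least recently used cold entry,
            # falling back to the hot queue only when the cold queue is empty.
            if len(self.cold) + len(self.hot) >= self.capacity:
                victim_queue = self.cold if self.cold else self.hot
                victim_queue.popitem(last=False)
            self.cold[key] = [value, 0]
            self._touch_cold(key)


cache = DoubleQueueCache(capacity=3, k=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")     # second access to "a": promoted to the hot queue
cache.put("c", 3)
cache.put("d", 4)  # cache is full: "b", the oldest cold entry, is evicted
print(cache.get("a"), cache.get("b"))
```

In this sketch, records read only once pass through the low-frequency queue and are evicted first, so a burst of one-off random reads does not displace records that have already been accessed `k` times.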