An Enhanced Physical-Locality Deduplication System for Space Efficiency

Authors: Peng-Fei Li (李鹏飞), Yu Hua (华宇), Qin Cao (曹钦)

Affiliation: Wuhan National Laboratory for Optoelectronics, School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China

Publication: Journal of Computer Science & Technology (计算机科学技术学报(英文版))

Year, Volume, Issue: 2024, Vol. 39, No. 6

Pages: 1361-1379

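Subject classification: 0809 [Engineering: Electronic Science and Technology (engineering or science degrees may be conferred)]; 08 [Engineering]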

Funding: Supported in part by the National Natural Science Foundation of China under Grant Nos. 62125202 and U22B2022.

Keywords: deduplication system; data reduction; space efficiency; physical-locality

Abstract: An abundance of data has been generated by various embedded devices, applications, and systems, and requires cost-efficient storage services. Data deduplication removes duplicate chunks and has become an important technique for improving the space efficiency of storage systems. However, the stored unique chunks are heavily fragmented, which decreases restore performance and incurs high garbage collection overheads. Existing schemes fail to achieve an efficient trade-off among deduplication, restore, and garbage collection performance because they do not explore and exploit the physical locality of different chunks. In this paper, we trace the storage patterns of fragmented chunks in backup systems and propose a high-performance deduplication system called HiDeStore. The main insight is to enhance the physical locality of new backup versions during the deduplication phase by identifying hot chunks and storing them in active containers. Chunks that do not appear in new backups become cold and are gathered in archival containers. Moreover, we remove expired data with an isolated container deletion scheme, avoiding the high overhead of expired-data detection. Compared with state-of-the-art schemes, HiDeStore improves deduplication and restore performance by up to 1.4x and 1.6x, respectively, without decreasing deduplication ratios or incurring high garbage collection overheads.
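
The hot/cold separation described in the abstract can be illustrated with a toy Python model. This is a minimal sketch under stated assumptions, not the HiDeStore implementation: the class `HotColdStore`, its method names, the SHA-1 fingerprints, the in-memory dictionaries standing in for containers, and the increasing integer version numbers are all hypothetical choices made for illustration. The sketch deduplicates only against the active pool, which is a simplification of this example.

```python
import hashlib


class HotColdStore:
    """Toy model: hot chunks sit in an active pool, cold chunks in per-version archives."""

    def __init__(self):
        self.active = {}        # fingerprint -> chunk bytes (hot chunks)
        self.archival = {}      # version -> {fingerprint: chunk bytes} (cold chunks)
        self.recipes = {}       # version -> ordered list of fingerprints
        self.last_version = None

    def backup(self, version, chunks):
        """Deduplicate one backup version; return how many chunks were newly stored."""
        seen, recipe, stored = set(), [], 0
        for chunk in chunks:
            fp = hashlib.sha1(chunk).hexdigest()
            if fp not in self.active:   # assumption: dedup checks the active pool only
                self.active[fp] = chunk
                stored += 1
            seen.add(fp)
            recipe.append(fp)
        self.recipes[version] = recipe
        # Chunks absent from this version turn cold. They were last referenced
        # by the previous version, so they are archived under that version.
        cold = {fp: data for fp, data in self.active.items() if fp not in seen}
        if cold:
            self.archival[self.last_version] = cold
            for fp in cold:
                del self.active[fp]
        self.last_version = version
        return stored

    def restore(self, version):
        """Rebuild a version from the active pool, falling back to the archives."""
        parts = []
        for fp in self.recipes[version]:
            chunk = self.active.get(fp)
            if chunk is None:
                chunk = next(c[fp] for c in self.archival.values() if fp in c)
            parts.append(chunk)
        return b"".join(parts)

    def delete_oldest(self):
        """Expire the oldest version by dropping its archive, with no chunk scan.

        Assumes at least one newer version exists, so the oldest version's
        remaining chunks are either archived under it or still hot.
        """
        oldest = min(self.recipes)
        del self.recipes[oldest]
        self.archival.pop(oldest, None)
        return oldest
```

In this sketch, expiring a version amounts to discarding one archive keyed by that version, which mirrors the idea of deleting isolated containers instead of searching for expired chunks one by one.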
