Search Results - Inner Mongolia University Library

Dynamic Replication Policy on HDFS Based on Machine Learning Clustering

IEEE Access, 2023, vol. 11, pp. 18551-18559

Authors: Ahmed, Motaz A.; Khafagy, Mohamed H.; Shaheen, Masoud E.; Kaseb, Mostafa R. Affiliation: Fayoum Univ, Fac Comp & Artificial Intelligence, Dept Comp Sci, Al Fayyum 63514, Egypt

Data growth in recent years has been swift, leading to the emergence of big data science. Distributed File Systems (DFS), such as the Google File System (GFS) and the Hadoop Distributed File System (HDFS), are commonly used to handle big data. A DFS should keep data available and the system reliable in case of failure, and it does so by replicating files in different locations. These replicas consume storage space and other resources. The importance of a file depends on how frequently it is used in the system, so files that are rarely used do not merit many replicas. This paper introduces a Dynamic Replication Policy using Machine Learning Clustering (DRPMLC) on HDFS, which uses machine learning to cluster the files into groups and applies a different replication policy to each group in order to reduce storage consumption, improve read and write operation times, and maintain the availability and reliability of HDFS as a high-performance distributed computing (HPDC) system.

Keywords: Feature extraction; File systems; Machine learning; Databases; Big Data; Distributed computing; Support vector machines; Replicability; Availability; big data; clustering; Hadoop distributed file system; high-performance distributed computing; machine learning; reliability; replication policy
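
The abstract above describes clustering files by how frequently they are used and giving each cluster its own replication policy. The following is a minimal sketch of that idea only, not the authors' DRPMLC implementation: it assumes a single feature (a per-file access count), uses plain 1-D k-means as a stand-in clustering method, and maps the clusters to the hypothetical replication factors 1, 2, and 3. The names `kmeans_1d` and `plan_replication` and the sample paths are illustrative.

```python
from typing import Dict, List, Sequence


def kmeans_1d(values: List[float], k: int, iters: int = 50) -> List[int]:
    """Plain 1-D k-means (k >= 2). Returns one cluster index per value,
    with clusters numbered in ascending order of their centroid."""
    ordered = sorted(values)
    # Deterministic init: spread the starting centroids across the sorted data.
    centroids = [ordered[(len(ordered) - 1) * i // (k - 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    # Renumber clusters so that 0 is the least-accessed group.
    rank = {c: r for r, c in enumerate(sorted(range(k), key=lambda c: centroids[c]))}
    return [rank[lab] for lab in labels]


def plan_replication(
    access_counts: Dict[str, int],
    factors: Sequence[int] = (1, 2, 3),  # assumed policy: colder files get fewer replicas
) -> Dict[str, int]:
    """Map each file path to a replication factor based on its access cluster."""
    paths = list(access_counts)
    labels = kmeans_1d([float(access_counts[p]) for p in paths], k=len(factors))
    return {p: factors[lab] for p, lab in zip(paths, labels)}


if __name__ == "__main__":
    # Hypothetical access counts, e.g. gathered from audit logs.
    counts = {"/logs/a": 1, "/logs/b": 2, "/logs/c": 3,
              "/data/d": 40, "/data/e": 45, "/hot/f": 400, "/hot/g": 420}
    for path, rep in plan_replication(counts).items():
        print(path, rep)
```

On a real cluster, a plan like this could then be applied per file with the HDFS shell command `hdfs dfs -setrep`. Which features to cluster on and which factor to give each group are policy decisions that this sketch does not settle.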


A MapReduce Enabled Simulated Annealing Genetic Algorithm

International Conference on Identification, Information and Knowledge in the Internet of Things

Authors: Hu, Luokai; Liu, Jin; Liang, Chao; Ni, Fuchuan. Affiliations: Lenovo Mobile Commun Technol Co Ltd, Xiamen, Peoples R China; Guilin Univ Elect Technol, Guangxi Key Lab Trusted Software, Guilin, Peoples R China; Wuhan Univ, Comp Sch, State Key Lab Software Engn, Wuhan, Peoples R China; Huazhong Agr Univ, Dept Comp Sci, Wuhan, Peoples R China

ISBN: (print) 9781479980031

Intelligent algorithms such as genetic algorithms and simulated annealing algorithms have been widely applied to large-scale data analysis and data processing. High-performance distributed computing technologies and platforms have the potential to further increase the execution efficiency of these traditional intelligent algorithms. Against this background, we propose a novel MapReduce-enabled simulated annealing genetic algorithm that has two distinctive characteristics. The first is that our algorithm is a synthesis of the conventional genetic algorithm and the simulated annealing algorithm. While most genetic algorithms are prone to getting trapped in a local optimum, the simulated annealing algorithm accepts a non-optimal solution with a certain probability in order to escape local optima. This characteristic gives our proposed algorithm a higher probability of reaching the global optimum than traditional genetic algorithms. The second is that our algorithm is a parallel algorithm running on the high-performance parallel platform Phoenix++, rather than a conventional serial genetic algorithm. Phoenix++ implements the MapReduce programming model, which processes and generates large data sets with a parallel, distributed algorithm on a cluster. Experiments on Phoenix++ indicate that the proposed algorithm converges significantly faster than its traditional genetic rivals.

Keywords: Genetic algorithm; Simulated Annealing algorithm; high-performance distributed computing; MapReduce; Phoenix plus
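
The abstract above combines a genetic algorithm with the simulated annealing acceptance rule, so that a worse candidate can still survive with some probability. The following is a minimal serial sketch of that combination, not the authors' algorithm and not a Phoenix++ or MapReduce program. The one-dimensional search space, arithmetic crossover, Gaussian mutation, geometric cooling schedule, and all parameter defaults are assumptions chosen for illustration; `metropolis_accept` and `sa_ga_maximize` are hypothetical names.

```python
import math
import random
from typing import Callable, List, Tuple


def metropolis_accept(delta: float, temperature: float, rng: random.Random) -> bool:
    """Always accept an improvement (delta >= 0); accept a worse candidate
    with probability exp(delta / temperature)."""
    if delta >= 0:
        return True
    return rng.random() < math.exp(delta / temperature)


def sa_ga_maximize(
    fitness: Callable[[float], float],
    lo: float,
    hi: float,
    pop_size: int = 20,
    generations: int = 100,
    t_start: float = 1.0,
    cooling: float = 0.95,
    seed: int = 0,
) -> Tuple[float, float]:
    """Maximize `fitness` on [lo, hi]; returns (best_x, best_fitness)."""
    rng = random.Random(seed)
    population = [rng.uniform(lo, hi) for _ in range(pop_size)]
    best = max(population, key=fitness)
    temperature = t_start
    for _ in range(generations):
        next_population: List[float] = []
        for parent in population:
            mate = rng.choice(population)
            # Genetic step: arithmetic crossover, then Gaussian mutation,
            # clamped back into the search interval.
            w = rng.random()
            child = w * parent + (1 - w) * mate
            child += rng.gauss(0.0, 0.1 * (hi - lo))
            child = min(hi, max(lo, child))
            # Annealing step: a worse child may still replace its parent.
            delta = fitness(child) - fitness(parent)
            survivor = child if metropolis_accept(delta, temperature, rng) else parent
            next_population.append(survivor)
            if fitness(survivor) > fitness(best):
                best = survivor
        population = next_population
        temperature *= cooling  # geometric cooling: fewer bad moves over time
    return best, fitness(best)


if __name__ == "__main__":
    x, fx = sa_ga_maximize(lambda v: -(v - 3.0) ** 2, -10.0, 10.0)
    print(x, fx)
```

In this sketch the per-individual fitness evaluations within a generation are independent of one another, which is the kind of work a map phase could distribute. How the paper actually partitions work on Phoenix++ is not stated in the abstract.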

