Currently, in addition to performance, the energy consumption (hereinafter EC) of jobs running in a big data processing system is also of interest to academia and industry because it grows rapidly as an increasing amount of data is processed. Many studies focus on the EC optimization of jobs from the perspective of computation, which is specific to the algorithms in each job. However, the part of EC involved in I/O operations, which is general and universal, is mostly ignored in optimization. In this paper, we concentrate on the EC optimization of jobs from the perspective of I/O operations. To save energy, we argue that data compression can be exploited. On one hand, energy is saved by processing compressed data with lower I/O cost. On the other hand, extra EC is incurred by the necessary compression/decompression process, which may offset the saved energy. Therefore, there are tradeoffs to consider when deciding whether to compress data for these jobs. In this paper, such tradeoffs and boundary conditions are studied. We first abstract a paradigm for the runtime environment of big data processing jobs. Then, we establish the power, job, compression, and I/O models in detail. Based on these models, we discuss the compression tradeoffs and derive the boundary conditions for EC optimization. Finally, we design and conduct experiments to validate our proposition. The experimental results confirm that the tradeoffs and boundary conditions exist for typical jobs in MapReduce and Spark. Specifically, first, the EC of a job can be reduced using data compression. Second, whether such optimization occurs depends on the specifications of both the compression algorithm and the job and is determined by the corresponding boundary conditions. Third, for a compression algorithm, the higher its compression/decompression speed and the better its compression ratio, the more likely it is to achieve EC optimization.
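The tradeoff described in the abstract can be sketched with a minimal energy model: compression pays off when the I/O energy saved on the reduced data volume exceeds the extra CPU energy spent compressing and decompressing. The model below is an illustrative assumption, not the paper's actual formulation; all parameters (I/O power `p_io`, CPU power `p_cpu`, I/O throughput `v_io`, compression/decompression speeds `v_c`/`v_d`, compression ratio `r`) are hypothetical.

```python
# Minimal sketch of the compression energy tradeoff, assuming a simple
# power-times-time model. Symbols are illustrative, not the paper's model.

def io_energy_plain(size_bytes, p_io, v_io):
    """Energy (J) to move uncompressed data: I/O power * transfer time."""
    return p_io * size_bytes / v_io

def io_energy_compressed(size_bytes, r, p_io, v_io, p_cpu, v_c, v_d):
    """Energy (J) with compression: smaller transfer plus codec cost.

    size_bytes: uncompressed data size
    r:          compression ratio (original size / compressed size, > 1)
    p_io, v_io: I/O power (W) and throughput (B/s)
    p_cpu:      CPU power (W) while running the codec
    v_c, v_d:   compression and decompression speeds (B/s of input)
    """
    transfer = p_io * (size_bytes / r) / v_io
    compress = p_cpu * size_bytes / v_c
    decompress = p_cpu * size_bytes / v_d
    return transfer + compress + decompress

def compression_saves_energy(size_bytes, r, p_io, v_io, p_cpu, v_c, v_d):
    """Boundary condition: True when compressing reduces total EC."""
    return (io_energy_compressed(size_bytes, r, p_io, v_io, p_cpu, v_c, v_d)
            < io_energy_plain(size_bytes, p_io, v_io))
```

Note that `size_bytes` cancels out of the comparison, which matches the abstract's claim that the boundary condition depends on the codec and job specifications (speeds, ratio, powers) rather than the data volume: compression wins when `p_io/v_io * (1 - 1/r) > p_cpu * (1/v_c + 1/v_d)`.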
ISBN: (print) 9781538634035
bigdataprocessing is progressively becoming essential for everyone to extract the meaningful information from their large volume of data irrespective of types of users and their application areas. bigdataprocessing is a broad term and includes several operations such as the storage, cleaning, organization, modelling, analysis and presentation of data at a scale and efficiency. For ordinary users, the significant challenges are the requirement of the powerful dataprocessingsystem and its provisioning, installation of complex bigdata analytics and difficulty in their usage. Docker is a container-based virtualization technology and it has recently introduced Docker Swarm for the development of various types of multi-cloud distributed systems, which can be helpful in solving all above problems for ordinary users. However, Docker is predominantly used in the software development industry, and less focus is given to the dataprocessing aspect of this container-based technology. Therefore, this paper proposes the Docker container-based big data processing system in multiple clouds for everyone, which explores another potential dimension of Docker for bigdata analysis. This Docker container-based system is an inexpensive and user-friendly framework for everyone who has the knowledge of basic IT skills. Additionally, it can be easily developed on a single machine, multiple machines or multiple clouds. This paper demonstrates the architectural design and simulated development of the proposed Docker container-based big data processing system in multiple clouds. Subsequently, it illustrates the automated provisioning of bigdata clusters using two popular bigdata analytics, Hadoop and Pachyderm (without Hadoop) including the Web-based GUI interface Hue for easy dataprocessing in Hadoop.