Search Results - Inner Mongolia University Library

Efficient Data Placement and Replication for QoS-Aware Approximate Query Evaluation of Big Data Analytics

IEEE Transactions on Parallel and Distributed Systems, 2019, vol. 30, no. 12, pp. 2677-2691

Authors: Xia, Qiufen; Xu, Zichuan; Liang, Weifa; Yu, Shui; Guo, Song; Zomaya, Albert Y.

Affiliations: Dalian Univ Technol, Key Lab Ubiquitous Network & Serv Software Liaoni, Int Sch Informat Sci & Engn, Dalian 116024, Liaoning, Peoples R China; Dalian Univ Technol, Sch Software, Dalian 116024, Liaoning, Peoples R China; Australian Natl Univ, Res Sch Comp Sci, Canberra, ACT 2601, Australia; Univ Technol Sydney, Sch Software, Ultimo, NSW 2007, Australia; Hong Kong Polytech Univ, Dept Comp, Hung Hom, Hong Kong, Peoples R China; Univ Sydney, Sch Comp Sci, Camperdown, NSW 2006, Australia

Enterprise users at different geographic locations generate large volumes of data that are stored at geographically distributed datacenters. These users may also perform big data analytics on the stored data to identify valuable information for strategic decisions. However, performing big data analytics on data spread across geo-distributed datacenters is usually time-consuming and costly. In some delay-sensitive applications, a query result may become useless if answering the query takes too long. In such cases, users may be interested only in timely approximate results rather than exact ones. With approximate query evaluation, applications must either sacrifice timeliness to obtain more accurate results or, to meet a stringent deadline, tolerate results with a guaranteed error bound obtained by analyzing samples of the data. In this paper, we study quality-of-service (QoS)-aware data replication and placement for approximate query evaluation of big data analytics in a distributed cloud, where the original (source) data of a query is distributed across different geo-distributed datacenters. We focus on the problem of placing samples of the source data at strategic datacenters to meet the stringent query delay requirements of users, by exploring a non-trivial trade-off between the cost of query evaluation and the error bound of the evaluation result. We first propose an approximation algorithm with a provable approximation ratio for a single approximate query. We then develop an efficient heuristic algorithm for evaluating a set of approximate queries, with the aim of minimizing the evaluation cost while meeting the delay requirements of these queries. We finally demonstrate the effectiveness and efficiency of the proposed algorithms through both experimental simulations and implementations in a real test-bed, using real datasets. Experimental results show that the proposed algorithms are promising.
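The trade-off the abstract describes between evaluation cost and error bound can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes a simple aggregate query (a mean), uniform random sampling, and a confidence interval based on the central limit theorem. The function name `approximate_mean` and its parameters are hypothetical.

```python
import math
import random
import statistics


def approximate_mean(values, sample_fraction, z=1.96, seed=0):
    """Estimate the mean of `values` from a uniform random sample.

    Returns (estimate, error_bound). The error bound is the half-width of an
    approximate 95% confidence interval (z = 1.96), assuming the central
    limit theorem applies. This is an illustrative assumption, not the
    error model used in the paper.
    """
    if not 0 < sample_fraction <= 1:
        raise ValueError("sample_fraction must be in (0, 1]")
    if len(values) < 2:
        raise ValueError("need at least two values")
    # At least two points are needed to compute a sample standard deviation.
    n = min(len(values), max(2, int(len(values) * sample_fraction)))
    sample = random.Random(seed).sample(values, n)
    estimate = statistics.fmean(sample)
    error_bound = z * statistics.stdev(sample) / math.sqrt(n)
    return estimate, error_bound


if __name__ == "__main__":
    rng = random.Random(42)
    data = [rng.gauss(100.0, 15.0) for _ in range(100_000)]
    # Larger samples cost more to move and scan but tighten the error bound.
    for fraction in (0.001, 0.01, 0.1):
        est, err = approximate_mean(data, fraction)
        print(f"fraction={fraction}: mean ~ {est:.2f} +/- {err:.2f}")
```

Under these assumptions the error bound shrinks roughly with the square root of the sample size, so a tenfold larger sample buys only about a threefold tighter bound while costing ten times as much to store, transfer, and scan.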

Keywords: Big data; Query processing; Delays; Approximation algorithms; Quality of service; Distributed databases; Software; data replication and placement; big data analytics; approximate query evaluation; approximation algorithms; algorithm analysis

QoS-Aware Proactive Data Replication for Big Data Analytics in Edge Clouds

48th International Conference on Parallel Processing (ICPP)

Authors: Xia, Qiufen; Bai, Luyao; Liang, Weifa; Xu, Zichuan; Yao, Lin; Wang, Lei

Affiliations: Dalian Univ Technol, Dalian, Liaoning, Peoples R China; Australian Natl Univ, Canberra, ACT, Australia

ISBN: 9781450371964 (print)

We are in the era of big data and cloud computing, in which large quantities of computing resources are needed to extract invaluable information hidden in coarse big data through query evaluation. Users demand big data analytic services with various Quality of Service (QoS) requirements. However, cloud computing faces new challenges in meeting the stringent QoS requirements of users because of its remoteness from them. Edge computing has emerged as a new paradigm that addresses this shortcoming by bringing cloud services to the edge of the operation network, in proximity to users, to improve performance. To satisfy the QoS requirements of users for big data analytics in edge computing, the data replication and placement problem must be handled properly so that user requests can be answered efficiently and promptly. In this paper, we consider data replication and placement for big data analytic query evaluation. We first formulate a novel proactive data replication and placement problem for big data analytics in a two-tier edge cloud environment. We then devise an approximation algorithm with an approximation ratio for it. We finally evaluate the proposed algorithm against existing benchmarks, using both simulations and experiments in a testbed based on real datasets. The evaluation results show that the proposed algorithm is promising.
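To make the placement problem concrete, here is a minimal sketch of delay-aware replica placement. It is a generic greedy set-cover heuristic, not the approximation algorithm of the paper, and it ignores storage capacity and replication cost. The function name, the site and query names, and the delay values are all hypothetical.

```python
def greedy_replica_placement(delay, max_delay):
    """Choose sites for replicas so each query has one within its delay bound.

    delay[site][query] is an assumed access delay from a site to a query;
    max_delay[query] is that query's delay requirement. Returns the chosen
    sites in selection order, or raises ValueError if some query cannot be
    served by any site within its bound.
    """
    uncovered = set(max_delay)
    # For every site, the set of queries it can serve within their bounds.
    serves = {
        site: {q for q, d in per_query.items()
               if q in max_delay and d <= max_delay[q]}
        for site, per_query in delay.items()
    }
    chosen = []
    while uncovered:
        # Greedy step: take the site that satisfies the most remaining queries.
        best = max(serves, key=lambda s: len(serves[s] & uncovered), default=None)
        gained = serves[best] & uncovered if best is not None else set()
        if not gained:
            raise ValueError(f"no site meets the delay bound for {sorted(uncovered)}")
        chosen.append(best)
        uncovered -= gained
    return chosen


# Hypothetical delays in milliseconds for two edge sites and one remote cloud.
delay = {
    "edge-a": {"q1": 5, "q2": 30, "q3": 8},
    "edge-b": {"q1": 40, "q2": 6, "q3": 50},
    "cloud": {"q1": 60, "q2": 60, "q3": 60},
}
max_delay = {"q1": 10, "q2": 10, "q3": 10}
print(greedy_replica_placement(delay, max_delay))  # ['edge-a', 'edge-b']
```

In this toy instance the remote cloud cannot meet any 10 ms requirement, so replicas end up on the two edge sites, which mirrors the motivation for edge placement given in the abstract.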

Keywords: data replication and placement; big data analytics; edge clouds; query evaluation
