Search Results - Inner Mongolia University Library

SQL-on-Hadoop: Full Circle Back to Shared-Nothing Database Architectures

Proceedings of the VLDB Endowment, 2014, vol. 7, no. 12, pp. 1295-1306

Authors: Floratou, Avrilia; Minhas, Umar Farooq; Ozcan, Fatma (IBM Almaden Res Ctr, San Jose, CA 95120, USA)

SQL query processing for analytics over Hadoop data has recently gained significant traction. Among many systems providing some SQL support over Hadoop, Hive is the first native Hadoop system that uses an underlying framework such as MapReduce or Tez to process SQL-like statements. Impala, on the other hand, represents the new emerging class of SQL-on-Hadoop systems that exploit a shared-nothing parallel database architecture over Hadoop. Both systems optimize their data ingestion via columnar storage, and promote different file formats: ORC and Parquet. In this paper, we compare the performance of these two systems by conducting a set of cluster experiments using a TPC-H like benchmark and two TPC-DS inspired workloads. We also closely study the I/O efficiency of their columnar formats using a set of micro-benchmarks. Our results show that Impala is 3.3X to 4.4X faster than Hive on MapReduce and 2.1X to 2.8X faster than Hive on Tez for the overall TPC-H experiments. Impala is also 8.2X to 10X faster than Hive on MapReduce and about 4.3X faster than Hive on Tez for the TPC-DS inspired experiments. Through detailed analysis of experimental results, we identify the reasons for this performance gap and examine the strengths and limitations of each system.

Keywords: query processing; Benchmarking; Digital storage; Data ingestions; Database architecture; Micro benchmark; Parallel Database; Performance gaps; Shared nothing; SQL query processing; SQL on Hadoop

Ontology-Based Query Processing in a Large-Scale P2P Environment

3rd International Conference on Information and Communication Technologies

Authors: Al King, Raddad; Hameurlain, Abdelkader; Morvan, Franck (Univ Toulouse 3, IRIT Lab, F-31062 Toulouse, France)

ISBN: (print) 9781424417513

Due to the characteristics of P2P systems, SQL query processing in these systems is more complex than in traditional distributed DBMSs. In this context, the semantic and structural heterogeneity of local schemas prevents peers from exchanging their data in a comprehensive way. Schema heterogeneity could lead to incorrect answers for the localization query. Furthermore, in the optimization phase, the lack of information and the obsolete statistics found in local catalogs make the execution plan suboptimal. In this paper, we propose a new approach to SQL query processing in P2P environments. The main features of the proposed approach are: (i) avoiding any centralized structures among the peers participating in the system, (ii) integrating a Domain Ontology, which ensures comprehensive data exchange, in the Chord protocol, which guarantees efficient locating of data sources, and (iii) extending the localization phase so that it can obtain the information needed in the optimization phase. This information is more reliable than the statistics found in the local catalogs. We describe our approach in detail with examples.

Keywords: Domain Ontology; P2P; SQL query processing

Ontology-Based Query Processing in a Large-Scale P2P Environment

3rd International Conference on Information and Communication Technologies, vol.3

Authors: Raddad Al King; Abdelkader Hameurlain; Franck Morvan (IRIT Laboratory, University of Paul Sabatier, Toulouse, France)

Keywords: Domain Ontology; P2P; SQL query processing
