Search Results - Inner Mongolia University Library

9th Annual ACM International Conference on Web Search and Data Mining (WSDM)

Authors: Aly, Ahmed M.; Elmeleegy, Hazem; Qi, Yan; Aref, Walid (Google Inc, Mountain View, CA, USA; Turn Inc, Redwood City, CA, USA; Purdue Univ, W Lafayette, IN, USA)

ISBN: (print) 9781450337168

Despite the importance and widespread use of range data, e.g., time intervals and spatial ranges, little attention has been devoted to studying the processing and querying of range data in the context of big data. The main challenge lies in the nature of traditional index structures, e.g., B-Tree and R-Tree, which are centralized by design and hence almost crippled when deployed in a distributed environment. To address this challenge, this paper presents Kangaroo, a system built on top of Hadoop to optimize the execution of range queries over range data. The main idea behind Kangaroo is to split the data into non-overlapping partitions in a way that minimizes the query execution time. Kangaroo is query-workload aware, i.e., it produces partitioning layouts that minimize the query processing time of given query patterns. In this paper, we study the design challenges Kangaroo addresses in order to be deployed on top of a distributed file system, i.e., HDFS. We also study four different partitioning schemes that Kangaroo can support. With extensive experiments using real range data of more than one billion records and a real query workload of more than 30,000 queries, we show that the partitioning schemes of Kangaroo can significantly reduce the I/O of range queries on range data.

Keywords: Hadoop; query processing and optimization; big data; indexing

SHC: Distributed Query Processing for Non-Relational Data Store

34th IEEE International Conference on Data Engineering Workshops (ICDEW)

Authors: Yang, Weiqing; Tang, Mingjie; Yu, Yongyang; Liang, Yanbo; Saha, Bikas (Hortonworks, Santa Clara, CA 95054, USA; Purdue Univ, W Lafayette, IN 47907, USA)

ISBN: (print) 9781538655207

We introduce a simple data model to process non-relational data for relational operations, and SHC (Apache Spark - Apache HBase Connector), an implementation of this model in the cluster computing framework Spark. SHC leverages optimization techniques of relational data processing over the distributed, column-oriented key-value store (i.e., HBase). Compared to existing systems, SHC makes two major contributions. First, SHC offers a much tighter integration between optimizations of relational data processing and the non-relational data store, through a plug-in implementation that integrates with Spark SQL, a distributed in-memory computing engine for relational data. The design makes system maintenance relatively easy and enables users to perform complex data analytics on top of the key-value store. Second, SHC leverages the Spark SQL Catalyst engine for high-performance query optimization and processing, e.g., data partition pruning, column pruning, predicate pushdown, and data locality. SHC has been deployed and used in multiple production environments with hundreds of nodes, and provides efficient OLAP query processing on petabytes of data.

Keywords: query processing and optimization; distributed computing; in-memory computing

FEDERATED DATABASE-SYSTEMS FOR MANAGING DISTRIBUTED, HETEROGENEOUS, AND AUTONOMOUS DATABASES

COMPUTING SURVEYS, 1990, vol. 22, no. 3, pp. 183-236

Authors: SHETH, AP; LARSON, JA (BELLCORE, PISCATAWAY, NJ 08854; INTEL CORP, HILLSBORO, OR 97124)

A federated database system (FDBS) is a collection of cooperating database systems that are autonomous and possibly heterogeneous. In this paper, we define a reference architecture for distributed database management systems from system and schema viewpoints and show how various FDBS architectures can be developed. We then define a methodology for developing one of the popular architectures of an FDBS. Finally, we discuss critical issues related to developing and operating an FDBS.

Keywords: DESIGN, MANAGEMENT; ACCESS CONTROL; DATABASE ADMINISTRATOR; DATABASE DESIGN AND INTEGRATION; DISTRIBUTED DBMS; FEDERATED DATABASE SYSTEM; HETEROGENEOUS DBMS; MULTIDATABASE LANGUAGE; NEGOTIATION; OPERATION TRANSFORMATION; QUERY PROCESSING AND OPTIMIZATION; REFERENCE ARCHITECTURE; SCHEMA INTEGRATION; SCHEMA TRANSLATION; SYSTEM EVOLUTION METHODOLOGY; SYSTEM SCHEMA PROCESSOR ARCHITECTURE; TRANSACTION MANAGEMENT

Evolution of data management systems: from uni-processor to large-scale distributed systems

Proceedings of the 10th International Conference on Advances in Mobile Computing & Multimedia

Author: Abdelkader Hameurlain (Paul Sabatier University, Toulouse Cedex, France)

ISBN: (print) 9781450313070

The purpose of this talk is to provide a comprehensive state of the art concerning the evolution of data management systems from uni-processor systems to large-scale distributed systems. We focus our study on query processing and optimization methods. For each environment, we recall the motivations and point out the main characteristics of the proposed methods, especially the nature of decision-making (centralized or decentralized control for a high level of scalability), the adaptive level (intra-operator and/or inter-operator), the impact of parallelism (partitioned and pipelined parallelism), and the dynamicity (e.g., elasticity) of execution models.

Keywords: query processing and optimization; parallel and distributed database systems; P2P systems; heterogeneity; data grid systems; relational database systems; data management; autonomy; data integration systems; large scale; dynamicity; mobile agents
