ISBN (print): 9781728113371
Software engineers today can choose among a multitude of storage solutions and data formats to achieve better performance, lower cost, or to exploit the expressive power of a particular data model when developing an application. We call this polyglot access. However, the cost of developing polyglot software increases because of, for instance, the complexity of managing multiple database connections and the need to train people in different tools, models, and query languages. This paper presents a scalable middleware, called WA-RDF, that provides a single gateway to multiple NoSQL databases. Unlike similar proposals, WA-RDF uses the well-known abstractions of the Semantic Web to store RDF data in, and query it from, key/value, document, and graph databases. Moreover, WA-RDF includes workload-awareness, fragmentation, and partitioning components to match the high scalability of NoSQL systems. An experimental evaluation shows that the approach is promising: it scaled linearly with growth in dataset size and query frequency, and outperformed a multimodel database in the tested use cases.
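The idea of storing RDF triples in a key/value store and answering simple triple patterns can be illustrated with a minimal sketch. This is not WA-RDF's actual storage layout, which the abstract does not describe; the function names `triples_to_key_value` and `match`, the per-subject grouping, and the sample `ex:`/`foaf:` identifiers are all assumptions made for illustration, and a plain Python dict stands in for the key/value database.

```python
from collections import defaultdict


def triples_to_key_value(triples):
    """Group (subject, predicate, object) triples under one key per subject.

    Hypothetical layout: key = subject, value = {predicate: [objects]}.
    """
    store = defaultdict(dict)
    for subject, predicate, obj in triples:
        store[subject].setdefault(predicate, []).append(obj)
    return dict(store)


def match(store, subject=None, predicate=None):
    """Return all triples matching a pattern; None acts as a wildcard."""
    results = []
    for s, properties in store.items():
        if subject is not None and s != subject:
            continue
        for p, objects in properties.items():
            if predicate is not None and p != predicate:
                continue
            for o in objects:
                results.append((s, p, o))
    return results


# Illustrative data only.
sample_triples = [
    ("ex:alice", "foaf:knows", "ex:bob"),
    ("ex:alice", "foaf:name", "Alice"),
    ("ex:bob", "foaf:name", "Bob"),
]
sample_store = triples_to_key_value(sample_triples)
names = match(sample_store, predicate="foaf:name")
```

Grouping by subject makes subject-bound lookups a single key access, while predicate-only patterns still scan every key; a real middleware would presumably choose fragments according to the observed workload.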
ISBN (print): 9783319444062; 9783319444055
Although NoSQL databases are claimed to be schemaless, several NoSQL database vendors have chosen JSON as an agile data representation format and provide a JSON-based API or query facility to simplify the life of application developers. Although many applications must manage temporal data, the JSON Schema language lacks explicit support for time-varying data. In this paper, to provide a systematic approach to managing temporal data in NoSQL databases, we propose a framework called Temporal JSON Schema (τJSchema), inspired by the τXSchema framework defined for XML data. τJSchema allows a temporal JSON schema to be defined from a conventional JSON schema and a set of temporal logical and physical characteristics. Our framework guarantees logical and physical data independence for temporal schemas and provides a low-impact solution, since it requires neither modifications to existing JSON documents nor extensions to the JSON format, the JSON Schema language, or any related tools and languages.
Modern systems record large quantities of electronic data capturing time-ordered events, system state information, and behavior. Subsequent analysis enables historic and current system status reporting, supports fault investigations, and may provide insight into emerging system trends. Unfortunately, the management of log data requires ever more efficient and complex storage tools to access, manipulate, and retrieve these records. Truly effective solutions also require a well-planned architecture that supports the needs of multiple stakeholders. Historically, database requirements were well served by relational data models; however, modern non-relational (i.e., NoSQL) solutions, initially intended for "big data" distributed systems, may also provide value for smaller-scale problems such as those posed by log data. No evaluation method currently exists to adequately compare the capabilities of traditional (relational) and modern NoSQL solutions for small-scale problems. This research proposes a methodology to evaluate modern data storage and retrieval systems. While the methodology is intended to be generalizable to many data sources, a commercially produced unmanned aircraft system served as a representative use case to test the methodology on aircraft log data. The research first defined the key characteristics of database technologies and used those characteristics to inform laboratory simulations emulating representative examples of modern database technologies (relational, key-value, columnar, document, and graph). Based on those results, twelve evaluation criteria were proposed to compare the relational and NoSQL database types. The Analytical Hierarchy Process was then used to combine literature findings, laboratory simulations, and user inputs to determine the most suitable database type for the log data use case. The study results demonstrate the efficacy of the proposed methodology.
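The Analytical Hierarchy Process step can be sketched as follows. This minimal example uses the row geometric-mean approximation of the priority vector rather than the principal eigenvector, and it is not the study's actual computation: the three criteria, the pairwise judgments, and the per-database scores are hypothetical placeholders, not the paper's twelve criteria or its results.

```python
import math

# Saaty's random consistency index for small matrices.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12}


def ahp_weights(matrix):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(matrix)
    geo_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(geo_means)
    return [g / total for g in geo_means]


def consistency_ratio(matrix, weights):
    """Estimate lambda_max from A*w and return CI / RI (below 0.1 is usual)."""
    n = len(matrix)
    aw = [sum(matrix[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    consistency_index = (lambda_max - n) / (n - 1)
    return consistency_index / RANDOM_INDEX[n]


def rank_alternatives(weights, scores):
    """Weighted sum of per-criterion scores, best alternative first."""
    totals = {
        name: sum(w * s for w, s in zip(weights, values))
        for name, values in scores.items()
    }
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pairwise comparisons for three made-up criteria
# (query speed, ease of use, scalability): entry [i][j] states how much
# more important criterion i is than criterion j.
pairwise = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
criteria_weights = ahp_weights(pairwise)

# Hypothetical normalized scores per database type, one value per criterion.
example_scores = {
    "relational": [0.6, 0.5, 0.2],
    "document": [0.3, 0.3, 0.5],
}
ranking = rank_alternatives(criteria_weights, example_scores)
```

The consistency ratio is worth checking before trusting the weights, since user-supplied pairwise judgments are often mutually inconsistent.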
ISBN (digital): 9781787127142
ISBN (print): 9781787288867
A beginner's guide to get you up and running with Cassandra, DynamoDB, HBase, InfluxDB, MongoDB, Neo4j, and Redis
About This Book
• Covers the basics of 7 NoSQL databases and how they are used in enterprises
• Quick introduction to MongoDB, DynamoDB, Redis, Cassandra, Neo4j, InfluxDB, and HBase
• Includes effective techniques for database querying and management
Who This Book Is For
If you are a budding DBA or a developer who wants to get started with the fundamentals of NoSQL databases, this book is for you. Relational DBAs who want insight into the various offerings of popular NoSQL databases will also find this book very useful.
What You Will Learn
• Understand how MongoDB provides high-performance, high-availability, and automatic scaling
• Interact with your Neo4j instances via database queries, Python scripts, and Java application code
• Get familiar with common querying and programming methods to interact with Redis
• Study the different types of problems Cassandra can solve
• Work with HBase components to support common operations such as creating tables and reading/writing data
• Discover data models and work with CRUD operations using DynamoDB
• Discover what makes InfluxDB a great choice for working with time-series data
In Detail
This is the golden age of open source NoSQL databases. With enterprises having to work with large amounts of unstructured data and moving away from expensive monolithic architectures, the adoption of NoSQL databases is rapidly increasing. Being familiar with the popular NoSQL databases and knowing how to use them is a must for budding DBAs and developers.
This book introduces you to the different types of NoSQL databases and gets you started with seven of the most popular NoSQL databases used by enterprises today. We start off with a brief overview of what NoSQL databases are, followed by an explanation of why and when to use them. The book then covers the seven most popular databases in each of these catego
ISBN (print): 9783319252643; 9783319252636
While the concept of a database schema plays a central role in relational database systems, most NoSQL systems are schemaless: these databases are created without a formally defined schema. Instead, the schema is implicit in the stored data. This lack of schema definition offers greater flexibility; more specifically, schemaless databases ease both the storage of non-uniform data and data evolution. However, this comes at the cost of losing some of the benefits provided by schemas. In this article, an MDE-based reverse engineering approach for inferring the schema of aggregate-oriented NoSQL databases is presented. We show how the obtained schemas can be used to build database utilities that tackle some of the problems encountered when using implicit schemas: a schema diagram viewer and a data validator generator are presented.
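What "a schema implicit in the stored data" means can be shown with a small sketch that scans documents and records which fields occur, with which value types, and whether every document has them. This is a deliberately simple stand-in and not the MDE-based reverse engineering approach of the article; the function name `infer_schema`, the output layout, and the sample documents are assumptions for illustration.

```python
def infer_schema(documents):
    """Infer a flat schema from a list of dict-like documents.

    For every top-level field, collect the Python type names observed and
    mark the field as required only if it occurs in every document.
    """
    observed = {}
    for doc in documents:
        for field, value in doc.items():
            info = observed.setdefault(field, {"types": set(), "count": 0})
            info["types"].add(type(value).__name__)
            info["count"] += 1
    total = len(documents)
    return {
        field: {
            "types": sorted(info["types"]),
            "required": info["count"] == total,
        }
        for field, info in observed.items()
    }


# Illustrative, non-uniform documents of the kind a schemaless store accepts.
sample_documents = [
    {"title": "A", "year": 2001},
    {"title": "B", "year": "2002", "genre": "drama"},
]
sample_schema = infer_schema(sample_documents)
```

A field reported with more than one type, or as not required, is exactly the kind of variation that a generated data validator could flag.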
ISBN (print): 9781450365628
To reduce data processing time in NoSQL databases, we propose in this paper a quantum approach to extracting information from unstructured databases. Specifically, we apply Grover's algorithm instead of classical algorithms to search NoSQL databases.
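The amplitude dynamics of Grover's algorithm can be illustrated with a small classical state-vector simulation. This sketch only shows how the probability of the marked item grows over roughly (π/4)·√N iterations; because it is simulated classically it touches all N amplitudes per step and yields no speedup. How the paper maps NoSQL records to the search space is not stated in the abstract, so the function name `grover_search` and the integer-indexed search space are assumptions.

```python
import math


def grover_search(num_items, marked):
    """Simulate Grover iterations; return the final measurement probabilities.

    num_items: size N of the unstructured search space (indices 0..N-1).
    marked:    index of the single item the oracle recognizes.
    """
    # Start in the uniform superposition over all N indices.
    amplitudes = [1.0 / math.sqrt(num_items)] * num_items
    iterations = int(math.floor(math.pi / 4.0 * math.sqrt(num_items)))
    for _ in range(iterations):
        # Oracle: flip the sign of the marked item's amplitude.
        amplitudes[marked] = -amplitudes[marked]
        # Diffusion: reflect every amplitude about the mean amplitude.
        mean = sum(amplitudes) / num_items
        amplitudes = [2.0 * mean - a for a in amplitudes]
    return [a * a for a in amplitudes]


probabilities = grover_search(num_items=8, marked=5)
most_likely = max(range(len(probabilities)), key=probabilities.__getitem__)
```

With 8 items the loop runs twice, after which index 5 is by far the most probable measurement outcome, whereas a classical scan would need to inspect about half of the items on average.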
ISBN (print): 9781538655207
We present the Elton tool, a publicly available cloud resource elasticity management system tailored to NoSQL databases. Elton is integrated into the Ganetimgr web platform and offers an easy-to-use web interface for monitoring and horizontally scaling NoSQL databases and for running what-if analysis queries. Elton uses Markov Decision Processes (MDPs) as the underlying modeling framework, and encapsulates state-of-the-art horizontal scaling policies that offer different trade-offs between performance and monetary deployment cost. Its main novelty is that it employs probabilistic model checking to allow for both efficient elasticity decisions and analysis of scaling actions. It also serves as a case study of the benefits of model checking in online decision making and analysis.
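How an MDP can encode the trade-off between performance and deployment cost in horizontal scaling is sketched below. This toy model is solved with plain value iteration, not with the probabilistic model checking that Elton employs, and every number in it (cluster sizes, load transition probabilities, node cost, SLA penalty, discount factor) is a hypothetical placeholder rather than anything taken from the tool.

```python
# Hypothetical toy model: state = (number of nodes, current load level).
NODES = (1, 2, 3)
LOADS = ("low", "high")
ACTIONS = (-1, 0, 1)  # remove a node, keep the cluster size, add a node
LOAD_TRANSITIONS = {
    "low": {"low": 0.7, "high": 0.3},
    "high": {"low": 0.3, "high": 0.7},
}
NODES_NEEDED = {"low": 1, "high": 2}
NODE_COST = 1.0     # monetary cost per node per step (assumed)
SLA_PENALTY = 5.0   # cost of being under-provisioned for one step (assumed)
DISCOUNT = 0.9


def step_cost(nodes, load):
    """Deployment cost plus a penalty when the cluster is too small."""
    penalty = SLA_PENALTY if nodes < NODES_NEEDED[load] else 0.0
    return nodes * NODE_COST + penalty


def valid_actions(nodes):
    """Actions that keep the cluster size within the allowed range."""
    return [a for a in ACTIONS if nodes + a in NODES]


def q_value(values, nodes, load, action):
    """Expected discounted cost of taking `action` in state (nodes, load)."""
    next_nodes = nodes + action
    return sum(
        prob * (step_cost(next_nodes, next_load)
                + DISCOUNT * values[(next_nodes, next_load)])
        for next_load, prob in LOAD_TRANSITIONS[load].items()
    )


def solve(iterations=500):
    """Run value iteration and return (state values, greedy policy)."""
    states = [(n, load) for n in NODES for load in LOADS]
    values = {state: 0.0 for state in states}
    for _ in range(iterations):
        values = {
            (n, load): min(q_value(values, n, load, a) for a in valid_actions(n))
            for n, load in states
        }
    policy = {
        (n, load): min(valid_actions(n),
                       key=lambda a: q_value(values, n, load, a))
        for n, load in states
    }
    return values, policy


state_values, scaling_policy = solve()
```

Under these made-up costs the cheapest long-run behavior is to hold two nodes, so the resulting policy adds a node from one and removes a node from three; changing the penalty or node cost shifts that balance.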
ISBN (print): 9783319409733; 9783319409726
Data models are a central piece of information systems, with relational data models being very popular and extensively used. In Big Data, owing to the characteristics of NoSQL databases, the data modeling task is seen from another perspective, as those databases are considered schema-free. Nevertheless, these databases also need data models that ensure the proper storage and querying of the data. Considering the vast number of relational databases and the ever-increasing volume of data, the importance of data models in Big Data increases. In this work, a specific set of rules is proposed for the automatic transition from a traditional to a Big Data environment, considering two specific objectives: the identification of a columnar data model for HBase supporting operational needs, and the identification of a tabular data model for Hive supporting analytical needs. The obtained results show the applicability of the proposed rules and their relevance for data modeling in Big Data environments.
NoSQL databases provide an edge when it comes to dealing with big unstructured data. The flexibility, agility, and scalability offered by NoSQL databases become increasingly essential when dealing with geospatial data. The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of data that data stores must manage. Such characteristics of big spatial data have surpassed the capability and anticipated use cases of relational databases. Because an extensive collection of NoSQL databases is available these days, it becomes vital for organizations to make an informed decision. NoSQL database benchmarks provide system architects, who shoulder a considerable burden in selecting the right technology for their data stores, with a vital starting point and source of information. The major utility of these benchmarks is reproducing experiments on similar experimental data, which can verify and optimize the process of selecting an optimum tool for data management needs in the early phases of development. The goal of this research is to develop a benchmark that can compare the performance of NoSQL databases for querying complex geospatial data. We analyzed the throughput, latency, and runtime of MongoDB and Couchbase to identify the correct fit for our use case. In doing so, we also demonstrated a systematic process that can be followed to make an optimum choice of data store. This benchmark can be extended easily to any NoSQL database that supports geospatial querying.
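The three measurements named above (throughput, latency, and runtime) can be collected with a very small harness like the following sketch. It is not the benchmark developed in this research: `run_benchmark` and `bounding_box_query` are hypothetical names, and an in-memory bounding-box filter over made-up points stands in for a real MongoDB or Couchbase geospatial query so that the example stays self-contained.

```python
import statistics
import time


def run_benchmark(query_fn, queries, repeats=3):
    """Time every query and summarize runtime, throughput, and latency."""
    latencies = []
    start = time.perf_counter()
    for _ in range(repeats):
        for query in queries:
            t0 = time.perf_counter()
            query_fn(query)
            latencies.append(time.perf_counter() - t0)
    runtime = time.perf_counter() - start
    return {
        "operations": len(latencies),
        "runtime_s": runtime,
        "throughput_ops_per_s": (
            len(latencies) / runtime if runtime > 0 else float("inf")
        ),
        "mean_latency_s": statistics.mean(latencies),
        "max_latency_s": max(latencies),
    }


# Stand-in data store: a list of (longitude, latitude) points.
POINTS = [(lon / 10.0, lat / 10.0) for lon in range(100) for lat in range(100)]


def bounding_box_query(box):
    """Return the points inside box = (min_lon, min_lat, max_lon, max_lat)."""
    min_lon, min_lat, max_lon, max_lat = box
    return [
        (lon, lat)
        for lon, lat in POINTS
        if min_lon <= lon <= max_lon and min_lat <= lat <= max_lat
    ]


example_boxes = [(0.0, 0.0, 1.0, 1.0), (2.0, 2.0, 5.0, 5.0)]
summary = run_benchmark(bounding_box_query, example_boxes, repeats=3)
```

To compare real systems, `query_fn` would wrap each database client's geospatial query call, with the same query set and dataset used for every system under test.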