ISBN: 9781665437721 (print)
This paper presents a logical database design methodology for MongoDB NoSQL databases. Given a query, the methodology helps database designers determine the best set of data configurations, known elsewhere as scheme trees, so that the retrieval time of the query is minimized or reduced. The methodology first models an application of interest with a conceptual model. Building on our previous research, it then generates from the conceptual model as few scheme trees as possible, which are eventually implemented as MongoDB collections in the database. To illustrate the methodology, a COVID-19 data set was downloaded as an example application. The methodology first conceptualized the data set with an Entity-Relationship (ER) model. Multiple queries were then devised to access various parts of the data set; executing them required retrieving the attribute values of all or some of the entity types and/or the relationship in the ER model. The methodology then generated the best sets of scheme trees for the queries.
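As a rough illustration of why the choice of scheme trees can affect retrieval work, the sketch below contrasts two flat collections with one nested configuration for the same query. The entity names, fields, and values are hypothetical toy data, not taken from the paper's data set, and the in-memory lists only stand in for MongoDB collections.

```python
# Hypothetical toy data: two flat "collections" (countries and daily reports).
flat_countries = [
    {"_id": "C1", "name": "Alpha"},
    {"_id": "C2", "name": "Beta"},
]
flat_reports = [
    {"country_id": "C1", "date": "day-1", "cases": 10},
    {"country_id": "C2", "date": "day-1", "cases": 20},
    {"country_id": "C1", "date": "day-2", "cases": 30},
]


def nest_reports(countries, reports):
    """Build one nested scheme tree: each country document embeds its reports."""
    by_country = {}
    for r in reports:
        by_country.setdefault(r["country_id"], []).append(
            {"date": r["date"], "cases": r["cases"]}
        )
    return [{**c, "reports": by_country.get(c["_id"], [])} for c in countries]


def cases_flat(countries, reports, name):
    """Flat configuration: find the country, then scan the reports collection."""
    country = next(c for c in countries if c["name"] == name)
    return [r["cases"] for r in reports if r["country_id"] == country["_id"]]


def cases_nested(nested, name):
    """Nested configuration: the matching document already carries its reports."""
    doc = next(d for d in nested if d["name"] == name)
    return [r["cases"] for r in doc["reports"]]


nested_countries = nest_reports(flat_countries, flat_reports)
```

Both functions return the same answer; the nested form needs a single document lookup, whereas the flat form touches two collections. Which configuration is preferable depends on the query, which is the trade-off the abstract describes.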
ISBN: 9781665419963 (print)
For over four decades, relational database management systems (RDBMS) have been the primary model for data storage, retrieval, and management. However, continuous information growth in organizations and increasing needs for scalability and performance pose a set of challenges to existing RDBMS vendors, especially when handling the very large amounts of data, possibly unstructured or semi-structured, generated by new-generation real-time applications and social networking sites. These challenges have created a need to adopt alternative technologies for data storage and manipulation. NoSQL is the alternative category of database management systems that has emerged as a solution to these ever-growing data requirements. In this paper, the advantages and limitations of relational databases are presented. The NoSQL data model, the types of NoSQL data stores, the characteristics of each data store, and the advantages and disadvantages of NoSQL compared with RDBMS are also discussed. The paper helps interested users review the different database model solutions and can serve as a basis for selecting a database model that satisfies their application requirements.
ISBN: 9789897583759 (print)
Typical data warehouse systems are implemented either on a relational database or on a multi-dimensional database. The former supports ROLAP operations, while the latter supports MOLAP. We explore a third alternative: implementing a data warehouse on a NoSQL database. To this end, we propose rules for moving from the information obtained in the data warehouse requirements engineering stage to the logical model of NoSQL databases, giving rise to NOSOLAP (NoSQL OLAP). We show the advantages of NOSOLAP over ROLAP and MOLAP. We illustrate our NOSOLAP approach by converting to the logical model of Cassandra and give an example.
ISBN: 9781728108582 (print)
The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of spatial data that data stores have to manage. Traditional relational databases reveal limitations in handling such big geospatial data, mainly due to their rigid schema requirements and limited scalability. Numerous NoSQL databases have emerged and actively serve as alternative data stores for big spatial data. Benchmarks play a crucial role in evaluating NoSQL databases and provide decision-makers with trustworthy information for choosing the most suitable data store for their applications. In this study, we present GeoYCSB, a framework for evaluating the performance and scalability of geospatial NoSQL databases. We extend YCSB, a de facto benchmark framework for NoSQL systems, by integrating new components into its design architecture and by implementing geospatial workloads. We use GeoYCSB to evaluate two leading document stores, MongoDB and Couchbase, which support geospatial queries. GeoYCSB is extensible and can be used to evaluate any NoSQL database for geospatial workloads, provided it supports spatial queries.
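To make "geospatial query" concrete, here is a minimal sketch of one common kind, a proximity search over GeoJSON points. It is not GeoYCSB code and the abstract does not say which query types its workloads use; the helper names (`point`, `near_filter`, `within`) are hypothetical. The filter follows MongoDB's documented `$near` / `$geometry` / `$maxDistance` operator shape, which requires a `2dsphere` index on the queried field, and the in-memory `within` function is only a stand-in so the example runs without a database.

```python
import math


def point(lon, lat):
    """GeoJSON Point; GeoJSON orders coordinates as [longitude, latitude]."""
    return {"type": "Point", "coordinates": [lon, lat]}


def near_filter(field, lon, lat, max_distance_m):
    """Build a MongoDB-style $near filter document (distance in meters)."""
    return {
        field: {
            "$near": {
                "$geometry": point(lon, lat),
                "$maxDistance": max_distance_m,
            }
        }
    }


def haversine_m(lon1, lat1, lon2, lat2, radius_m=6_371_000):
    """Great-circle distance in meters on a sphere of the given radius."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))


def within(docs, field, lon, lat, max_distance_m):
    """In-memory stand-in for the proximity query; returns hits nearest first."""
    hits = []
    for d in docs:
        dlon, dlat = d[field]["coordinates"]
        dist = haversine_m(lon, lat, dlon, dlat)
        if dist <= max_distance_m:
            hits.append((dist, d))
    return [d for _, d in sorted(hits, key=lambda h: h[0])]
```

A benchmark workload would issue many such filters against each store and record throughput and latency; the spherical-distance model here is a simplification of what a real database computes.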
ISBN: 9781450366526 (print)
The majority of data kept in organizations is unstructured, including text, sound, and video. Keywords are useful for retrieving required documents in a web search or for summarizing documents for ranking purposes. Keywords are the smallest units used to represent a textual document and are frequently used to point to the most pertinent information in a text. Keyword and key phrase extraction is mainly used to identify the content of a document. In this research, we introduce a keyword and key phrase-based extraction approach using MongoDB, a NoSQL database. With this method, documents are queried and ranked by the keywords or key phrases that the user specifies in a query. The result can be considered a measure for deciding the Key Performance Indicator of a business organization.
ISBN: 9781728113371 (print)
Software engineers today can choose from a multitude of storage solutions and data formats to achieve better performance, lower cost, or even exploit the expressive power of a data model when developing an application. We call this polyglot access. Nevertheless, the cost of developing polyglot software increases due, for instance, to the complexity of managing multiple database connections and the need to train people in different tools, models, and query languages. This paper presents a scalable middleware, called WA-RDF, that provides a single gateway to multiple NoSQL databases. Unlike similar proposals, WA-RDF uses the well-known abstractions of the Semantic Web to store and query RDF data in key/value, document, and graph databases. Moreover, WA-RDF includes workload-awareness, fragmentation, and partitioning components to meet the high level of scalability expected of NoSQL systems. An experimental evaluation shows that the approach is promising: it scaled linearly with dataset size and query frequency growth, and outperformed a multi-model database in the tested use cases.
The database field is undergoing significant changes. Although relational systems are still predominant, interest in NoSQL systems is continuously increasing. In this scenario, polyglot persistence is envisioned as the database architecture that will prevail in the future. Therefore, database tools and systems are evolving to support several data models. Multi-model database tools normally use a generic or unified metamodel to represent schemas of the data models that they support. Such metamodels facilitate developing database utilities, as they can be built on a common representation. Also, the number of mappings required to migrate databases from one data model to another is reduced, and integrability is favored. In this paper, we present U-Schema, a unified metamodel able to represent logical schemas for the four most popular NoSQL paradigms (columnar, document, key-value, and graph) as well as relational schemas. We formally define the mappings between U-Schema and the data model defined for each database paradigm. We discuss how these mappings have been implemented and validated, and show some applications of U-Schema. To achieve the flexibility to respond to data changes, most NoSQL systems are "schema-on-read," and the declaration of schemas is not required. Such an absence of schema declaration makes structural variability possible, i.e., stored data of the same entity type can have different structures. Moreover, the data relationships supported by each data model differ; for example, document stores have aggregate objects but not relationship types, whereas graph stores offer the opposite. Throughout the paper, we show how all these issues have been tackled in our approach. As far as we know, no proposal exists in the literature of a unified metamodel for relational and NoSQL paradigms that describes how each individual data model is integrated and mapped. Our metamodel goes beyond the existing proposals by distinguishing ent
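The notion of structural variability under schema-on-read can be illustrated with a small sketch that groups documents of one entity type by their structure. This is not the U-Schema inference procedure; the function names and the way a structure is encoded are assumptions made for illustration only.

```python
from collections import Counter


def structure(value):
    """Return a hashable description of a value's shape (field names and types)."""
    if isinstance(value, dict):
        # Sort by field name so that key order does not create spurious variants.
        return tuple(sorted((k, structure(v)) for k, v in value.items()))
    if isinstance(value, list):
        # A list is described by the set of shapes of its elements.
        return ("list", tuple(sorted({structure(v) for v in value}, key=repr)))
    return type(value).__name__


def structural_variants(docs):
    """Count how many documents share each distinct structure."""
    return Counter(structure(d) for d in docs)


# Hypothetical documents of the same entity type with different structures.
users = [
    {"name": "a", "age": 1},
    {"name": "b"},
    {"age": 2, "name": "c"},
    {"name": "d", "address": {"city": "x"}},
]
variants = structural_variants(users)
```

Here `variants` holds three distinct structures: one shared by the first and third documents, and one each for the other two. A schema extracted from such data has to record these variants rather than a single fixed field list.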
NoSQL databases provide an edge when dealing with big unstructured data. The flexibility, agility, and scalability offered by NoSQL databases become increasingly essential when dealing with geospatial data. The proliferation of geospatial applications has tremendously increased the variety, velocity, and volume of data that data stores must manage. These characteristics of big spatial data have surpassed the capabilities and anticipated use cases of relational databases. Because an extensive collection of NoSQL databases is now available, it is vital for organizations to make an informed decision. NoSQL database benchmarks give system architects, who shoulder the considerable burden of selecting the right technology for their data stores, a vital starting point and source of information. The major utility of these benchmarks is in reproducing experiments on similar experimental data, which can verify and optimize the process of selecting an optimum data management tool in the early phases of development. The goal of this research is to develop a benchmark that can compare the performance of NoSQL databases for querying complex geospatial data. We have analyzed the throughput, latency, and runtime of MongoDB and Couchbase to identify the correct fit for our use case. In doing so, we have also demonstrated a systematic process that can be followed to make an optimum choice of data store. This benchmark can be extended easily to any NoSQL database that supports geospatial querying.
ISBN: 9781665446419 (print)
Today, Big Data is a main topic of discussion everywhere because it is being generated in huge volumes every second. It is receiving great attention and recognition because of its wide research area and application scenarios. Big Data generally refers to data that is large in scale, bulky, fast-changing, and rapidly growing. Data obtained from a wide variety of sources is usually structured, unstructured, or semi-structured. Big data is often collected from multiple application sources, so structural heterogeneity is present. This structural heterogeneity is one of the major challenges for researchers around the world and can be overcome using big data integration. Because big data refers to data in large volumes, available in different formats, and generated at extraordinary speed, capturing, processing, and analyzing it is difficult with traditional data processing tools. These difficulties can be overcome using big data management tools and techniques. Big data integration and management are crucial; they are revolutionizing industries and have many applications in all sectors of human life. This paper gives a brief overview of big data, its history, integration issues, and available management methods and tools.
ISBN: 9781728142661 (digital)
ISBN: 9781728142678 (print)
When NoSQL database systems are used in an agile software development setting, data model changes occur frequently, and thus data is routinely stored in different versions. This leads to an overhead that affects software development and, in particular, the management of data accesses. In this context, different data migration strategies exist, each with certain advantages and disadvantages. Which strategy best matches a given migration scenario depends on the query workload, the changes in the data model caused by schema evolution, and the application's requirements in terms of migration costs and latency during data accesses. In this paper we present a methodology for self-adapting data migration, which automatically adjusts migration strategies and their parameters accordingly, thereby supporting agile software development.
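The trade-off between migration strategies can be sketched with the two commonly discussed extremes, eager and lazy migration, over versioned documents. This is an illustrative sketch and not the paper's self-adapting method; the schema history, the `_v` version field, and all function names are hypothetical, and a plain dict stands in for the data store.

```python
# Hypothetical schema history: v1 has "name"; v2 splits it into first/last;
# v3 adds an "active" flag. "_v" is an assumed per-document version field.
def v1_to_v2(doc):
    first, _, last = doc["name"].partition(" ")
    out = {k: v for k, v in doc.items() if k != "name"}
    out.update(first=first, last=last, _v=2)
    return out


def v2_to_v3(doc):
    return {**doc, "active": True, "_v": 3}


MIGRATIONS = {1: v1_to_v2, 2: v2_to_v3}
LATEST = 3


def upgrade(doc):
    """Apply migration steps one version at a time until the document is current."""
    while doc["_v"] < LATEST:
        doc = MIGRATIONS[doc["_v"]](doc)
    return doc


def migrate_eager(store):
    """Eager: rewrite every stale document now; return how many were rewritten."""
    stale = [key for key, doc in store.items() if doc["_v"] < LATEST]
    for key in stale:
        store[key] = upgrade(store[key])
    return len(stale)


def read_lazy(store, key):
    """Lazy: upgrade only the document being read, then write it back."""
    doc = store[key]
    if doc["_v"] < LATEST:
        doc = store[key] = upgrade(doc)
    return doc
```

Eager migration pays the full migration cost up front and keeps reads fast, while lazy migration defers the cost to the first read of each stale document and so adds latency to that access. A self-adapting approach, as described in the abstract, would choose between such strategies based on workload and cost requirements.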