ISBN: 9781538642481 (print)
Large amounts of data are generated every moment by the connected objects that make up the Internet of Things (IoT). These data are difficult to handle with traditional databases, which has led to the use of NoSQL databases. NoSQL databases have become very popular thanks to their high performance, flexible scaling, and high availability. But which NoSQL database is the most suitable for IoT applications? In this paper, we discuss the main requirements of IoT data management and compare five of the most popular NoSQL databases, namely Redis, Cassandra, MongoDB, Couchbase, and Neo4j, against those requirements in order to find the most suitable NoSQL database for IoT applications.
ISBN: 9781538655207 (print)
NoSQL databases offer powerful abstractions for querying non-relational data. However, NoSQL products generally pursue superior flexibility, customizability, scalability, and performance while neglecting support for generally useful data management tools. In particular, products typically ship without integrated support for management features made conventional by the long history of RDBMSs, such as sophisticated query processing systems, join operations, aggregate functions, and integrity constraints. This design decision forces users of NoSQL technologies to find alternative ways of providing the missing tools, engaging either directly or indirectly in a suboptimal k-implementation cycle in which developers re-invent new instances of the same data management tools across NoSQL products. This paper articulates the problem associated with the lax regard for data management support that currently defines the class of NoSQL databases and introduces the Piper package index and management system as an exploratory solution.
ISBN: 9781538628423 (print)
Handling data is the biggest challenge in the current era of data science. Most data practitioners have directed their research toward understanding the behavior of systems. Such behavior includes sentiments of many types, which have been and continue to be published on social networks. These sentiments are growing exponentially and are becoming crucial for the behavioral study of a system. Analyzing them supports predictions that can increase the profitability of the system. To do this, sentiments must be modelled. Because sentiments are growing explosively, traditional approaches are not suited to modelling them; as a result, NoSQL databases have been adopted for such applications. This paper focuses on why document-based databases (DBDBs), a class of NoSQL databases, have been gaining momentum for handling sentiments. To this end, a large number of sentiments from social networks are considered, and MongoDB and CouchDB are used to analyze them.
ISBN: 9781538681619 (print)
There is a growing need to collect and process data from different sources in heterogeneous and semi-structured formats, and scientists and companies are under strong pressure to find ways of extracting knowledge from them. In this paper, we present a NoSQL database approach for modeling heterogeneous and semi-structured information, covering both software architecture and data modeling. We build a robust analytics framework by integrating Apache Spark with Apache Cassandra and then apply data mining techniques to obtain a model capable of predicting the relationship between tourist arrivals and nights spent in Greece. The proposed model uses a dataset constructed from both the Hellenic Statistical Authority and Eurostat. The evaluation shows that the proposed data model, fitted to the current dataset, predicts tourist behaviour with high accuracy.
ISBN: 9781538655207 (print)
Schema-flexible NoSQL data stores lend themselves nicely to storing versioned data, a product of schema evolution. In this lightning talk, we apply pending schema changes to records that were persisted several schema versions back. We present first experiments with MongoDB and Cassandra, in which we explore the trade-off between applying chains of pending changes stepwise (one after the other) and applying them as composite operations. Contrary to intuition, composite migration is not necessarily faster. The culprit is the computational overhead of deriving the compositions. However, caching composition formulae achieves a speedup: for Cassandra, we can cut the runtime by nearly 80%. Surprisingly, the relative speedup seems to be system-dependent. Our take-away message is that when applying pending schema changes in NoSQL data stores, we need to base our design decisions on experimental evidence rather than on intuition alone.
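The difference between stepwise and composite migration can be illustrated with a minimal in-memory sketch. It is not the authors' implementation and does not touch MongoDB or Cassandra: the change encoding (`"add"`, `"rename"`, `"drop"` tuples), all function names, and the use of `functools.lru_cache` to stand in for "caching composition formulae" are assumptions made for illustration. The sketch also assumes a well-formed chain in which field names are not reused.

```python
from functools import lru_cache

def apply_change(record, change):
    """Apply one schema change to a copy of the record."""
    out = dict(record)
    kind = change[0]
    if kind == "add":                 # ("add", field, default)
        out.setdefault(change[1], change[2])
    elif kind == "rename":            # ("rename", old, new)
        if change[1] in out:
            out[change[2]] = out.pop(change[1])
    elif kind == "drop":              # ("drop", field)
        out.pop(change[1], None)
    return out

def migrate_stepwise(record, changes):
    """Apply the pending changes one after the other."""
    for change in changes:
        record = apply_change(record, change)
    return record

@lru_cache(maxsize=None)
def derive_composite(changes):
    """Fold a chain of changes into one (renames, drops, adds) triple.

    The derivation is the extra work a composite migration pays for;
    lru_cache reuses the result for every record that shares the chain.
    """
    renames, drops, adds = {}, set(), {}
    for change in changes:
        kind = change[0]
        if kind == "add":
            adds[change[1]] = change[2]
        elif kind == "rename":
            old, new = change[1], change[2]
            if old in adds:           # renaming a field added earlier in the chain
                adds[new] = adds.pop(old)
            else:                     # trace back to the field's original name
                source = next((s for s, t in renames.items() if t == old), old)
                renames[source] = new
        elif kind == "drop":
            name = change[1]
            if name in adds:
                del adds[name]
            else:
                source = next((s for s, t in renames.items() if t == name), name)
                renames.pop(source, None)
                drops.add(source)
    return renames, drops, adds

def migrate_composite(record, changes):
    """Apply the whole chain in a single pass over the record."""
    renames, drops, adds = derive_composite(tuple(changes))
    out = {renames.get(k, k): v for k, v in record.items() if k not in drops}
    for field, default in adds.items():
        out.setdefault(field, default)
    return out
```

Both functions produce the same record for a well-formed chain; which one is faster in a real store is exactly the kind of question the abstract argues should be settled by measurement.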
Mobile developers constantly have to deal with user pressure for continuous delivery of apps while preserving quality attributes such as confidentiality and data integrity. To better support developers in testing for security vulnerabilities during the evolution and maintenance of mobile apps, in this demo we present a novel tool, OPIA, for on-device security testing. OPIA allows developers and testers to (i) conduct SQL-injection attacks and collect logs to identify leaks of sensitive information through record-and-replay testing, and (ii) extract data stored in local databases and shared preferences to identify sensitive information that is not properly encrypted or anonymized. OPIA is publicly available on GitHub.
With the advancement of big data, NoSQL databases are enjoying ever-growing popularity. The increasing use of this technology in large applications also brings security concerns to the fore. Historically, SQL injection has been one of the major security threats. Recent studies reveal that NoSQL databases have also become vulnerable to injections. However, NoSQL security has yet to receive the attention it deserves from industry or academia. In this work, we develop a tool for detecting NoSQL injections using supervised learning. To the best of our knowledge, the training dataset we developed for NoSQL injection is the first of its kind. We manually design important features and apply various supervised learning algorithms. Our tool achieves an F2-score of 0.93, as established by 10-fold cross-validation. We also apply our tool to a NoSQL injection generating tool, NoSQLMap, and find that it outperforms Sqreen, the only available NoSQL injection detection tool, by 36.25% in terms of detection rate. The proposed technique is also shown to be database-agnostic, achieving similar performance for injections on MongoDB and CouchDB databases.
Trajectory data often contains a large amount of information, so effective storage management is needed to analyze and apply it. This paper proposes a distributed storage strategy that accounts for the characteristics of trajectory data and the requirement of data balance among distributed nodes. Each trajectory is converted into its minimum bounding rectangle (MBR) to preserve data integrity, and the center of the MBR is computed for K-Means clustering. The clustering results are then divided by a grid, and grid cells with a high overlap rate are merged. Finally, a data partitioning strategy stores the divided data on the nodes so that the data volume of each node is balanced. Experiments show that the proposed strategy, built on a distributed NoSQL database, effectively takes into account the integrity of trajectory data, spatial proximity, and data structure flexibility while achieving data balance among the nodes.
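The first two steps of that pipeline (MBR conversion and clustering of MBR centers) can be sketched in plain Python. This is a hedged illustration, not the paper's method: the function names, the 2-D point representation, the fixed iteration count, and the deterministic initialization from the first `k` centers are all assumptions, and the grid division, grid merging, and node partitioning steps are not covered.

```python
def mbr(trajectory):
    """Minimum bounding rectangle of a list of (x, y) points."""
    xs = [x for x, _ in trajectory]
    ys = [y for _, y in trajectory]
    return (min(xs), min(ys), max(xs), max(ys))

def mbr_center(box):
    """Center point of a (min_x, min_y, max_x, max_y) rectangle."""
    min_x, min_y, max_x, max_y = box
    return ((min_x + max_x) / 2, (min_y + max_y) / 2)

def squared_distance(a, b):
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def kmeans(points, k, iterations=10):
    """Plain Lloyd iterations; initial centroids are the first k points
    (an assumption chosen to keep the sketch deterministic)."""
    centroids = list(points[:k])
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return labels, centroids

# Hypothetical trajectories: two near the origin, two far away.
trajectories = [
    [(0, 0), (1, 1)],
    [(0, 2), (2, 0), (1, 1)],
    [(9, 9), (11, 11)],
    [(10, 10), (12, 12)],
]
centers = [mbr_center(mbr(t)) for t in trajectories]
labels, centroids = kmeans(centers, k=2)
```

Clustering on MBR centers rather than raw points keeps each trajectory as one unit, which matches the abstract's stated aim of preserving trajectory integrity.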
ISBN: 9781728108582 (digital), 9781728108599 (print)
Modern NoSQL databases use log-structured merge (LSM) storage architectures to support high write throughput. LSM architectures aggregate writes in a mutable MemTable (stored in memory), which is regularly flushed to disk, creating a new immutable file called an SSTable. Periodically, some of the SSTables are chosen to be merged, that is, replaced with a single SSTable containing their union. A merge policy (a.k.a. compaction policy) specifies when to do merges and which SSTables to combine. A bounded-depth merge policy is one that guarantees that the number of SSTables never exceeds a given parameter k, typically in the range 3-10. Bounded-depth policies are useful in applications where low read latency is crucial, but they and their underlying combinatorics are not yet well understood. This paper compares several bounded-depth policies, including representative policies from industrial NoSQL databases and two new ones based on recent theoretical modeling. The results validate the proposed theoretical model and show that, compared to the existing policies, the newly proposed policies can have substantially lower write amplification.
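The mechanics described above (MemTable, flush, merge, the depth bound k, and write amplification) can be sketched as a toy in-memory model. The class name `TinyLSM`, its parameters, and in particular the naive "merge the two newest SSTables" rule are assumptions made for illustration; that rule is not one of the policies compared in the paper, and write amplification is counted here in entries rather than bytes.

```python
class TinyLSM:
    """Toy LSM store with a bounded number of SSTables (illustrative only)."""

    def __init__(self, memtable_limit=2, k=3):
        self.memtable = {}            # mutable, in-memory buffer
        self.sstables = []            # immutable tables, oldest first
        self.memtable_limit = memtable_limit
        self.k = k                    # maximum number of SSTables allowed
        self.user_writes = 0          # entries written by the client
        self.entries_written = 0      # entries written by flushes and merges

    def put(self, key, value):
        self.memtable[key] = value
        self.user_writes += 1
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Freeze the MemTable into a new key-sorted SSTable.
        table = dict(sorted(self.memtable.items()))
        self.sstables.append(table)
        self.entries_written += len(table)
        self.memtable = {}
        # Naive bounded-depth rule (an assumption): merge until at most k remain.
        while len(self.sstables) > self.k:
            self._merge_newest_two()

    def _merge_newest_two(self):
        newer = self.sstables.pop()
        older = self.sstables.pop()
        # Union of both tables; values from the newer table win.
        merged = dict(sorted({**older, **newer}.items()))
        self.sstables.append(merged)
        self.entries_written += len(merged)

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):   # newest SSTable first
            if key in table:
                return table[key]
        return None

    def write_amplification(self):
        """Entries written to SSTables per entry written by the client."""
        return self.entries_written / self.user_writes
```

A read may have to probe every SSTable, which is why bounding their number by k bounds read cost, while every merge rewrites entries and so raises write amplification. That tension is what a merge policy has to balance.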
This paper proposes a system that enables the integrated use of relational databases and a NoSQL database. A heterogeneous information source integration system has been proposed that enables equi-join and projection of tables without converting data. With that system, it became possible to handle various databases at the same time through JDBC, but only relational databases were supported. We therefore add functions to the system to handle NoSQL databases, which have been increasingly used in recent years. The table join execution time and the memory consumption of the system are evaluated experimentally. The results show that performance is tolerable for practical use compared with the integrated use of relational databases only. We also show that it is practical to change the join order or to register column families, which are elements unique to HBase.
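As a rough illustration of what an equi-join and projection across a relational table and a column-family store involve, the following sketch joins rows held in plain Python structures. It does not use JDBC or HBase, and it is not the system described in the abstract: the function names, the `family:column` flattening convention, and the sample data are all hypothetical.

```python
def flatten_row(row_key, families, key_column="row_key"):
    """Turn one column-family row into a flat, relational-style dict.

    `families` maps family name -> {column name -> value}; each column
    is exposed under the assumed name "family:column".
    """
    flat = {key_column: row_key}
    for family, columns in families.items():
        for column, value in columns.items():
            flat[f"{family}:{column}"] = value
    return flat

def equi_join(left_rows, right_rows, left_key, right_key):
    """Hash join: index the right side, then probe it with each left row."""
    index = {}
    for row in right_rows:
        index.setdefault(row[right_key], []).append(row)
    joined = []
    for left in left_rows:
        for right in index.get(left[left_key], []):
            joined.append({**right, **left})
    return joined

def project(rows, columns):
    """Keep only the requested columns of each row."""
    return [{c: row[c] for c in columns} for row in rows]

# Hypothetical data: a relational table and a column-family store.
users = [{"user_id": 1, "name": "Ann"}, {"user_id": 2, "name": "Bob"}]
store = {1: {"info": {"city": "Athens"}}, 3: {"info": {"city": "Oslo"}}}

right = [flatten_row(key, families) for key, families in store.items()]
result = project(equi_join(users, right, "user_id", "row_key"),
                 ["name", "info:city"])
```

In this sketch the side that is indexed is held in memory, so the join order affects memory use. The abstract's remark about changing the join order may relate to a similar trade-off, though the source does not say so.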