ISBN: (print) 9781538652589; 9781538652572
Data is now growing at such a high rate that tools are needed to analyze it with matching efficiency, since this analysis informs crucial decisions in day-to-day business scenarios. Analyzing such data with traditional relational database tools is impractical. New technologies have emerged that increase our ability to process this data; Hadoop and NoSQL databases are among the most popular of them. In particular, Elasticsearch is among the most preferred tools today: it is a text-based search engine built on Apache Lucene. It is a near-real-time, highly scalable, distributed tool for searching, analyzing, and storing data. The hybrid database system proposes a solution to the common problem of data retrieval time growing as the size of the data increases.
Advances in sensor technologies have led to the instrumentation of sensor networks for bridge monitoring and management. In a dense sensor network, an enormous amount of sensor data is collected. The data need to be managed, processed, and interpreted, so data management issues are of prime importance for a bridge management system. This paper describes a data management infrastructure for bridge monitoring applications. Specifically, NoSQL database systems such as MongoDB and Apache Cassandra are employed to handle time-series data as well as the unstructured bridge information model data. Standard XML-based modeling languages such as OpenBrIM and SensorML are adopted to manage semantically meaningful data and to support interoperability. Data interoperability and integration are illustrated among the different components of a bridge monitoring system, which include on-site computers, a central server, local computing platforms, and mobile devices. The data management framework is demonstrated using the data collected from the wireless sensor network installed on the Telegraph Road Bridge, Monroe, MI.
This paper presents a convergence of distributed key-value storage systems in clouds and supercomputers. It specifically presents ZHT, a zero-hop distributed key-value store system, which has been tuned for the requirements of high-end computing systems. ZHT aims to be a building block for future distributed systems, such as parallel and distributed file systems, distributed job management systems, and parallel programming systems. ZHT has several important properties: it is lightweight, allows nodes to join and leave dynamically, is fault tolerant through replication, persistent, and scalable, and supports unconventional operations such as append, compare and swap, and callback in addition to the traditional insert/lookup/remove. We have evaluated ZHT's performance on a variety of systems, ranging from a Linux cluster with 64 nodes and an Amazon EC2 virtual cluster of up to 96 nodes to an IBM Blue Gene/P supercomputer with 8K nodes. We compared ZHT against other key-value stores and found that it offers superior performance for the features and portability it supports. This paper also presents several real systems that have adopted ZHT, namely FusionFS (a distributed file system), IStore (a storage system with erasure coding), MATRIX (distributed scheduling), Slurm++ (distributed HPC job launch), and Fabriq (distributed message queue management); all of these real systems have been simplified because of key-value storage systems and have been shown to outperform other leading systems by orders of magnitude in some cases. It is important to highlight that some of these systems are rooted in HPC systems from supercomputers, while others are rooted in clouds and ad hoc distributed systems; through our work, we have shown how versatile key-value storage systems can be in such a variety of environments. Copyright (c) 2015 John Wiley & Sons, Ltd.
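The operations named in the abstract above (insert, lookup, remove, append, compare and swap) can be illustrated with a minimal single-process sketch. This is not ZHT's API: the class `TinyKVStore` and its method names are hypothetical, and the sketch leaves out distribution, replication, persistence, and callbacks entirely.

```python
from typing import Dict, Optional


class TinyKVStore:
    """Illustrative in-memory key-value store (hypothetical; not ZHT's API)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def insert(self, key: str, value: str) -> None:
        # Store the value, overwriting any existing value for the key.
        self._data[key] = value

    def lookup(self, key: str) -> Optional[str]:
        # Return the stored value, or None if the key is absent.
        return self._data.get(key)

    def remove(self, key: str) -> bool:
        # Return True if the key existed and was removed.
        return self._data.pop(key, None) is not None

    def append(self, key: str, value: str) -> None:
        # Extend the stored value in place; a missing key starts from "".
        self._data[key] = self._data.get(key, "") + value

    def compare_swap(self, key: str, expected: Optional[str], new: str) -> bool:
        # Replace the value only if the current value equals `expected`.
        if self._data.get(key) != expected:
            return False
        self._data[key] = new
        return True
```

In a sketch like this, `append` lets a caller extend a value without first reading it, and `compare_swap` lets concurrent callers detect that a value changed underneath them, which is presumably why such operations are useful to higher-level systems like schedulers and file systems.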
The business potential of big data is leading to a data-driven economy, where low-cost and low-latency data analysis represents a major competitive advantage. The research community has proposed many technological solutions for big data, such as NoSQL databases, which are difficult to evaluate and compare via standard IT procurement procedures. In addition, a lack of competence in big data domains makes the procurement of big data solutions a tedious and uncertain process, which might impair the success of a business. In this paper, we present a score-based benchmark for distributed databases, which supports adopters in selecting a solution that fits their needs. The proposed benchmark is independent of the configurations of the specific database and deployment environment, requires low effort on the part of end users, is extensible and can be applied to both SQL and NoSQL databases, can be used to evaluate databases according to different properties (e.g., performance, consistency), and can be integrated with existing benchmarks to reduce the burden of their execution. We experimentally evaluate our methodology to validate its effectiveness.
ISBN: (print) 9781509025985
NoSQL databases are now increasingly used for web-scale applications because they support dynamic, flexible schema design and scalability. However, the lack of a suitable design method for heterogeneous NoSQL databases makes it difficult for application developers and database designers to choose and design such databases effectively. To overcome this challenge, this paper proposes a systematic, rule-based transformation mechanism that converts a conceptual-level NoSQL data model [6] into an equivalent logical-level data model specified in JSON (JavaScript Object Notation) Schema. The correctness of the proposed transformation mechanism has been verified, and the mechanism is illustrated using a case study.
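As a rough illustration of what a rule-based mapping from a conceptual entity to a JSON Schema document can look like, here is a minimal sketch. It is not the transformation mechanism proposed in the paper: the function `entity_to_json_schema`, the `TYPE_MAP` table, and the flat attribute-to-type input format are all assumptions made for this example.

```python
from typing import Any, Dict, Iterable

# Hypothetical mapping from conceptual attribute types to JSON Schema types.
TYPE_MAP = {
    "string": "string",
    "integer": "integer",
    "float": "number",
    "boolean": "boolean",
}


def entity_to_json_schema(
    name: str, attributes: Dict[str, str], required: Iterable[str]
) -> Dict[str, Any]:
    """Apply one simple rule: each attribute becomes a typed property."""
    return {
        "$schema": "http://json-schema.org/draft-07/schema#",
        "title": name,
        "type": "object",
        "properties": {
            attr: {"type": TYPE_MAP[kind]} for attr, kind in attributes.items()
        },
        "required": sorted(required),
    }
```

For example, `entity_to_json_schema("User", {"name": "string", "age": "integer"}, ["name"])` yields an object schema with two typed properties, of which only `name` is required. A fuller mechanism would also need rules for nesting, references, and collections, which this sketch omits.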
ISBN: (print) 9781509017096
In the era of Big Data, social media analysis has grown extremely popular. Twitter, one of the most popular social media platforms, is believed to contain many user opinions in its messages. Leading organizations have therefore started using its data, through social media analytic tools, to gain insight into their markets in real time. However, many Twitter analytic tools are still limited to specific tasks. To broaden the range of analyses that can be performed on Twitter, data warehouse technology can be used to receive, process, and store real-time Twitter streams. Nonetheless, data warehouses built on relational databases are starting to show their limits in storing big data, let alone in real time. Hence, NoSQL (Not Only SQL) technology has emerged as an alternative solution. In this paper, we develop a near-real-time Twitter data warehouse using a NoSQL database, Cassandra, and compare its storing and querying performance with that of warehouses developed using relational databases. The results show that Cassandra performs significantly better at storing data than the relational databases. In querying performance, Cassandra is slower on small data but much faster on large data volumes.
ISBN: (print) 9781510600447
This paper describes an information repository to support bridge monitoring applications on a cloud computing platform. Bridge monitoring, with instrumentation of sensors in particular, collects a significant amount of data. In addition to sensor data, a wide variety of information such as bridge geometry, analysis models, and sensor descriptions needs to be stored. Data management plays an important role in facilitating data utilization and data sharing. Bridge information modeling (BrIM) technologies and standards have been proposed, and they provide a means to enable integration and facilitate interoperability; however, current BrIM standards mostly support information about bridge geometry. In this study, we extend the BrIM schema to include analysis models and sensor information. Specifically, using the OpenBrIM standards as the base, we draw on CSI Bridge, a commercial software package widely used for bridge analysis and design, and SensorML, a standard schema for sensor definition, to define the data entities necessary for bridge monitoring applications. NoSQL database systems are employed for the data repository. Cloud service infrastructure is deployed to enhance the scalability, flexibility, and accessibility of the data management system. The data model and systems are tested using the bridge model and the sensor data collected at the Telegraph Road Bridge, Monroe, Michigan.
ISBN: (print) 9781509038060
Big data applications that rely on relational databases gradually expose limitations in scalability and performance. In recent years, the Hadoop ecosystem has been widely adopted as an evolving solution. This paper presents the migration of a legacy data analytics application in a provincial data center. The target platform follows a "no one size fits all" approach: to accommodate different workloads, data storage is a hybrid of a distributed file system (HDFS) and a distributed NoSQL database. Beyond the architecture redesign, we focus on the problem of transforming the data model from a relational database to a NoSQL database. We propose a query-aware approach to free developers from tedious manual work. The approach generates query-specific views (NoView) for NoSQL and restructures the views to align with the NoSQL data model. Our results show that the migrated application achieves high scalability and high performance. We believe that our practice provides valuable insights (such as a NoSQL data modeling methodology) and that the techniques can be easily applied to other similar migrations.
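The general idea behind a query-specific view can be sketched as follows: if the application's query is "fetch a customer together with their orders", two relational tables can be denormalized into one document per customer so that the query becomes a single key lookup. This is only an illustration of query-driven denormalization, not the NoView algorithm from the paper; the table layout, field names, and the function `build_query_view` are hypothetical.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def build_query_view(customers: List[Row], orders: List[Row]) -> Dict[int, Row]:
    """Denormalize customers and orders into one document per customer id."""
    view: Dict[int, Row] = {}
    for customer in customers:
        # Each customer becomes a document keyed by its primary key.
        view[customer["id"]] = {"name": customer["name"], "orders": []}
    for order in orders:
        doc = view.get(order["customer_id"])
        if doc is not None:
            # Embed the order in its customer's document (replaces a join).
            doc["orders"].append({"order_id": order["id"], "total": order["total"]})
    return view
```

The trade-off in a design like this is that the join cost is paid once at write or migration time rather than on every read, at the price of duplicating data when several queries need different views.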
ISBN: (print) 9781509057733
Modern trends in the agriculture domain have made people realize the importance of big data. The key challenges of big data in agriculture are to identify the effectiveness of big data analytics and to determine how it can be used to improve productivity in agricultural practices. The purpose of the proposed research is to reduce the technological gap between rural communities and information through recommendations and a decision support system. The main contribution of this paper is an open-source, cost-effective, and scalable big data analytics architecture for an agro-advisory system. As part of the implementation, an analytic framework for big data application development is built. In addition, a prototype application for crop yield prediction is implemented for the cotton crop in Ahmedabad district, Gujarat, India.
ISBN: (print) 9783319163130; 9783319163123
The concepts of NoSQL databases have been developed, and big Internet companies such as Google, Amazon, Yahoo!, and Facebook have recently been using NoSQL databases. Although the primary focus of NoSQL databases is to deal with huge volumes of heterogeneous data, they can also be suited to handling moderate volumes of data, especially if the data are heterogeneous and change frequently. With this in mind, we consider the development and implementation of an application with a moderate volume of heterogeneous data using a NoSQL database. We perform a comparative performance analysis against a relational database system. The experimental evaluations show that NoSQL databases are also often suitable for handling moderate volumes of data.