ISBN: 9781467392037 (print)
In-memory databases have become a mainstay of enterprise computing, offering significant performance boosts for OLAP and OLTP workloads as well as improved prospects for application integration through an efficient, shared database layer. Despite significant R&D investments into in-memory data management, limited insights are available on the impact on middleware platforms for application integration, i.e., how they need to evolve to leverage in-memory database capabilities. This paper provides a first exposition of how in-memory databases impact Business Process Management (BPM) as a mission-critical, model-driven application integration middleware. We discuss how in-memory databases will render some prevalent use cases of BPM middleware obsolete, while opening up prospects for tighter application integration, better process automation performance and some entirely new BPM capabilities such as process-based application customization. To validate the feasibility of an in-memory BPM, we develop a surprisingly simple BPM runtime that is embedded into SAP HANA and provides BPMN-based process automation capabilities.
Modern computer architectures provide high-performance computing capability by having multiple CPU cores. Such systems are also typically associated with very large main-memory capacities, thereby allowing them to be used for fast processing of in-memory database applications. However, most of the concurrency control mechanisms associated with the index structures of these memory-resident databases do not scale well under high transaction rates. This paper presents the O2-Tree, a fast main-memory-resident index that is also highly scalable and tolerant of high transaction rates in a concurrent environment, using the relaxed balancing tree algorithm. The O2-Tree is a modified Red-Black tree in which the leaf nodes are formed into blocks that hold key-value pairs, while each internal node stores a single key that results from splitting leaf nodes. Multi-threaded concurrent manipulation of the O2-Tree outperforms the popular NoSQL-based key-value stores considered in this paper.
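The layout described in this abstract (leaf blocks holding key-value pairs, with one routing key produced per leaf split) can be illustrated with a minimal single-threaded sketch. This is not the authors' implementation: the Red-Black internal tree, relaxed balancing and concurrency control are all omitted, a sorted list of separator keys stands in for the internal nodes, and the names `BlockedIndex`, `LeafBlock` and `LEAF_CAPACITY` are hypothetical.

```python
import bisect

LEAF_CAPACITY = 4  # assumed block size, chosen only for illustration


class LeafBlock:
    """A leaf block holding key-value pairs in sorted key order."""

    def __init__(self):
        self.keys = []
        self.values = []


class BlockedIndex:
    """Simplified stand-in: a sorted separator list replaces the internal tree."""

    def __init__(self):
        self.separators = []          # separators[j] is the smallest key of leaves[j + 1]
        self.leaves = [LeafBlock()]

    def _leaf_for(self, key):
        # Number of separators <= key gives the index of the responsible leaf.
        return bisect.bisect_right(self.separators, key)

    def put(self, key, value):
        i = self._leaf_for(key)
        leaf = self.leaves[i]
        pos = bisect.bisect_left(leaf.keys, key)
        if pos < len(leaf.keys) and leaf.keys[pos] == key:
            leaf.values[pos] = value  # overwrite an existing key
            return
        leaf.keys.insert(pos, key)
        leaf.values.insert(pos, value)
        if len(leaf.keys) > LEAF_CAPACITY:
            self._split(i)

    def _split(self, i):
        # Splitting a full leaf yields exactly one new routing key.
        leaf = self.leaves[i]
        mid = len(leaf.keys) // 2
        right = LeafBlock()
        right.keys, leaf.keys = leaf.keys[mid:], leaf.keys[:mid]
        right.values, leaf.values = leaf.values[mid:], leaf.values[:mid]
        self.separators.insert(i, right.keys[0])
        self.leaves.insert(i + 1, right)

    def get(self, key, default=None):
        leaf = self.leaves[self._leaf_for(key)]
        pos = bisect.bisect_left(leaf.keys, key)
        if pos < len(leaf.keys) and leaf.keys[pos] == key:
            return leaf.values[pos]
        return default


index = BlockedIndex()
for k in range(20):
    index.put(k, k * k)
```

Keeping several pairs per leaf means that most inserts touch only one block, and the routing structure changes only when a block splits, which is the property the abstract's design relies on.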
ISBN: 9781479950577 (print)
Advances in healthcare data management and analytics have opened new horizons for healthcare providers, such as cost-effective treatments, the ability to detect medical fraud, and the diagnosis of diseases at an early stage. Central to these abilities is the need for fast ad-hoc query processing of large volumes of complex healthcare datasets. End users who work with healthcare databases spend enormous effort on data exploration, since exploration is the first step of any subsequent predictive modeling that generates actionable insights for patients, providers and physicians. Unfortunately, unlike in other domains, the complexity and volumes of claims (ICD9 or 10) as well as clinical (HL7) healthcare datasets result in data exploration solutions being extremely slow and cumbersome when attempted using traditional disk-resident data warehousing approaches. In this paper we describe the first-ever attempt at real-time data exploration for healthcare datasets using in-memory databases. We benchmark and compare two such in-memory database systems to study their responsiveness and ability to handle the complexity of typical health data exploration tasks. We share our work-in-progress results and outline key issues that need to be addressed for forthcoming advances in this very important big data vertical.
ISBN: 9780692253205 (print)
In-memory databases (IMDB) are a rising technology and have the potential to mark the end of computing performance bottlenecks. Our literature review revealed that, to date, no study analyzes the resulting benefits of IMDB usage, and little is known about the cause and effect of IMDB usage and its economic effects. This paper provides a structured analysis of experiences gained with IMDB usage. Surprisingly, our study showed that the promoted vision of IMDB as an enabler for an integrated OLAP/OLTP infrastructure has not been put into practice yet. Currently, IMDB is mainly used for response time improvement in the field of analytical data processing. Furthermore, we observed that IMDB is predominantly used to improve existing business processes instead of establishing new business models. The findings of our study may serve as a basis for the development of hypotheses, validation and advancement of the theory of disruptive innovation, and further empirical quantitative studies.
Specialized processing units such as GPUs or FPGAs provide great opportunities to speed up database operations by exploiting parallelism and relieving the CPU. However, distributing a workload on suitable (co-)processors is a challenging task, because of the heterogeneous nature of a hybrid processor/co-processor system. In this paper, we present a framework that automatically learns and adapts execution models for arbitrary algorithms on any (co-)processor. Our physical optimizer uses the execution models to distribute a workload of database operators on available (co-)processing devices. We demonstrate its applicability for two common use cases in modern database systems. Additionally, we contribute an overview of GPU-co-processing approaches, an in-depth discussion of our framework's operator model, the required steps for deploying our framework in practice and the support of complex operators requiring multi-dimensional learning strategies. (C) 2013 Elsevier Ltd. All rights reserved.
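The idea of learning execution models and using them to place operators can be sketched roughly as follows. This is an assumption-laden illustration, not the framework from the paper: it assumes runtime is approximately linear in input size, fits that line by ordinary least squares, and uses hypothetical names (`ExecutionModel`, `choose_device`) and made-up timing samples.

```python
class ExecutionModel:
    """Learns runtime as a function of input size for one (algorithm, device) pair."""

    def __init__(self):
        self.samples = []  # observed (input_size, runtime) pairs

    def observe(self, size, runtime):
        self.samples.append((float(size), float(runtime)))

    def predict(self, size):
        n = len(self.samples)
        if n == 0:
            return 0.0  # assumption: untried devices look cheap so they get explored
        mean_x = sum(x for x, _ in self.samples) / n
        mean_y = sum(y for _, y in self.samples) / n
        var_x = sum((x - mean_x) ** 2 for x, _ in self.samples)
        if var_x == 0.0:
            return mean_y  # not enough spread to fit a slope
        cov = sum((x - mean_x) * (y - mean_y) for x, y in self.samples)
        slope = cov / var_x
        return mean_y + slope * (size - mean_x)


def choose_device(models, size):
    """Pick the device whose learned model predicts the lowest runtime."""
    return min(models, key=lambda device: models[device].predict(size))


# Illustrative (made-up) measurements: the GPU pays a fixed transfer cost
# but has a much lower per-element cost than the CPU.
models = {"cpu": ExecutionModel(), "gpu": ExecutionModel()}
for size in (10, 100, 1000):
    models["cpu"].observe(size, 1.0 * size)
    models["gpu"].observe(size, 50.0 + 0.1 * size)

small_choice = choose_device(models, 10)
large_choice = choose_device(models, 1000)
```

Because the models are refit from observations, the placement decision adapts as new measurements arrive; a production optimizer would need richer models for operators whose cost depends on more than one parameter, as the abstract notes.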
ISBN: 0769526403 (print)
The paper presents a primary-backup protocol to manage replicated in-memory database systems (IMDBs). The protocol exploits two features of IMDBs: coarse-grain concurrency control and deferred disk writes. Primary crashes are quickly detected by backups and a new primary is elected whenever the current one is suspected to have failed. False failure suspicions are tolerated and never lead to incorrect behavior. The protocol uses a consensus-like algorithm tailor-made for our replication environment. Under normal circumstances (i.e., no failures or false suspicions), transactions can be committed after two communication steps, as seen by the applications. Performance experiments have shown that the protocol has very low overhead and scales linearly with the number of replicas.
ISBN: 0769524524 (print)
Hardware technology continues to improve at a considerable rate. Besides the Moore's-law increases in CPU speed, the capacity of main memory has in recent years been increasing at an even more impressive rate. One consequence of this continuous growth of memory resources is that we can design and implement memory-embedded Web sites, where both the static resources and the database information are kept in main memory. In this paper, we evaluate the impact of memory trends on the performance of e-commerce sites, which continue to be an important reference for Internet-based services in terms of the complexity of the hardware/software technology and in terms of performance, availability and scalability requirements. However, most results are valid for other Web-based services as well. We demonstrate through experiments on a real system how the system bottlenecks change depending on the amount of memory that is (or will be) available for the Web site data. This analysis allows us to anticipate the interventions on the hardware/software components that could improve the capacity of present and future Web systems for content generation and delivery.
ISBN: 0909925836 (print)
Cache-conscious behaviour of data structures becomes more important as memory sizes increase and whole databases fit into main memory. For spatial data, R-trees, originally designed for disk-based data, can be adapted for in-memory applications. In this paper, we investigate how the small amount of space in an in-memory R-tree node can be used better to make R-trees more cache-conscious. We observe that many entries share sides with their parents, and introduce the partial R-tree, which only stores information that is not given by the parent node. Our experiments showed that the partial R-tree achieves up to 30 per cent better performance than the traditional R-tree. We also investigated whether we could improve the search performance by storing more descriptive information instead of the standard minimum bounding box without decreasing the fanout of the R-tree. The partial static O-tree is based on the O-tree, but stores only the most important part of the information of an O-tree box. Experiments showed that this approach reduces the search time for line data by up to 60 per cent.
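The observation that entries often share sides with their parent can be made concrete with a small encoding sketch. It is only an illustration under stated assumptions, not the paper's node layout: boxes are 2D tuples `(xmin, ymin, xmax, ymax)`, a bitmask records which sides differ from the parent, and the function names `encode_partial` and `decode_partial` are hypothetical.

```python
def encode_partial(parent, child):
    """Store only the child's sides that are not already given by the parent box."""
    mask = 0
    stored = []
    for i, (p, c) in enumerate(zip(parent, child)):
        if c != p:
            mask |= 1 << i      # bit i set: side i is stored explicitly
            stored.append(c)
    return mask, tuple(stored)


def decode_partial(parent, mask, stored):
    """Rebuild the full child box from the parent box and the partial entry."""
    values = iter(stored)
    box = []
    for i, p in enumerate(parent):
        if mask & (1 << i):
            box.append(next(values))
        else:
            box.append(p)       # side inherited from the parent
    return tuple(box)


parent_box = (0, 0, 10, 10)
child_box = (0, 2, 5, 10)       # shares xmin and ymax with the parent

mask, stored = encode_partial(parent_box, child_box)
restored = decode_partial(parent_box, mask, stored)
```

In this example only two of the four coordinates need to be stored, so more entries could fit into a cache-line-sized node; the cost is that a child box can only be reconstructed when its parent box is at hand during traversal.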