ISBN: 9781665458412 (print)
Health organizations store patient data in different repositories scattered across diverse locations. In the healthcare domain, the problem is that each hospital, or even each department within a hospital, maintains its own database with its own data model (SQL, NoSQL, etc.). In this situation, existing and new applications must allow healthcare actors to locate and share patient data remotely from those pre-existing distributed databases (DDBs) to support quality patient treatment and the daily operations of health centers. However, integrating data from distributed sources raises concerns because of data model variability. It is therefore important to determine how efficiently an application such as middleware can reconstruct and share patient data remotely from heterogeneous DDBs over the network. Health organizations may also need to determine whether their existing database model performs well or should be replaced by another. This paper therefore aims to design a system that uses different databases with distinct data structures, together with a middleware algorithm that integrates data from them, and to test the system's performance. The experimental results show that patient data can be shared efficiently from various distributed data sources. The study could therefore guide healthcare organizations in sharing patient data from heterogeneous distributed databases without replacing their existing data models.
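The integration idea in this abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the table layout, the field names, the sample records, and the functions `fetch_from_sql`, `fetch_from_documents`, and `integrate` are all assumptions made for illustration. An in-memory SQLite database stands in for a relational source, and a list of dictionaries stands in for a document (NoSQL) source.

```python
import sqlite3

def fetch_from_sql(conn, patient_id):
    """Read demographic fields from a relational source."""
    row = conn.execute(
        "SELECT patient_id, name, birth_year FROM patients WHERE patient_id = ?",
        (patient_id,),
    ).fetchone()
    if row is None:
        return {}
    return {"patient_id": row[0], "name": row[1], "birth_year": row[2]}

def fetch_from_documents(collection, patient_id):
    """Read visit records from a document-style source (a list of dicts here)."""
    visits = [d["visit"] for d in collection if d["patient_id"] == patient_id]
    return {"patient_id": patient_id, "visits": visits} if visits else {}

def integrate(patient_id, sql_conn, collection):
    """Merge the partial records from both sources into one patient view."""
    record = {}
    for part in (fetch_from_sql(sql_conn, patient_id),
                 fetch_from_documents(collection, patient_id)):
        record.update(part)
    return record

# Hypothetical sample data standing in for two departmental databases.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE patients (patient_id TEXT PRIMARY KEY, name TEXT, birth_year INTEGER)"
)
conn.execute("INSERT INTO patients VALUES ('p1', 'Alice', 1980)")
lab_docs = [
    {"patient_id": "p1", "visit": {"date": "2021-03-01", "test": "CBC"}},
    {"patient_id": "p2", "visit": {"date": "2021-03-02", "test": "MRI"}},
]

merged = integrate("p1", conn, lab_docs)
```

Each source keeps its own data model; only the middleware layer knows how to translate both into a common record keyed by the patient identifier, which is the property the abstract argues for.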
Recently, a great deal of work has been undertaken in the area of distributed databases. The architecture and syntax presented in this paper are a response to the very real need for the development of a unified approach...
ISBN: 9781467300421 (print)
The standard way to scale a distributed OLTP DBMS is to horizontally partition data across several nodes. Ideally, this results in each query/transaction being executed at just one node, to avoid the overhead of distribution and allow the system to scale by adding nodes. For some applications, simple strategies such as hashing on the primary key provide this property. Unfortunately, for many applications, including social networking and order fulfillment, simple partitioning schemes applied to many-to-many relationships create a large fraction of distributed queries/transactions. What is needed is a fine-grained partitioning, where related individual tuples (e.g., cliques of friends) are co-located in the same partition. Maintaining a fine-grained partitioning requires storing the location of each tuple. We call this metadata a lookup table. We present a design that efficiently stores very large tables and maintains them as the database is modified. We show that they improve scalability for several difficult-to-partition database workloads, including Wikipedia, Twitter, and TPC-E. Our implementation provides 40% to 300% better throughput on these workloads than simple range or hash partitioning.
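The lookup-table idea can be sketched as follows. This is an illustration only, not the paper's design: the class `LookupTable`, the partition count, the sample keys, and the fallback to hashing for keys without an entry are all assumptions.

```python
import zlib

NUM_PARTITIONS = 4  # assumed cluster size

def hash_partition(key):
    """Baseline: place a tuple by hashing its primary key."""
    return zlib.crc32(str(key).encode()) % NUM_PARTITIONS

class LookupTable:
    """Maps individual tuple keys to partitions.

    Keys without an explicit entry fall back to hashing (an assumption
    of this sketch, not something stated in the abstract).
    """

    def __init__(self):
        self._location = {}

    def place(self, key, partition):
        self._location[key] = partition

    def locate(self, key):
        return self._location.get(key, hash_partition(key))

def partitions_touched(keys, locate):
    """Set of partitions a transaction over `keys` must visit."""
    return {locate(k) for k in keys}

# Hypothetical clique of friends that one transaction reads together.
clique = ["alice", "bob", "carol"]
table = LookupTable()
for user in clique:
    table.place(user, 2)  # co-locate the whole clique on partition 2

single = partitions_touched(clique, table.locate)
hashed = partitions_touched(clique, hash_partition)
```

With the lookup table, the transaction over the clique visits exactly one partition. Under plain hashing the same keys may land on several partitions, which is what turns a local transaction into a distributed one.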
This paper presents the reliability mechanisms of SDD-1, a prototype distributed database system being developed by the Computer Corporation of America. Reliability algorithms in SDD-1 center around the concept of the...
A locking protocol to coordinate access to a distributed database and to maintain system consistency throughout normal and abnormal conditions is presented. The proposed protocol is robust in the face of crashes of any participating site, as well as communication failures. Recovery from any number of failures during normal operation or any of the recovery stages is supported. Recovery is done in such a way that maximum forward progress is achieved by the recovery procedures. Integration of virtually any locking discipline, including predicate lock methods, is permitted by this protocol. The locking algorithm operates, and operates correctly, when the network is partitioned, either intentionally or by failure of communication lines. Each partition is able to continue with work local to it, and operation merges gracefully when the partitions are reconnected. A subroutine of the protocol that assures reliable communication among sites is shown to have better performance than two-phase commit methods. For many topologies of interest, the delay introduced by the overall protocol is not a direct function of the size of the network. The communications cost is shown to grow in a relatively slow, linear fashion with the number of sites participating in the transaction. An informal proof of the correctness of the algorithm is also presented in this paper. The algorithm has as its core a centralized locking protocol with distributed recovery procedures. A centralized controller with local appendages at each site coordinates all resource control, with requests initiated by application programs at any site. However, no site experiences undue load. Recovery is broken down into three disjoint mechanisms: single-node recovery, merging of partitions, and reconstruction of the centralized controller and tables. The disjointness of the mechanisms contributes to comprehensibility and ease of proof. The paper concludes with a proposal for an extension aimed at optimizing operation of
ISBN: 0769522106 (print)
Here we consider the problem of parallel execution of the Join operation by a J2EE cluster. J2EE clusters are intended for coarse-grain distributed processing of multiple queries/business transactions over the Web. Thus, the possibility of using a J2EE cluster for fine-grain parallel computations (parallel Joins in our case) is intriguing and of practical interest. We have developed a new variant of the SFR algorithm for parallel computation of the Cartesian Product in Join operations and proved its optimality in terms of communication/execution-time tradeoffs via a simple lower bound. Our experimental results show that, although J2EE is considered a platform that uses complex interfaces and software entities, such as various types of Java beans, J2EE clusters can be used efficiently to execute the Join operation in parallel.
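To make the notion of a parallel Cartesian product concrete, here is a minimal sketch of generic block partitioning. It is not the SFR variant described in the abstract, whose details are not given here; the functions `chunks`, `block_tasks`, `join_block`, and `parallel_join`, the grid shape, and the sample relations are assumptions. Each block pairs one chunk of R with one chunk of S, so the blocks are independent and could each be assigned to a different cluster node. The sketch runs them sequentially.

```python
from itertools import product

def chunks(items, n):
    """Split items into n nearly equal contiguous chunks."""
    size, extra = divmod(len(items), n)
    out, start = [], 0
    for i in range(n):
        end = start + size + (1 if i < extra else 0)
        out.append(items[start:end])
        start = end
    return out

def block_tasks(r, s, rows, cols):
    """One task per (R-chunk, S-chunk) block of the Cartesian product."""
    return list(product(chunks(r, rows), chunks(s, cols)))

def join_block(task, predicate):
    """Evaluate the join predicate on every pair inside one block."""
    r_chunk, s_chunk = task
    return [(a, b) for a in r_chunk for b in s_chunk if predicate(a, b)]

def parallel_join(r, s, predicate, rows=2, cols=2):
    """Join R and S block by block; blocks are independent of each other."""
    results = []
    for task in block_tasks(r, s, rows, cols):
        results.extend(join_block(task, predicate))
    return results

# Hypothetical relations joined on their first attribute.
R = [(1, "a"), (2, "b"), (3, "c")]
S = [(2, "x"), (3, "y"), (4, "z")]
joined = parallel_join(R, S, lambda a, b: a[0] == b[0])
```

The grid shape controls the tradeoff the abstract mentions: more blocks mean less work per node, but each chunk of R and S must be shipped to more nodes.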
A distributed database is a collection of data stored in different locations of a distributed system. The processing of queries in distributed databases is quite complex but of great importance for information management. Students who have to learn this process have serious difficulty understanding it. In this work we present a web platform that helps students learn the processing and optimization of queries in distributed databases. The novelty of this platform is that, as far as we know, there is no similar graphical tool. It lets students visualize, step by step, the different phases of distributed query processing and shows how they are formed, making these concepts easier to understand. Moreover, having this web platform available always and everywhere indirectly has an impact on other competences, such as encouraging students' autonomous work and self-learning, adapting the teaching to their specific needs, and reinforcing the advantages of applying information technologies in teaching. The results of the tests carried out to validate the platform's functionality and student satisfaction were very positive. (C) 2020 The Authors. Published by Elsevier B.V.
This paper focuses on the fragment allocation problem in distributed databases and proposes an approach that minimizes query splitting. Query splitting occurs when a query has to access multiple servers to retrieve the fragments it needs, resulting in reduced system performance. The objective of minimizing query splitting is important because it captures many factors that affect the performance of the database, such as reducing response time and cost. The paper presents a column generation-based algorithm to solve the fragment allocation problem, which requires less fine-tuning of its parameters and outperforms the IP approach implemented by CPLEX in terms of the number of queries split and execution time. The approach and algorithm offer practical solutions to optimize the design of a distributed database system. The paper's contribution is significant as it fills the gap in the literature by offering a novel approach that minimizes query splitting, which can serve as a proxy for achieving a combination of other objectives such as minimizing costs, reducing response time, and balancing server workloads.
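The query-splitting objective can be made concrete with a small sketch that only evaluates a given allocation. It does not implement the column generation algorithm from the paper. The names `servers_needed` and `count_split_queries`, the sample queries, and the assumption that each fragment is stored on exactly one server (no replication) are illustrative.

```python
def servers_needed(query_fragments, allocation):
    """Servers a query must contact, given a fragment -> server allocation."""
    return {allocation[f] for f in query_fragments}

def count_split_queries(queries, allocation):
    """Number of queries whose fragments are spread over more than one server."""
    return sum(
        1 for fragments in queries.values()
        if len(servers_needed(fragments, allocation)) > 1
    )

# Hypothetical workload: each query lists the fragments it reads.
queries = {
    "q1": {"f1", "f2"},
    "q2": {"f3"},
    "q3": {"f2", "f3"},
}

# Two candidate allocations (assumes one copy of each fragment).
alloc_a = {"f1": "s1", "f2": "s1", "f3": "s2"}  # only q3 is split
alloc_b = {"f1": "s1", "f2": "s2", "f3": "s1"}  # q1 and q3 are split

splits_a = count_split_queries(queries, alloc_a)
splits_b = count_split_queries(queries, alloc_b)
```

Placing every fragment on one server would trivially give zero splits, so a realistic formulation also needs constraints such as server capacity. Those are outside this sketch.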
Author: Vlach, R
Charles Univ, Fac Math & Phys, Dept Software Engn, Prague 11800 1, Czech Republic
ISBN: 0769508197 (print)
Mobile agent technology raises a new dimension in distributed database processing. Of interest in this paper are mobile procedures querying multiple databases distributed over a network. In the execution model used, mobile execution can profit from both migration, which reduces the amount of transmitted data, and data prefetching at the most beneficial site. To achieve the lowest response time, an execution strategy should suggest an appropriate mix of agent migration, remote database access, and data prefetching. Since no universal strategy yields the optimal execution in all cases, we must settle for a possibly suboptimal solution. The main achievement of this paper is the proposal of four dynamic execution strategies of differing implementation complexity that behave differently under various conditions. The strategies were tested on the real Internet, and their performance is compared with each other and with a classical centralized stationary approach.
ISBN: 9781467366809; 9781467366793 (print)
Query optimization is principally a complex search task that looks for the best plan among the semantically equivalent plans that can be derived from a given query. The performance of any data processing system essentially depends on the ability of the query optimization procedure to find efficient query processing strategies. A distributed database system (DDS) is a group of autonomous, cooperating, integrated procedures. In a distributed environment, a query issued at one site may require data from remote sites. In query optimization, a cost is associated with every query execution plan. The cost is the sum of the local cost (the I/O and CPU cost at every site) and the cost of transmitting data between sites. The key issue in query optimization for a distributed database system is to obtain an effective query strategy with good accuracy and minimum response time or cost for executing the given query. This paper proposes a novel methodology that selects the best plan for executing a given query, employing a genetic algorithm strategy for distributed databases and a clustering approach within the databases to execute the query plan. Genetic algorithms are widely used and accepted methods for very challenging optimization problems. The proposed technique performs efficiently in different environments. The experimental analysis of the proposed methodology is carried out on 100 different queries, each involving 8 relations, distributed over 20 different sites. Compared with the DB2 distributed optimizers, the methodology achieves increased reliability and high performance with respect to optimization cost and accuracy for queries in distributed databases.
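The cost model stated in this abstract (local I/O and CPU cost at every site plus the cost of transmitting data between sites) can be written down as a short sketch. The types `SiteWork` and `Plan`, the plan names, and the numbers are hypothetical. The sketch picks the cheapest plan by exhaustive comparison over a tiny candidate set, which is a stand-in for the genetic algorithm search the paper proposes, not an implementation of it.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SiteWork:
    """Local work a plan performs at one site."""
    io_cost: float
    cpu_cost: float

@dataclass(frozen=True)
class Plan:
    """A candidate execution plan (all values here are hypothetical)."""
    name: str
    local_work: tuple      # one SiteWork per participating site
    transfer_costs: tuple  # cost of each data transfer between sites

def plan_cost(plan):
    """Total cost = sum of local (I/O + CPU) costs + sum of transfer costs."""
    local = sum(w.io_cost + w.cpu_cost for w in plan.local_work)
    return local + sum(plan.transfer_costs)

def cheapest(plans):
    """Exhaustive selection; a genetic algorithm would search a larger space."""
    return min(plans, key=plan_cost)

plan_a = Plan(
    name="ship-large-relation",
    local_work=(SiteWork(10.0, 2.0), SiteWork(5.0, 1.0)),
    transfer_costs=(20.0,),
)
plan_b = Plan(
    name="ship-small-relation",
    local_work=(SiteWork(12.0, 3.0),),
    transfer_costs=(8.0, 4.0),
)

best = cheapest([plan_a, plan_b])
```

In a genetic algorithm setting, `plan_cost` would play the role of the fitness function, with lower cost meaning a fitter plan.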