The goal of checkpointing in database management systems is to save database states on a separate secure device so that the database can be recovered when errors and failures occur. Recently, the possibility of having...
详细信息
The goal of checkpointing in database management systems is to save database states on a separate secure device so that the database can be recovered when errors and failures occur. Recently, the possibility of having a checkpointing mechanism which does not interfere with the transaction processing has been studied [4], [7]. Users are allowed to submit transactions while the checkpointing is in progress, and the transactions are performed in the system concurrently with the checkpointing process. This property of noninterference is highly desirable to real-time applications, where restricting transaction activity during the checkpointing operation is in many cases not feasible. In this paper, a new algorithm for checkpointing in distributed database systems is proposed and its correctness is proved. The practicality of the algorithm is discussed by analyzing the extra work- load and the robustness of it with respect to site failures. [ABSTRACT FROM AUTHOR]
The semijoin offers a method of reducing the amount of data transmission among sites in a distributed database system. Previously, the semijoin has been studied primarily for reducing communication cost in an environ...
详细信息
The semijoin offers a method of reducing the amount of data transmission among sites in a distributed database system. Previously, the semijoin has been studied primarily for reducing communication cost in an environment with global public communication networks. In a local area system, however, wide bandwidth usually is available, and the cost of communication is virtually negligible. In light of this, a simplified model of a local area network is adopted, with no constraint imposed on the transmission line capacity and the communication processing capability at each site. Each site is assumed to be able to send and/or receive any amount of data simultaneously. For the model, an efficient algorithm for deriving the shortest semijoin schedule -- in the sense of minimizing the total number of semijoin transmissions -- is developed for the class of tree queries. The algorithm is based on a schedule diagram newly introduced to represent the semijoin schedule.
A distributed computer control system is being implemented at the Los Alamos Proton Storage Ring using DEC MicroVAX computers with a distributed database. The first prototype system is now in place and has been runnin...
详细信息
A distributed computer control system is being implemented at the Los Alamos Proton Storage Ring using DEC MicroVAX computers with a distributed database. The first prototype system is now in place and has been running during the last storage ring running period. This paper describes the implementation, initial tests, and future plans for the system.
The transaction backout problem arises in the area of distributed databases. Suppose failures partition a data-redundant distributed database, and each partition continues to function as if it were the entire database...
详细信息
The transaction backout problem arises in the area of distributed databases. Suppose failures partition a data-redundant distributed database, and each partition continues to function as if it were the entire database. When the network is reconnected, the transactions executed by different partitions may not be serializable, and hence it may be necessary to backout some of the transactions. The transaction backout problem is to remove the smallest set of transactions that will leave the remaining ones serializable. The general problem is NP-complete, and in this paper we show that the special case of a fixed-size database and two partitions can be solved in polynomial time by dynamic programming.
Compared to centralized database systems, distributed systems have certain advantages dependent on the manner in which data are redundantly distributed. These advantages include: 1. improved response time, 2. bette...
详细信息
Compared to centralized database systems, distributed systems have certain advantages dependent on the manner in which data are redundantly distributed. These advantages include: 1. improved response time, 2. better data availability, and 3. reduced transmission costs. However, distributed systems create complexities necessitating various distribution controls such as global deadlock resolution and consistency maintenance, which result in increased overhead. One of the factors mitigating this overhead is to find the optimal data-allocation schemes determined by the application environment. Thus, it is necessary to consider trade-offs between data availability, which is suitable for applications, and distribution control overhead. Some data allocation schemes are proposed, derived from an assessment of results obtained from a series of simulations that focus on both response time and communication cost against distribution control overhead in some application environments.
This paper studies the problem of query optimization in a distributed database. Assuming a linear additive cost function (in volume of data moved), we present a fast algorithm for finding the optimal program that answ...
详细信息
This paper studies the problem of query optimization in a distributed database. Assuming a linear additive cost function (in volume of data moved), we present a fast algorithm for finding the optimal program that answers a class of common queries, called chain queries. The key to the problem formulation and to the derivation of an efficient algorithm is an elegant parameterization of the database state against which the query is to be answered. This parameterization then enables us to characterize the set of potentially optimal programs, which in turn leads to a fast dynamic programming algorithm. Since in practice the needed parameters may not be available to the database system, we also discuss how to deal with partial parameterizations of the database state.
Consistency control protocols can be classified as either pessimistic or optimistic. Pessimistic protocols check for conflicting file accesses before a transaction references shared files; this prevents transaction re...
详细信息
Consistency control protocols can be classified as either pessimistic or optimistic. Pessimistic protocols check for conflicting file accesses before a transaction references shared files; this prevents transaction restarts but adds intercomputer synchronization delays to execution response times TE. Optimistic protocols avoid intercomputer synchronization delays for TE, but existing optimistic protocols repeatedly restart a transaction until it executes without conflict. Repeated restarts lengthen the time to finalize an update TU, and can saturate the computing and communication resources. We present two new optimistic protocols that avoid repeated restarts: the exclusive-writer protocol (EWP) and the exclusive-writer protocol with locking option (EWL). EWP has no transaction restarts, database rollbacks, or deadlocks due to shared data access. But EWP ensures only a limited form of serializability. EWL is an extension of EWP that ensures full serializability. EWL has no database rollbacks. Also, EWL can guarantee that a transaction will be restarted at most once. To further reduce restarts, each site can independently and dynamically switch between primary site locking (PSL), which has no restarts, and EWL. Such switching requires no additional messages or delays to synchronize protocol selection. Analytic models are developed to study the response times (i.e., TE and TU) of EWP, EWL, PSL, and basic timestamps (BTS). Our study reveals that EWP and EWL have the smallest TF since neither requires update-log maintenance (unlike BTS) nor intercomputer synchronization delays for TE (unlike PSL). EWP has the smallest TU unless the cost of communicating and processing updates is high.
Concurrent broadcase involves the dissemination of a database, consisting of messages initially distributed among the nodes of a network, so that a copy of the entire database eventually resides at each node. This pap...
详细信息
Concurrent broadcase involves the dissemination of a database, consisting of messages initially distributed among the nodes of a network, so that a copy of the entire database eventually resides at each node. This paper examines the time complexity and communication complexity of several distributed procedures for concurrent broadcast. Special properties of concurrent broadcast in a tree are also given. The present time complexity results can be used to bound the time during which inconsistent databases may reside at different nodes, to evaluate and compare procedures for (or including) concurrent broadcast, and to schedule a sequence of instances of concurrent broadcast so that the instances do not overlap and there is not need for sequence numbers.
作者:
ELLIS, CSDept. of Comput. Sci.
Rochester Univ. NY USA Abstract Authors References Cited By Keywords Metrics Similar Download Citation Email Print Request Permissions
In spite of the amount of work recently devoted to distributed systems, distributed applications are relatively rare. One hypothesis to explain this scarcity of examples is a lack of experience with algorithm design t...
详细信息
In spite of the amount of work recently devoted to distributed systems, distributed applications are relatively rare. One hypothesis to explain this scarcity of examples is a lack of experience with algorithm design techniques tailored to an environment in which out-of-date and incomplete information is the rule. Since the design of data structures is an important aspect of traditional algorithm design, the author feels that it is important to consider the problem of distributing data structures. She investigates these ❉s by developing a distributed version of an extensible hash file, which is a dynamic indexing structure which could be useful in a distributed database.
Deadlock handling is an important part of transaction management in a database system. An algorithm is presented for detecting deadlocks in a distributed database system. It uses priorities for transactions to minim...
详细信息
Deadlock handling is an important part of transaction management in a database system. An algorithm is presented for detecting deadlocks in a distributed database system. It uses priorities for transactions to minimize the number of messages initiated for detecting deadlocks. It does not construct any wait-for graph but follows the edges of the graph to detect cycles (edge-chasing). It does not find any phantom deadlock (in the absence of failures), and for the resolution of deadlocks it does not need any additional calculation. The algorithm also incorporates a post-resolution computation that leaves information characterizing dependence relations of remaining transactions in the system's deadlock cycle; this will aid in detecting and resolving future deadlocks. Using the algorithm, it is possible to compute the exact number of messages generated for a given deadlock configuration. The basic algorithm is presented; then, it is extended to take into account shared and exclusive lock modes, simultaneous acquisition of multiple locks, and nested transactions.
暂无评论