We present a formal framework for distributed databases, and we study the complexity of the concurrency control problem in this framework. Our transactions are partially ordered sets of actions, as opposed to the straight-line programs of the centralized case. The concurrency control algorithm, or scheduler, is itself a distributed program. Three notions of performance of the scheduler are studied and interrelated: (1) its parallelism, (2) the computational complexity of the problems it needs to solve, and (3) the cost of communication between the various parts of the scheduler. We show that the number of messages necessary and sufficient to support a given level of parallelism is equal to the minimax value of a combinatorial game. We show that this game is PSPACE-complete. It follows that, unless NP=PSPACE...
Keywords: distributed database, concurrency control, games, complexity, PSPACE-complete
Deadlocks may occur in distributed databases due to conflicts in data file locking. A system is in a deadlock condition if and only if a directed cycle exists in its demand graph. The difficulties of constructing a consistent demand graph are discussed, and three deadlock detection protocols for distributed databases are presented. The first protocol uses two communication phases; the second uses a single communication phase; the third, based on the second, is a one-phase hierarchical deadlock detection protocol. It is assumed in all three protocols that the information used to locate a resource in a distributed database is provided by a system-wide addressing scheme. The correct functioning of the protocols is independent of the addressing scheme.
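The deadlock condition stated above (a directed cycle in the demand graph) can be illustrated with a small sketch. It assumes the demand graph has already been assembled consistently at one site, which is exactly the hard part the abstract refers to, and it does not reproduce any of the three protocols. The function name `has_deadlock` and the adjacency-dict representation are hypothetical choices made for illustration.

```python
from typing import Dict, Hashable, List


def has_deadlock(demand_graph: Dict[Hashable, List[Hashable]]) -> bool:
    """Return True if the directed demand graph contains a cycle.

    demand_graph maps each node (e.g. a transaction) to the nodes it is
    waiting on. This representation is an assumption, not the paper's.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current DFS path, finished
    colour: Dict[Hashable, int] = {}

    def visit(node: Hashable) -> bool:
        colour[node] = GREY
        for succ in demand_graph.get(node, []):
            state = colour.get(succ, WHITE)
            if state == GREY:
                # Back edge to a node on the current path: directed cycle.
                return True
            if state == WHITE and visit(succ):
                return True
        colour[node] = BLACK
        return False

    return any(
        colour.get(node, WHITE) == WHITE and visit(node)
        for node in demand_graph
    )


# T1 waits on T2, T2 on T3, T3 on T1: a cycle, hence a deadlock.
cyclic = {"T1": ["T2"], "T2": ["T3"], "T3": ["T1"]}
# Same chain without the closing edge: no deadlock.
acyclic = {"T1": ["T2"], "T2": ["T3"], "T3": []}
print(has_deadlock(cyclic), has_deadlock(acyclic))
```

The recursive depth-first search is adequate for small graphs; a very long wait chain would need an iterative traversal to stay within Python's recursion limit.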
A model is developed for determining the optimal policy for processing a given relational model query. The model is based on operating cost (processing cost and communication cost), which is a function of selection of sites for processing query operations, sequence of operations, file size, and data reduction functions. The optimal policy specifies the site selection and sequence of operations that yield minimum operating cost.
The optimal distribution of a database schema over a number of sites in a distributed network is investigated. The database is modeled in terms of objects and links. The design is driven by user-supplied information about data distribution. The inputs required by the optimization model are: 1. cardinality and size information about objects and links, 2. a set of candidate horizontal partitions of relations into fragments and the allocations of the fragments, and 3. the specification of all important transactions, their frequencies, and their sites of origin. An optimization model for a nonreplicated data allocation is developed in the form of a linear integer zero-one programming problem. The objective function is the total transaction processing cost. A decomposition heuristic is introduced to reduce the complexity. The model allows a formal solution to a design problem which is too complex to be solved by random search and for which no good directed search algorithms are known.
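To make the zero-one formulation concrete, the following sketch enumerates every nonreplicated assignment of fragments to sites and keeps the cheapest feasible one. Everything about the cost model is an assumption made for illustration: a transaction pays `frequency * fragment_size * remote_cost` for each fragment it accesses that is stored away from its site of origin, and an optional per-site storage capacity acts as the constraint. The abstract does not give the paper's objective function or its decomposition heuristic, and neither is reproduced here.

```python
from itertools import product


def best_allocation(fragments, sites, transactions, capacity=None, remote_cost=1.0):
    """Exhaustively search nonreplicated fragment allocations.

    fragments:    dict fragment name -> size
    sites:        list of site names
    transactions: list of (origin_site, frequency, [fragment names accessed])
    capacity:     optional dict site -> maximum total fragment size
    All parameter names and the cost model are hypothetical.
    """
    names = list(fragments)
    best_cost, best_assign = float("inf"), None
    # Each fragment is placed at exactly one site (no replication).
    for choice in product(sites, repeat=len(names)):
        assign = dict(zip(names, choice))
        if capacity is not None:
            load = {s: 0 for s in sites}
            for f, s in assign.items():
                load[s] += fragments[f]
            if any(load[s] > capacity[s] for s in sites):
                continue  # violates a storage constraint
        cost = 0.0
        for origin, freq, accessed in transactions:
            for f in accessed:
                if assign[f] != origin:
                    cost += freq * fragments[f] * remote_cost
        if cost < best_cost:
            best_cost, best_assign = cost, assign
    return best_assign, best_cost


fragments = {"F1": 10, "F2": 5}
sites = ["A", "B"]
transactions = [("A", 3, ["F1"]), ("B", 2, ["F1", "F2"]), ("A", 1, ["F2"])]
print(best_allocation(fragments, sites, transactions))
```

Exhaustive enumeration grows as the number of sites raised to the number of fragments, which is why realistic instances call for an integer programming solver or a heuristic such as the decomposition the abstract mentions.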
A quantitative method is presented for evaluating availability in distributed database systems. The description of the distributed system and of transaction processing is given in terms of a flow graph. System states are represented by a structure vector. Transitions between states are modeled as a Markov process. Solution techniques are discussed both for the case in which transition rates are independent of the system state and for the case in which they depend on it. Finally, the results for an example are given.
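A minimal sketch of the state-independent case follows, under assumptions that are not taken from the paper: each site fails and is repaired independently at constant rates, the structure vector is read as a tuple of up/down flags (one per site), and a caller-supplied predicate decides which vectors count as operational. With these assumptions each site is a two-state Markov chain whose steady-state probability of being up is `mu / (lam + mu)`, and system availability is the total probability of the operational vectors. The flow graph description from the abstract is not modeled.

```python
from itertools import product


def availability(fail_rates, repair_rates, is_operational):
    """Steady-state availability for independently failing sites.

    fail_rates / repair_rates: per-site rates (lambda_i, mu_i).
    is_operational: predicate on a tuple of 0/1 flags (1 = site up).
    All names here are hypothetical, chosen for illustration.
    """
    # Steady-state probability that each site is up: mu / (lambda + mu).
    up_prob = [mu / (lam + mu) for lam, mu in zip(fail_rates, repair_rates)]
    total = 0.0
    for state in product((0, 1), repeat=len(up_prob)):
        p = 1.0
        for up, a in zip(state, up_prob):
            p *= a if up else 1.0 - a
        if is_operational(state):
            total += p
    return total


# Two sites, each up 90% of the time in steady state.
rates = dict(fail_rates=[1.0, 1.0], repair_rates=[9.0, 9.0])
print(availability(**rates, is_operational=lambda s: any(s)))  # at least one copy reachable
print(availability(**rates, is_operational=lambda s: all(s)))  # every site required
```

When transition rates depend on the system state, the product form above no longer applies and the balance equations of the full chain have to be solved instead.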
This study proposes a robust concurrency control scheme that is reliable and easy to implement and expand. It falls under the category of least-cost concurrency control techniques. A centralized certifier design is also proposed; it addresses difficulties encountered in previously proposed certifier methods. Unlike a common centralized control scheme, the proposed scheme is shown to be under no threat from failure of the central node. It follows that the scheme reaps the advantages of a centralized control scheme while still incurring the least cost for maintaining high reliability. Figures.
One of the most important considerations in developing a distributed database system is the concurrency control mechanism. Recently, many arguments have been advanced in favor of the optimistic solution to concurrency control. This work reviews two algorithms that apply the Kung-Robinson proposal to a distributed database system. A different algorithm originally proposed by Badal is developed and expanded. This new algorithm switches from an optimistic mode of detecting and resolving non-serializable execution to a pessimistic mode of preventing non-serializable execution when the degree of conflict reaches a certain level. In other words, the algorithm adapts itself to the degree of conflict. Representative optimistic algorithms are then compared with two-phase locking and two-phase commit under different scenarios. Conclusions are drawn based on the performance of the algorithms under the different scenarios. The new algorithm appears to perform better than any of the other concurrency control mechanisms.
In this paper, a net model for decentralized control of user accesses to a distributed database is proposed. It is developed in detail for the restricted case of updating distributed copies of a single database. Predicate/transition-nets, a first-order extension of Petri nets, are shown to provide suitable means for concise representation of complex decentralized systems and for their rigorous formal analysis. It will be demonstrated in the present paper how these net models can be constructed and interpreted in a quite natural manner and how they can be analyzed by linear algebraic methods. By this, it will be shown that the modeled distributed database system is deadlock-free and guarantees a consistent database as well as a fair and effective service to the users.
We consider the effect on system performance of the distribution of a data base in the form of multiple copies at distinct sites. The purpose of our analysis is to determine the gain in READ throughput that can be obtained in the presence of consistency preserving algorithms that have to be implemented when UPDATE operations are carried out on each copy. We show that READ throughput diminishes if the number of copies exceeds an optimal value. The theoretical model we develop is applied to a system in which consistency is preserved through the use of Ellis' ring algorithm.
A join operation consists of comparing each record in a file with all the records of a second file in order to build a third file. In a distributed database, the files may be partitioned into fragments which could be stored at different nodes of the network. In order to perform a join operation on a set of network nodes, it is necessary to transmit the files to the nodes in the network. A set of execution nodes is optimal if it minimizes the transmission costs of the fragments of the files that are transmitted. This problem can be solved with heuristic or implicit enumeration algorithms. Good estimates of the number of elements in an optimal solution set are required for good efficiency. An upper bound can be determined for the set of execution nodes as a function of the sizes of the source and destination files.