Query optimization is considered the most significant part of a distributed database model. The optimizer tries to find an optimal join order that reduces the query execution cost. Several factors may affect the cost of query execution, including the number of relations, communication costs, resources, and access to large distributed data sets. The success of a processed query depends heavily on the search methodology implemented by the query optimizer. Query processing is considered an NP-hard problem, and many researchers are focusing on it. Researchers are trying to find an appropriate algorithm to seek an ideal solution, especially as the size of the database increases. For large queries, classical heuristic methods such as ant colony and genetic algorithms cannot cover the whole search space and may fall into a local minimum. In this paper, the quantum-inspired ant colony algorithm (QIACO), a hybrid strategy of probabilistic algorithms, is used to improve the query join cost in the distributed database model. The diversification ability of quantum computing helps cover the large query search space, which helps in selecting the best trail and thus improves the slow convergence speed and avoids falling into a local optimum. Using this strategy, the algorithm aims to find an optimal join order that minimizes the total execution time. Experimental results show that the proposed model converges faster and with better solution quality than the classic ant colony model for the same number of ants.
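To make the join-ordering objective concrete, the following is a minimal sketch of a cost function over left-deep join orders, together with an exhaustive search that is only feasible for a handful of relations. It is not the QIACO algorithm from the abstract. The cost model (sum of estimated intermediate result sizes), the function names, and the cardinalities and selectivities are all illustrative assumptions.

```python
from itertools import permutations

def join_order_cost(order, card, sel):
    """Assumed cost model: sum of estimated intermediate result sizes
    for a left-deep join order. Not the cost model of the paper."""
    joined = [order[0]]
    size = card[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size *= card[rel]
        # Apply the selectivity of every join predicate that links the new
        # relation to one already joined (1.0 means no predicate).
        for prev in joined:
            size *= sel.get(frozenset((prev, rel)), 1.0)
        cost += size
        joined.append(rel)
    return cost

def best_join_order(card, sel):
    """Exhaustive search over all n! orders; only practical for small n,
    which is why heuristic search is used for large queries."""
    return min(permutations(card), key=lambda o: join_order_cost(o, card, sel))

# Hypothetical statistics for a chain query A - B - C.
card = {"A": 1000, "B": 100, "C": 10}
sel = {frozenset(("A", "B")): 0.01, frozenset(("B", "C")): 0.1}

best = best_join_order(card, sel)
print(best, join_order_cost(best, card, sel))
```

In this toy instance, joining the two small relations first keeps the intermediate result small, while starting with the cross product of A and C is an order of magnitude more expensive. A heuristic such as ant colony optimization would sample orders from this same space instead of enumerating it.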
The authors explain what is meant by a distributed database system and discuss its characteristics. They survey the state of distributed database technology, focusing on how well products meet the goals of transparent management of distributed and replicated data, reliability through distributed transactions, better performance, and easier, more economical system expansion. They then consider unsolved problems with regard to network scaling, distribution design, distributed query processing, distributed transaction processing, integration with distributed operating systems, and distributed multidatabase systems.
Distributed database systems provide a new data processing and storage technology for today's decentralized organizations. Query optimization, the process of generating an optimal execution plan for a posed query, is more challenging in such systems because distribution creates a huge search space of alternative plans. As finding an optimal execution plan is computationally intractable, stochastic algorithms have drawn the attention of most researchers. In this paper, for the first time, a multi-colony ant algorithm is proposed for optimizing join queries in a distributed environment where relations can be replicated but not fragmented. In the proposed algorithm, four types of ants collaborate to create an execution plan, so there are four ant colonies in each iteration. Each type of ant makes an important decision in finding the optimal plan. To evaluate the quality of the generated plan, two cost models are used: one based on the total time and the other on the response time. The proposed algorithm is compared with two previous genetic-based algorithms on chain, tree, and cyclic queries. The experimental results show that the proposed algorithm saves up to about 80% of optimization time, with no significant difference in the quality of the generated plans compared with the best existing genetic-based algorithm.
Catalog management schemes may affect site autonomy, query optimization, view management, and data distribution transparency. However, the performance comparison of various catalog architectures has received relatively little attention. We employ simulation models to investigate the relative performance of six catalog management schemes (a centralized catalog, two variations of fully replicated catalogs, and three variations of partitioned catalogs) in a locally distributed database system and a geographically distributed database system. We show that the three variations of partitioned catalogs perform better than the centralized catalog and the fully replicated catalogs over a wide range. The performance of the centralized catalog and of the fully replicated catalogs with quorum consensus is the worst because of queuing delays in several queues. Our simulation results also indicate that the performance difference among the variations of partitioned catalogs is mainly due to the recompilation rate.
We suggest a new probe message structure and an efficient probe-based deadlock detection and recovery algorithm that can be used in distributed database systems. We determine the characteristics of the probe messages and suggest an algorithm that can reduce the communication cost required for deadlock detection and recovery.
ISBN: 9780769535579 (print)
An important problem in distributed database systems (DDBs) is data allocation. Many methods exist for this problem, and two measures are used to compare them: minimal cost and performance. In this paper we propose a new method based on a genetic algorithm. First we generate clusters based on the communication cost between the sites, then run the genetic algorithm on these clusters to find which one is the best location for allocating data, and finally allocate data to the sites in the same way. We improve the performance of the DDB by reducing communication cost and data redundancy and by minimizing the total data transfer cost, using the genetic algorithm and site grouping. We compare our results with another model; our proposed model has higher performance.
ISBN: 9783642241055; 9783642241062 (print)
The performance of OLAP queries becomes a critical issue as the amount of data in data warehouses increases rapidly. To solve this performance issue, we proposed a high-performance database cluster system called HyperDB, in which many PCs can be mobilized for excellent performance. In HyperDB, an OLAP query can be decomposed into sub-queries, and each sub-query can be processed independently on a PC in a short time. However, if an OLAP query has a nested form (i.e., nested SQL), it cannot be decomposed into sub-queries. In this paper, we propose a parallel distributed query processing algorithm for nested queries in the HyperDB system. Traditionally, parallel distributed processing of nested queries is known as a difficult problem in the database area.
ISBN: 9789811007552 (digital); 9789811007552; 9789811007545 (print)
Data allocation plays a significant role in the design of distributed database systems. Data transfer cost is a major cost of executing a query in a distributed database system, so the performance of distributed database systems depends greatly on the allocation of data among the different sites of the network. The performance of static data allocation algorithms decreases as the retrieval and update access frequencies of queries from different sites to fragments change. Selecting a suitable allocation method is therefore a key design issue. In this paper, a data allocation framework for non-replicated dynamic distributed database systems using the threshold and time constraint algorithm (TTCA) is developed, and the performance of TTCA is evaluated against the threshold algorithm on the basis of the total cost of reallocation and the number of migrations of fragments from one site to another.
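As an illustration of what dynamic, non-replicated allocation can look like, here is a simplified counter-based migration rule. It is only a sketch under assumptions: the abstract does not specify the threshold algorithm or TTCA, so the class name, the reset-on-local-access behaviour, and the choice of migrating to the last accessing site are hypothetical, and the time constraint of TTCA is not modelled at all.

```python
class Fragment:
    """Hypothetical fragment that migrates after too many consecutive
    remote accesses. Illustrative only; not the algorithm of the paper."""

    def __init__(self, owner, threshold):
        self.owner = owner            # site currently storing the fragment
        self.threshold = threshold    # allowed consecutive remote accesses
        self.remote_streak = 0        # remote accesses since last local one
        self.migrations = 0           # number of migrations so far

    def access(self, site):
        if site == self.owner:
            # A local access resets the streak (assumed behaviour).
            self.remote_streak = 0
            return
        self.remote_streak += 1
        if self.remote_streak > self.threshold:
            # Assumed policy: move the fragment to the site whose access
            # pushed the streak over the threshold.
            self.owner = site
            self.remote_streak = 0
            self.migrations += 1

fragment = Fragment(owner="S1", threshold=2)
for site in ["S2", "S2", "S1", "S2", "S3", "S2"]:
    fragment.access(site)
print(fragment.owner, fragment.migrations)
```

The number of migrations recorded here corresponds to one of the two evaluation criteria named in the abstract. A rule like this can move a fragment back and forth when access patterns alternate, which is the kind of behaviour an additional time constraint might be meant to dampen.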
A distributed database system often replicates data across its servers to provide a fault-resistant application, which maximizes server availability. Various replication control protocols have been developed to ensure data consistency. In this paper, we develop optimal design methods for the quorum-consensus replication protocol, which (1) maximize the availability of the distributed database system and (2) minimize the total system cost by calculating the optimal read quorum and the optimal number of system servers. Several numerical examples and applications are provided to illustrate the results.
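The trade-off behind choosing a read quorum can be sketched numerically. The model below is an assumption, not the method of the paper: each of n servers is up independently with probability p, every replica has one vote, the write quorum is w = n - r + 1 so that read and write quorums always overlap, r is limited so that two write quorums also overlap, and availability is the read/write-weighted probability that a quorum can be assembled. The function names are hypothetical.

```python
from math import comb

def at_least_k_up(n, k, p):
    """Probability that at least k of n independent servers are up."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def best_read_quorum(n, p, read_fraction):
    """Return (r, availability) maximizing the weighted availability.

    Assumes w = n - r + 1 (so r + w > n) and r <= (n + 1) // 2
    (so 2 * w > n); both are modelling assumptions of this sketch.
    """
    best = None
    for r in range(1, (n + 1) // 2 + 1):
        w = n - r + 1
        avail = (read_fraction * at_least_k_up(n, r, p)
                 + (1 - read_fraction) * at_least_k_up(n, w, p))
        if best is None or avail > best[1]:
            best = (r, avail)
    return best

# Hypothetical configuration: 5 servers, each up 90% of the time.
for read_fraction in (1.0, 0.5, 0.0):
    print(read_fraction, best_read_quorum(5, 0.9, read_fraction))
```

With a read-only workload the smallest read quorum is best, and with a write-only workload the largest permitted read quorum (hence the smallest write quorum) is best. The cost side of the optimization described in the abstract is not modelled here.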
The performance and efficiency of a distributed database system depend highly on the way data are allocated to the sites. The NP-completeness of the data allocation problem and the large size of its real-world instances call for a fast and scalable heuristic algorithm. In this paper, we address the data allocation problem in terms of minimizing two different types of data transmission across the network: transmissions due to site-fragment dependencies and those caused by inter-fragment dependencies. We propose a new heuristic algorithm based on the ant colony optimization meta-heuristic, with regard to the strategies applied for query optimization and integrity enforcement. The goal is to design an efficient data allocation scheme that minimizes the total transaction response time under the memory capacity constraints of the sites. Experimental tests indicate that our algorithm is capable of producing near-optimal solutions within a reasonable time. The results also reveal the flexibility and scalability of the proposed algorithm.