ISBN: 9781665424615 (print)
The design of distributed databases has become demanding with the increasing use of IoT and cloud-based services. The performance of a distributed database system depends entirely on its design. Data allocation is one of the major design issues in distributed databases. This paper presents a new technique for non-redundant allocation of data in distributed database design. The proposed approach allocates the data using Simplified Biogeography-Based Optimization (Simplified-BBO). The Simplified-BBO-based approach is compared against GA- and BBO-based approaches. The proposed approach helps decrease the data communication cost during query execution, which increases the overall performance of distributed database systems.
The performance of a distributed database system depends particularly on the site allocation of the fragments. Queries access different fragments among the sites, and an originating site exists for each query. A data allocation algorithm should distribute the fragments to minimize the transfer and settlement costs of executing the query plans. The primary cost for a data allocation algorithm is the cost of data transmission across the network. The data allocation problem in a distributed database is NP-complete, and scalable evolutionary algorithms have been developed to minimize the execution costs of the query plans. In this paper, quadratic assignment problem heuristics were designed and implemented for the data allocation problem. The proposed algorithms find near-optimal solutions for the data allocation problem. In addition to the fast ant colony, robust tabu search, and genetic algorithm solutions to this problem, we propose a fast and scalable hybrid genetic multi-start tabu search algorithm that outperforms the other well-known heuristics in terms of execution time and solution quality.
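The quantity being minimized above can be illustrated with a minimal sketch. It assumes a non-replicated allocation, a symmetric unit-cost matrix between sites, and a per-query table of data volumes; the function names and all numbers are hypothetical. Exhaustive search stands in for the paper's heuristics only to show the objective, and the dependencies between fragments that make the real problem a quadratic assignment problem are not modeled.

```python
from itertools import product


def allocation_cost(alloc, queries, unit_cost):
    """Total transfer cost of a non-replicated allocation.

    alloc[f]        -> site that stores fragment f
    queries         -> list of (origin_site, {fragment: data_volume})
    unit_cost[a][b] -> assumed cost of moving one unit of data from a to b
    """
    total = 0
    for origin, accesses in queries:
        for frag, volume in accesses.items():
            total += volume * unit_cost[alloc[frag]][origin]
    return total


def best_allocation(n_frags, n_sites, queries, unit_cost):
    """Exhaustive search over n_sites ** n_frags allocations (tiny inputs only)."""
    return min(product(range(n_sites), repeat=n_frags),
               key=lambda a: allocation_cost(a, queries, unit_cost))


# Hypothetical instance: 3 sites, 3 fragments, 3 queries.
unit_cost = [[0, 4, 6],
             [4, 0, 3],
             [6, 3, 0]]
queries = [(0, {0: 10, 1: 2}),   # query issued at site 0
           (1, {1: 8}),          # query issued at site 1
           (2, {2: 5, 0: 1})]    # query issued at site 2

best = best_allocation(3, 3, queries, unit_cost)
print(best, allocation_cost(best, queries, unit_cost))
```

Because the search space grows as the number of sites raised to the number of fragments, enumeration is only usable as a reference on very small instances, which is why heuristics are applied in practice.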
Medical images are growing dramatically in both quantity and application in the data era, contributing to the emerging Big Data problem. Finding the right medical image among such a huge number of images is not possible without a proper medical search engine. Indexing is the backend process in information retrieval systems in which documents are annotated with index entries so that they can be retrieved more accurately and efficiently. Most indexing techniques for medical images are content-based, which makes them more complex and time-consuming than text-based ones. In this paper, a text-based medical image indexing technique is proposed that takes the attributes of medical images, fragments them with the hybrid fragmentation approach (as used in distributed database design), and re-forms each attribute fragment into a hierarchy, constructing a multidimensional index. The hybrid fragmentation approach uses both horizontal and vertical fragmentation of the image attributes provided in the header of standard medical image formats (such as DICOM). Horizontal fragmentation uses the values of image attributes (i.e., it depends on image content and properties), whilst vertical fragmentation uses the pairwise affinity and correlation of the attributes in the application domain (i.e., it is application dependent). The proposed hybrid-fragmentation-based indexing of medical images therefore aims to consider both the image properties and the application statistics together to provide better functionality. As the experimental performance evaluation results illustrate, the proposed multidimensional indexing provides better retrieval precision than a single index or a set of multiple indexes, since it considers the semantic relationships among the medical image's attributes via the hybrid (horizontal and vertical) fragmentation. Moreover, the hybrid-fragmentation-based indexing also outperforms the vertical fragmentation-based multidimensional medi
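The pairwise attribute affinity that drives vertical fragmentation can be sketched as follows. This assumes the common usage-based definition, in which the affinity of two attributes is the summed access frequency of the queries that use both; the abstract does not state the exact measure, and the attribute names (loosely modeled on DICOM header fields), queries, and frequencies are hypothetical.

```python
from itertools import combinations


def attribute_affinity(attributes, query_usage):
    """Build a symmetric attribute affinity matrix.

    query_usage: list of (set_of_attributes_used, access_frequency).
    aff[a][b] is the total frequency of queries that use both a and b;
    aff[a][a] is the total frequency of queries that use a.
    """
    aff = {a: {b: 0 for b in attributes} for a in attributes}
    for used, freq in query_usage:
        for a in used:
            aff[a][a] += freq
        for a, b in combinations(sorted(used), 2):
            aff[a][b] += freq
            aff[b][a] += freq
    return aff


# Hypothetical attributes and query workload.
attrs = ["PatientID", "Modality", "StudyDate", "BodyPartExamined"]
usage = [({"PatientID", "StudyDate"}, 30),
         ({"Modality", "BodyPartExamined"}, 20),
         ({"PatientID", "Modality", "BodyPartExamined"}, 5)]

aff = attribute_affinity(attrs, usage)
for a in attrs:
    print(a, [aff[a][b] for b in attrs])
```

Attributes with high mutual affinity (here `Modality` and `BodyPartExamined`) are candidates for the same vertical fragment; a clustering step, not shown, would turn the matrix into fragments.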
The two important aspects of the design of distributed database systems are operation allocation and data allocation. Operation allocation refers to the query execution plan, which indicates which operations (subqueries) should be allocated to which sites in a computer network so that query processing costs are minimized. Data allocation assigns relations to sites so that the performance of the distributed database is improved. In this research, we developed a solution technique for the operation allocation and data allocation problem using three objective functions: total time minimization, response time minimization, and the combination of total time and response time minimization. We formulated these allocation problems and provided analytical cost models for each objective function. Since the problem is NP-hard, we proposed a heuristic solution based on a genetic algorithm (GA). Comparison of the results with exhaustive enumeration indicated that the GA produced optimal solutions in all cases in much less time.
The main purpose of this paper is to show the advantage of using a model proposed by us, which minimizes roundtrip response time, over traditional models that minimize query transmission and processing costs for the design of a distributed database with vertical fragmentation. To this end, an experiment was conducted to compare the roundtrip response time of the optimal solution obtained using our model with the roundtrip response time of the optimal solution obtained using a traditional model. The experimental results show that in most cases the optimal solution from a traditional model yields a response time that is larger than the response time of the optimal solution obtained from our model, and sometimes it can be three times as large. (C) 2013 Elsevier B.V. All rights reserved.
Considering the massive volumes of data processed nowadays and the distributed nature of many organizations, there is no doubt that distributed database systems are vital. In such systems, the response time to a transaction or a query is highly affected by the distribution design of the database system, particularly its methods for fragmentation, replication, and data allocation. According to the relevant literature, of the two approaches to fragmentation, namely horizontal and vertical fragmentation, the latter requires heuristic methods because it is NP-hard. Currently, there are a number of different methods for vertical fragmentation, which normally either introduce relatively high computational complexity or do not yield optimal results, particularly for large-scale problems. In this paper, because of their distributed and scalable nature, we apply swarm intelligence algorithms to present an algorithm that finds a solution to the vertical fragmentation problem that is optimal in most cases. In our proposed algorithm, the relations are fragmented so as not only to make transaction processing at each site as localized as possible, but also to reduce the costs of operations. Moreover, we report experimental results comparing our algorithm with several similar algorithms, showing that ours outperforms them and generates better solutions in terms of the optimality of results and computational complexity.
The design of responsive distributed database systems is a key concern for information systems managers. In high-bandwidth networks, latency and local processing are the most significant factors in query and update response time. Parallel processing can be used to minimize their effects, particularly if it is considered at design time. It is the judicious replication and placement of data within a network that enables parallelism to be used effectively. However, latency and parallel processing have largely been ignored in previous distributed database design approaches. We present a comprehensive approach to distributed database design that develops efficient combinations of data allocation and query processing strategies that take full advantage of parallelism. We use a genetic algorithm to enable the simultaneous optimization of data allocation and query processing strategies. We demonstrate that ignoring the effects of latency and parallelism at design time can result in the selection of unresponsive distributed database designs.
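The gap between a total-cost objective and a response-time objective can be seen in a small sketch. It assumes that each subquery is described by a (latency, local processing) pair in milliseconds and that subqueries at different sites run fully in parallel; both designs and all numbers are hypothetical and are not taken from the paper.

```python
def total_cost(subqueries):
    """Sum of all latency and processing, as a total-cost model would count it."""
    return sum(latency + processing for latency, processing in subqueries)


def parallel_response_time(subqueries):
    """Elapsed time when every subquery runs concurrently at its own site."""
    return max(latency + processing for latency, processing in subqueries)


# Hypothetical designs, values in milliseconds as (latency, processing).
centralized = [(50, 100)]        # one remote site does all the work
partitioned = [(50, 30)] * 4     # four sites each process part of the data

for name, design in [("centralized", centralized), ("partitioned", partitioned)]:
    print(name, total_cost(design), parallel_response_time(design))
```

Under these assumptions the total-cost objective prefers the centralized design, while the response-time objective prefers the partitioned one, which is the kind of disagreement the abstract warns about.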
This paper contributes an allocation taxonomy for analyzing DOBS models and identifying the primary characteristics of a DOBS allocation. A graphical optimization technique, based on the work by Kernighan and Lin (The Bell System Technical Journal, pp. 291-307, 1970), is the basis of our approach. The algorithm attempts to arrive at a "near optimal" distribution of fragments by exchanging and/or moving fragments between every pair of sites. The algorithms are implemented and tested with carefully generated test data to analyze their performance. The design, implementation, and analysis of the allocation algorithms form the most significant contribution of this paper. Several significant insights are derived from the analysis of the results and can be usefully applied to any real-life DOBS allocation design. The applicability of the allocation algorithms to the DOBS model of interest may be verified by a cost-benefit analysis based on their efficiency. The optimization of the initial allocation is meaningful only if the improvement obtained by optimization justifies the additional overhead. If significant improvement is not obtainable by optimization, then our efficient initial allocation scheme alone may serve the purpose.
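The idea of improving an initial allocation by moving and exchanging fragments between sites can be sketched as below. This is a simplified greedy variant and not the paper's algorithm: the original Kernighan-Lin procedure also accepts temporarily worse exchanges within a pass, which is omitted here. The cost model, function names, and data are hypothetical.

```python
def cost(alloc, access, unit_cost):
    """access[f][s] is the data volume that site s requests from fragment f."""
    return sum(vol * unit_cost[alloc[f]][s]
               for f, row in enumerate(access)
               for s, vol in enumerate(row))


def neighbours(alloc, n_sites):
    """Yield allocations reachable by one move or one pairwise exchange."""
    for f in range(len(alloc)):
        for s in range(n_sites):
            if s != alloc[f]:
                cand = list(alloc)
                cand[f] = s                      # move fragment f to site s
                yield cand
    for f in range(len(alloc)):
        for g in range(f + 1, len(alloc)):
            if alloc[f] != alloc[g]:
                cand = list(alloc)
                cand[f], cand[g] = cand[g], cand[f]   # exchange two fragments
                yield cand


def improve(alloc, access, unit_cost):
    """Apply improving moves/exchanges until none lowers the cost."""
    alloc = list(alloc)
    best = cost(alloc, access, unit_cost)
    improved = True
    while improved:
        improved = False
        for cand in neighbours(alloc, len(unit_cost)):
            cand_cost = cost(cand, access, unit_cost)
            if cand_cost < best:
                alloc, best, improved = cand, cand_cost, True
                break                            # restart from the new allocation
    return alloc, best


unit_cost = [[0, 4, 6], [4, 0, 3], [6, 3, 0]]
access = [[10, 0, 1], [2, 8, 0], [0, 0, 5]]
initial = [2, 0, 1]
final, final_cost = improve(initial, access, unit_cost)
print(cost(initial, access, unit_cost), final, final_cost)
```

Comparing the initial and final cost is one way to judge whether the improvement justifies the extra optimization overhead, the trade-off the abstract raises.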
ISBN: 3540673547 (print)
This paper presents an extension of the DFAR mathematical optimization model, which unifies the fragmentation, allocation, and dynamic migration of data in distributed database systems. The extension consists of the addition of a constraint that models the storage capacity of network sites. This aspect is particularly important in large databases, which exceed the capacity of one or more sites. The Threshold Accepting Algorithm, a variation of the heuristic method known as Simulated Annealing, is used to solve the DFAR model. The paper includes experimental results obtained for large test cases.
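A minimal sketch of Threshold Accepting with a site-capacity constraint is given below. It is not the DFAR model: the cost function, fragment sizes, capacities, and parameter values (`threshold`, `decay`, `rounds`, `steps`) are hypothetical and chosen only to show the acceptance rule, in which a candidate is accepted whenever it is worse than the current solution by less than the threshold.

```python
import random


def make_feasible(sizes, capacity):
    """Return a predicate that checks the storage capacity of every site."""
    def feasible(alloc):
        load = [0] * len(capacity)
        for frag, site in enumerate(alloc):
            load[site] += sizes[frag]
        return all(used <= cap for used, cap in zip(load, capacity))
    return feasible


def threshold_accepting(cost_fn, feasible, start, n_sites,
                        threshold=10.0, decay=0.9, rounds=30, steps=50, seed=0):
    rng = random.Random(seed)
    current, current_cost = list(start), cost_fn(start)
    best, best_cost = current[:], current_cost
    for _ in range(rounds):
        for _ in range(steps):
            cand = current[:]
            cand[rng.randrange(len(cand))] = rng.randrange(n_sites)
            if not feasible(cand):
                continue                      # reject capacity violations
            cand_cost = cost_fn(cand)
            # Deterministic rule: accept unless worse by at least the threshold.
            if cand_cost - current_cost < threshold:
                current, current_cost = cand, cand_cost
                if current_cost < best_cost:
                    best, best_cost = current[:], current_cost
        threshold *= decay                    # tighten the threshold over time
    return best, best_cost


unit_cost = [[0, 4, 6], [4, 0, 3], [6, 3, 0]]
access = [[10, 0, 1], [2, 8, 0], [0, 0, 5]]   # access[f][s]: volume site s needs


def cost_fn(alloc):
    return sum(vol * unit_cost[alloc[f]][s]
               for f, row in enumerate(access) for s, vol in enumerate(row))


feasible = make_feasible(sizes=[4, 3, 2], capacity=[5, 5, 5])
start = [2, 0, 1]
best, best_cost = threshold_accepting(cost_fn, feasible, start, n_sites=3)
print(best, best_cost)
```

Unlike Simulated Annealing, the acceptance test involves no random draw; randomness enters only through the choice of candidate moves.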
The allocation of data and operations to nodes in a computer communications network is a critical issue in distributed database design. An efficient distributed database design must trade off performance and cost among retrieval and update activities at the various nodes. It must consider the concurrency control mechanism used as well as capacity constraints at nodes and on links in the network. It must determine where data will be allocated, the degree of data replication, which copy of the data will be used for each retrieval activity, and where operations such as select, project, join, and union will be performed. We develop a comprehensive mathematical modeling approach for this problem. The approach first generates units of data (file fragments) to be allocated from a logical data model representation and a characterization of retrieval and update activities. Retrieval and update activities are then decomposed into relational operations on these fragments. Both fragments and operations on them are then allocated to nodes using a mathematical modeling approach. The mathematical model considers network communication, local processing, and data storage costs. A genetic algorithm is developed to solve this mathematical formulation.