Query optimization is considered the most significant part of a distributed database model. The optimizer tries to find an optimal join order, which reduces the query execution cost. Several factors may affect the cost of query execution, including the number of relations, communication costs, resources, and access to large distributed data sets. The success of a processed query depends heavily on the search methodology implemented by the query optimizer. Query processing is considered an NP-hard problem, and many researchers are focusing on it. Researchers are trying to find an appropriate algorithm that approaches an ideal solution, especially as the size of the database increases. For large queries, classical heuristic methods such as ant colony and genetic algorithms cannot cover the whole search space and may become trapped in a local minimum. In this paper, a quantum-inspired ant colony algorithm (QIACO), a hybrid strategy of probabilistic algorithms, is used to improve the query join cost in the distributed database model. The ability of quantum computing to diversify helps cover the large query search space, which helps in selecting the best trail and thus improves the slow convergence speed and avoids falling into a local optimum. Using this strategy, the algorithm aims to find an optimal join order that minimizes the total execution time. Experimental results show that the proposed model converges faster and with better solution quality than the classic ant colony model for the same number of ants.
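To make the join-order search space concrete, the following is a minimal sketch of a toy cost model with an exhaustive baseline. It is not the QIACO algorithm from the abstract. The relation names, cardinalities, selectivities, and the functions `join_cost` and `best_order` are all hypothetical, and cost is assumed to be the sum of intermediate result sizes of a left-deep plan.

```python
from itertools import permutations

# Hypothetical relation cardinalities and pairwise join selectivities.
CARD = {"A": 1000, "B": 100, "C": 10}
SEL = {
    frozenset("AB"): 0.01,
    frozenset("BC"): 0.1,
    frozenset("AC"): 1.0,  # no join predicate: behaves like a cross product
}

def join_cost(order):
    """Sum of intermediate result sizes for a left-deep join order."""
    joined = [order[0]]
    size = CARD[order[0]]
    cost = 0.0
    for rel in order[1:]:
        size *= CARD[rel]
        # Apply the selectivity of every predicate linking rel to the prefix.
        for prev in joined:
            size *= SEL[frozenset((prev, rel))]
        joined.append(rel)
        cost += size
    return cost

def best_order(relations):
    """Exhaustive baseline: evaluates all n! orders, so only viable for small n."""
    return min(permutations(relations), key=join_cost)
```

Because the baseline enumerates every permutation, its running time grows factorially with the number of relations, which is the reason heuristic and metaheuristic searches such as ant colony methods are used for large queries.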
ISBN: 9798331505356; 9798331505349 (print)
To realize a data-driven society, it is essential to develop a distributed database system capable of managing data with varying levels of trust in a unified manner. In such a system, a robust data verification mechanism is crucial for ensuring the security of all data. However, there are concerns about performance degradation caused by the increased volume of data being verified at each node. To address this issue, we propose limiting the scope of data verification to metadata. In this paper, we implement a metadata management method that ensures reliability. This is achieved by integrating a series of processes related to metadata generation and updating as built-in commands within the Hyperledger Iroha blockchain platform. This integration also enables the implementation of a metadata verification function in a distributed database system. Additionally, using this metadata, we propose methods for both naive and simplified data verification.
This research aims to investigate the challenge of recognizing and evaluating distributed database management system (DDBMS) security and privacy challenges and also to describe important danger indicators that may le...
The query optimiser is a vital part of any distributed database system. Reducing the execution time of a query depends on finding an ideal query execution plan. Because this problem is NP-hard, a hybrid of harmony search and the artificial bee colony algorithm can be useful. Harmonies represent query plans, encoded as S-dimensional real vectors. The harmony memory is where an initial population of harmony vectors is created and stored. Bees then explore the harmony memory as a food source. Producing a new candidate harmony from the query plans in the harmony memory requires a pitch adjustment rule, a memory consideration rule, and random re-initialisation. Finally, the new candidate vector replaces the worst harmony vector if it performs better. The simulation results indicate that the proposed method reduces the cost of evaluating a query compared with the harmony search and bee colony optimisation algorithms. However, the method has a longer execution time.
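The harmony search step named in the abstract (memory consideration, pitch adjustment, random re-initialisation, then replacing the worst vector) can be sketched as follows. This is a generic textbook-style sketch and not the authors' hybrid: the bee colony phase and the mapping from real vectors to query plans are omitted, and the function names, the parameters `hmcr`, `par`, and `bandwidth`, and the `cost` callback are assumptions introduced for illustration.

```python
import random

def improvise(memory, hmcr, par, bandwidth, low, high, rng):
    """Build one candidate vector from the harmony memory.

    hmcr: probability of memory consideration (assumed parameter name)
    par:  probability of pitch adjustment after memory consideration
    """
    dims = len(memory[0])
    candidate = []
    for d in range(dims):
        if rng.random() < hmcr:
            # Memory consideration: reuse a stored value for this dimension.
            value = rng.choice(memory)[d]
            if rng.random() < par:
                # Pitch adjustment: nudge the reused value slightly.
                value += rng.uniform(-bandwidth, bandwidth)
        else:
            # Random re-initialisation within the allowed range.
            value = rng.uniform(low, high)
        candidate.append(min(high, max(low, value)))
    return candidate

def update_memory(memory, candidate, cost):
    """Replace the worst stored vector if the candidate has a lower cost."""
    worst = max(range(len(memory)), key=lambda i: cost(memory[i]))
    if cost(candidate) < cost(memory[worst]):
        memory[worst] = candidate
        return True
    return False
```

In a query optimiser, `cost` would decode a vector into a query plan and estimate its execution cost; here it is left as a caller-supplied function.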
ISBN: 9781728194417 (print)
A dynamic map system is one of the key elements of an autonomous driving (AD) car. The semi-dynamic and semi-static data of the Dynamic Map are collected and used locally. Their volume is some megabytes, and their lifetime ranges from under a second to several minutes. A concept for such a localized data transfer system is proposed. The effect of vehicle-to-vehicle (V2V) communication using wide-range communication technology intended for vehicle-to-network (V2N) communication is estimated, as is the effect of offloading cloud computing to edge computing and vehicle computing. Requirements for the database system needed to realize this data transfer system are proposed.
ISBN: 9780769547923 (print)
Checkpointing algorithms are widely used in distributed database systems (DDBS) to recover from faults. In traditional algorithms, normal DDBS activities, i.e. message exchange between distributed sites, must be blocked during checkpointing, so DDBS service is interrupted and availability drops. This paper presents a new algorithm that does not affect normal DDBS processing during checkpointing and hence improves DDBS availability.
ISBN: 9781450389228 (print)
In this paper we address the topic of identification of cohorts of similar patients in a database of electronic health records. We follow the conjecture that retrieval of similar patients can be supported by an underlying distributed database design. Hence we propose a fragmentation based on partitioning the health records and present a benchmark of two implementation variants in comparison to an off-the-shelf data distribution approach provided by Apache Ignite. While our main use case in this paper is cohort identification, our approach has advantages for taxonomy-based query answering in other (non-medical) domains.
Machine learning (ML) is ubiquitous, and has powered the recent success of artificial intelligence. However, the state of affairs with respect to distributed ML is far from ideal. TensorFlow and PyTorch simply crash when an operation’s inputs and outputs cannot fit on a GPU for model parallelism, or when a model cannot fit on a single machine for data parallelism. TensorFlow code that works reasonably well on a single machine with eight GPUs procured from a cloud provider often runs slower on two machines totaling sixteen GPUs. In this thesis, I propose solutions at both the algorithm and system levels in order to scale out distributed ML. At the algorithm level, I propose a new method for distributed neural network learning, called independent subnet training (IST). In IST, per iteration, a neural network is decomposed into a set of subnetworks of the same depth as the original network, each of which is trained locally, before the various subnets are exchanged and the process is repeated. IST has many advantages, including reduced communication volume and frequency, implicit extension to model parallelism, and a lower memory requirement at each compute site. At the system level, I believe that proper computational and implementation abstractions will allow for the construction of self-configuring, declarative ML systems, especially when the goal is to execute tensor operations for ML in a distributed environment, or partitioned across multiple AI accelerators (ASICs). To this end, I first introduce a tensor relational algebra (TRA), which is expressive enough to encode any tensor operation that can be written in the Einstein notation, and then consider how TRA expressions can be rewritten into an implementation algebra (IA) that enables effective implementation in a distributed environment, as well as how expressions in the IA can be optimized. The empirical study shows that the optimized implementation provided by IA can reach or even out-perform carefully engineer
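The decomposition step of IST described above can be illustrated with a minimal sketch. The abstract does not say how the network is split, so this assumes one plausible reading: the hidden units of a one-hidden-layer network are partitioned into disjoint groups, one per worker, and each worker receives only the weights touching its units. The function names and the nested-list weight layout are hypothetical, and local training itself is omitted.

```python
import random

def partition_hidden_units(n_hidden, n_workers, rng):
    """Randomly split hidden-unit indices into disjoint groups, one per worker."""
    indices = list(range(n_hidden))
    rng.shuffle(indices)
    return [sorted(indices[w::n_workers]) for w in range(n_workers)]

def extract_subnet(w_in, w_out, units):
    """Slice a one-hidden-layer network down to the given hidden units.

    w_in[i][h]  : weight from input i to hidden unit h
    w_out[h][o] : weight from hidden unit h to output o
    The subnet keeps the same depth; only its hidden layer is narrower.
    """
    sub_in = [[row[h] for h in units] for row in w_in]
    sub_out = [list(w_out[h]) for h in units]
    return sub_in, sub_out

def merge_subnet(w_in, w_out, units, sub_in, sub_out):
    """Write a locally trained subnet back into the full weight matrices."""
    for j, h in enumerate(units):
        for i in range(len(w_in)):
            w_in[i][h] = sub_in[i][j]
        w_out[h] = list(sub_out[j])
```

Under this assumption, one iteration would call `partition_hidden_units`, send each worker the result of `extract_subnet`, train locally, and then call `merge_subnet` for each group before repartitioning. Since the groups are disjoint, the merges do not conflict.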