An integrated approach to concurrency control adaptively allows classical pessimistic (two-phase locking) or optimistic (using certification) approaches. The principles for a distributed integrated method controlling both locking and optimistic transactions are defined. The implementation of these principles leads to a method for constructing the serialization order of transactions, using their conflicts. This dynamic construction prevents the systematic rejection of old (long) readers, as in the multiversion methods. On the other hand, applying Thomas' rule to control the write conflicts permits the presence of old (long) writers.
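The abstract does not spell out how Thomas' rule is applied, so the following is only a minimal sketch of the textbook Thomas write rule under basic timestamp ordering, not the paper's integrated method. The names `Item`, `write`, `read_ts`, and `write_ts` are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Item:
    value: object = None
    read_ts: int = 0   # largest timestamp of any transaction that read the item
    write_ts: int = 0  # timestamp of the transaction whose write is stored


def write(item: Item, ts: int, value) -> str:
    """Handle a write by a transaction with timestamp ts (assumed model)."""
    if ts < item.read_ts:
        # A younger transaction already read the item: the write comes too late.
        return "abort"
    if ts < item.write_ts:
        # Thomas' write rule: the write is obsolete, so it is ignored
        # and the old writer is allowed to continue.
        return "skip"
    item.value, item.write_ts = value, ts
    return "applied"
```

The "skip" branch is what lets an old (long) writer survive: its outdated write is silently dropped rather than forcing a restart.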
The persistent growth of big data applications has been raising new challenges in managing large volumes of data with high scalability, confidentiality protection, and flexible types of search queries. In this paper, we propose a secure design that disassembles a private dataset and stores it across geographically distributed servers while supporting secure multi-client Boolean queries. In this design, the data owner encrypts the private database together with its searchable index attributes. The encrypted dataset is then disassembled and distributed evenly across multiple servers by leveraging the properties of a distributed index framework. By constructing an encryption structure, generating search tokens, and enabling parallel queries, we show how the proposed design performs secure yet efficient Boolean search. These queries are not limited to those initiated by the data owner; they can also be extended to support multiple authorized clients, where each client is allowed to access only the necessary part of the private database. For this stage, we advocate a non-interactive authorization scheme in which the data owner is not required to stay online to process query requests. Moreover, the query operation can be executed in parallel, which significantly improves search efficiency. We formally characterize the leakage profile, which allows us to follow existing security analysis methods to demonstrate that our system guarantees data confidentiality and query privacy. To validate our protocol, we implement a system prototype and evaluate the efficiency of our construction. Through experimental results, we demonstrate the effectiveness of our protocol in terms of data outsourcing time and Boolean query time.
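The abstract gives no construction details, so the sketch below is only a toy illustration of the general idea of keyed search tokens over an index partitioned across servers. It is not the paper's scheme: it assumes HMAC-SHA256 tokens, stores document identifiers in the clear, and therefore leaks far more than a real searchable-encryption design would. All function names are hypothetical.

```python
import hashlib
import hmac


def token(key: bytes, keyword: str) -> bytes:
    """Derive a deterministic search token for a keyword (assumed: HMAC-SHA256)."""
    return hmac.new(key, keyword.encode("utf-8"), hashlib.sha256).digest()


def shard_of(tok: bytes, n_servers: int) -> int:
    """Pick the server holding a token's posting list from the token bytes."""
    return int.from_bytes(tok[:8], "big") % n_servers


def build_index(key: bytes, docs: dict, n_servers: int) -> list:
    """docs maps document id -> set of keywords; returns one dict per server."""
    servers = [dict() for _ in range(n_servers)]
    for doc_id, keywords in docs.items():
        for kw in keywords:
            tok = token(key, kw)
            servers[shard_of(tok, n_servers)].setdefault(tok, set()).add(doc_id)
    return servers


def conjunctive_query(servers: list, tokens: list) -> set:
    """Return ids of documents matching every token (Boolean AND).

    Each lookup touches a single server, so the lookups are independent
    and could be issued in parallel.
    """
    results = [servers[shard_of(t, len(servers))].get(t, set()) for t in tokens]
    return set.intersection(*results) if results else set()
```

In this toy model a client holding only the tokens for its permitted keywords can query without contacting the data owner, which loosely mirrors the non-interactive authorization idea described above.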
Privacy preservation in distributed databases is an active area of research. With the advancement of technology, massive amounts of data are continuously being collected and stored in distributed database applications. Temporal associations and correlations among items in the large transactional datasets of a distributed database can help in many business decision-making processes. One such task is mining frequent itemsets and computing their association rules, which is a nontrivial problem. In a typical situation, multiple parties may wish to collaborate to extract interesting global information, such as frequent associations, without revealing their respective data to each other. This may be particularly useful in applications such as retail market-basket analysis, medical research, and academia. In the proposed work, we aim to find frequent items and to develop a global association rules model based on the genetic algorithm (GA). The GA is used because of its inherent features, such as robustness with respect to local maxima/minima and its domain-independent nature as a large-space search technique for finding exact or approximate solutions to optimization and search problems. For privacy preservation of the data, the concept of a trusted third party with two offsets is used. The data are first anonymized at each local party, and the aggregation and global association are then performed by the trusted third party. The proposed algorithms address various types of partitions: horizontal, vertical, and arbitrary.
Distributed databases on local area networks present additional considerations for query optimization compared with databases on geographically distributed, point-to-point networks. This paper surveys and evaluates the state of current research on distributed query optimization for local area networks. A classification taxonomy is presented and used to analyze the proposed query-optimization algorithms. The unique features of each algorithm are highlighted and a qualitative comparison of the algorithms is given. Future research directions are discussed.
The problem of connecting together a number of different databases to produce an integrated information system has attracted a considerable amount of attention over the years and various approaches have been developed to handle this. However, the general problem of gathering related information from a number of existing heterogeneous databases is complex because of the differences in representation and meaning of data in different data sets. Many different approaches have been described to resolve this problem, and some prototype systems built. However, it is difficult to compare the effectiveness of different approaches and prototypes. This paper is aimed at addressing the specific issue of assessing the generality of different approaches. To this end it presents a framework for classifying the differences between data in different databases and a test-suite which can be used to evaluate and compare the extent to which different approaches handle different aspects of this heterogeneity. (C) 2000 Elsevier Science B.V. All rights reserved.
Clustering of distributed databases facilitates knowledge discovery through learning of new concepts that characterise common features and differences between datasets. Hence, general patterns can be learned rather than restricting learning to specific databases from which rules may not be generalisable. We cluster databases that hold aggregate count data on categorical attributes that have been classified according to homogeneous or heterogeneous classification schemes. Clustering of datasets is carried out via the probability distributions that describe their respective aggregates. The homogeneous case is straightforward. For heterogeneous data we investigate a number of clustering strategies, of which the most efficient avoid the need to compute a dynamic shared ontology to homogenise the classification schemes prior to clustering. (c) 2004 Elsevier B.V. All rights reserved.
For distributed databases, checkpointing is used to ensure an efficient way to perform global reconstruction. However, the need for global reconstruction is infrequent. Most current checkpointing approaches for distributed databases are too expensive during run time. Some of them allow the checkpointing process to run in parallel with normal transactions at the cost of more data and resource contention, which in turn causes longer response time for normal transactions. Thus, an efficient way to checkpoint distributed databases is needed to avoid degrading the system performance. This paper presents a low-cost solution, called Loosely Synchronized Local Fuzzy Checkpointing (LSLFC), to these problems. LSLFC supports global reconstruction, and our performance study shows that LSLFC has little overhead during run time.
In a one-copy distributed database, each data item is stored at exactly one site. In a replicated database, some data items may be stored at multiple sites. The main motivation is improved reliability: by storing important data at multiple sites, the DBS can operate even though some sites have failed. This paper describes an algorithm for handling replicated data, which allows users to operate on data so long as one copy is “available.” A copy is “available” when (i) its site is up, and (ii) the copy is not out-of-date because of an earlier crash. The algorithm handles clean, detectable site failures, but not Byzantine failures or network partitions.
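The two availability conditions can be illustrated with a small sketch. This is an assumption-laden toy, not the paper's algorithm: it reads from any available copy and writes to all available copies, and it leaves out transactions, locking, and failure detection entirely. The class and method names are hypothetical.

```python
class Copy:
    def __init__(self, site):
        self.site = site
        self.up = True        # condition (i): the site is running
        self.current = True   # condition (ii): the copy is not out-of-date
        self.value = None

    def available(self):
        return self.up and self.current


class ReplicatedItem:
    def __init__(self, sites):
        self.copies = [Copy(s) for s in sites]

    def read(self):
        """Read from any one available copy."""
        for c in self.copies:
            if c.available():
                return c.value
        raise RuntimeError("no available copy")

    def write(self, value):
        """Write to every available copy; unavailable copies are skipped."""
        targets = [c for c in self.copies if c.available()]
        if not targets:
            raise RuntimeError("no available copy")
        for c in targets:
            c.value = value

    def fail(self, site):
        for c in self.copies:
            if c.site == site:
                c.up = False

    def recover(self, site):
        """A recovered site is up, but its copy may have missed writes."""
        for c in self.copies:
            if c.site == site:
                c.up, c.current = True, False

    def refresh(self, site):
        """Bring a recovered copy up to date from an available copy."""
        value = self.read()
        for c in self.copies:
            if c.site == site and c.up:
                c.value, c.current = value, True
```

The point of the `current` flag is that a site coming back up is not enough: its copy stays unavailable until it has been refreshed, so readers never see a stale value.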
Intelligent routing control is defined as the process in which the network interrogates the databases containing the relationships between logical numbers, such as personal or information identifiers, and physical addresses in the transport network to find the terminal having the information required to process a user request. The routing control system presented uses distributed databases, each of which manages a switching system and all of which are connected through high-speed signalling networks separate from the transport network. If the requested physical address cannot be found in one database, search requests are distributed at the same time to all other databases. For up to 100 million subscribers, the routing control system can find a physical address within 1 s when each database uses ten memories accessed at 200 ns with an interdatabase linkage speed of 14 Mb/s.
A deadlock detection algorithm is presented by which each node in a network can decide locally whether a deadlock exists. Besides the problem of deadlock detection, we also take into account the problem of cycle detection in graphs distributed over several nodes. This problem arises in several locking protocols, such as the RAC-locking protocol, in a distributed environment. The proposed algorithm is based on the idea of sending relevant paths of the wait-for graph to every other node by broadcast messages, which require only one physical transmission. To detect deadlocks purely locally, each node collects all paths sent into a local graph. As this graph has to be identical at all nodes, a protocol has been developed that is responsible for ensuring this equality. Finally, we determine the overhead caused by this protocol within the network. We also propose a broadcast two-phase commit protocol that exploits the broadcast facility.
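The local decision step can be sketched as follows. This is only an illustration under stated assumptions: broadcast paths are modelled as plain lists of transaction identifiers, the broadcast transport and the protocol that keeps the graphs equal across nodes are omitted, and cycle detection uses an ordinary depth-first search rather than anything specific to the paper. The function names are hypothetical.

```python
def merge_paths(graph, paths):
    """Add received wait-for paths to a node's local graph.

    Each path is a list such as ["T1", "T2", "T3"], meaning T1 waits
    for T2 and T2 waits for T3.
    """
    for path in paths:
        for waiter, holder in zip(path, path[1:]):
            graph.setdefault(waiter, set()).add(holder)
            graph.setdefault(holder, set())
    return graph


def has_deadlock(graph):
    """Return True if the local wait-for graph contains a cycle."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            if colour[nxt] == GREY:
                return True  # back edge: a cycle of waiting transactions
            if colour[nxt] == WHITE and visit(nxt):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)
```

Because every node merges the same broadcast paths, each one can run `has_deadlock` on its own copy of the graph and reach the same answer without further communication.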