ISBN: 9783642178566 (print)
Privacy preservation in distributed progressive databases has been an active area of research in recent years. In a typical scenario, multiple parties wish to collaborate to extract interesting global information, such as class labels, without revealing their respective data to each other. This is particularly useful in applications such as customer retention and medical research. In the proposed work, we aim to develop a global classification model based on the Naive Bayes classification scheme. Naive Bayes classification is used because of its simplicity and high efficiency. To preserve the privacy of the data, the concept of a trusted third party with two offsets is used. The data is first anonymized at the local party's end, and the aggregation and global classification are then done at the trusted third party. The proposed algorithms address various fragmentation schemes, namely horizontal, vertical and arbitrary distribution. The car-evaluation dataset is used to test the effectiveness of the proposed algorithms.
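The abstract does not spell out the two-offset protocol, so the following is only a minimal sketch of the general idea for horizontally partitioned data: each party masks its local Naive Bayes counts before a third party sums them. The masking used here (per-key random offsets that sum to zero across parties) is a swapped-in, simpler technique and an assumption, not the authors' scheme. All function names, the toy rows and the Laplace smoothing are hypothetical.

```python
import math
import random
from collections import Counter

def local_counts(rows):
    """Count class labels and (attribute index, value, label) triples for one party."""
    counts = Counter()
    for *features, label in rows:
        counts[("class", label)] += 1
        for i, value in enumerate(features):
            counts[("attr", i, value, label)] += 1
    return counts

def zero_sum_offsets(keys, n_parties, rng):
    """Assumed masking step: one offset per party and key, summing to zero per key."""
    offsets = [{} for _ in range(n_parties)]
    for key in keys:
        vals = [rng.randint(-1000, 1000) for _ in range(n_parties - 1)]
        vals.append(-sum(vals))
        for party, v in enumerate(vals):
            offsets[party][key] = v
    return offsets

def aggregate(masked_parties):
    """Third party: sum the masked counts; the offsets cancel in the total."""
    total = {}
    for masked in masked_parties:
        for key, v in masked.items():
            total[key] = total.get(key, 0) + v
    return total

def predict(total, features, labels, n_values):
    """Naive Bayes prediction from global counts, with Laplace smoothing."""
    n = sum(total.get(("class", c), 0) for c in labels)
    best_label, best_score = None, float("-inf")
    for c in labels:
        class_count = total.get(("class", c), 0)
        score = math.log((class_count + 1) / (n + len(labels)))
        for i, value in enumerate(features):
            hits = total.get(("attr", i, value, c), 0)
            score += math.log((hits + 1) / (class_count + n_values[i]))
        if score > best_score:
            best_label, best_score = c, score
    return best_label

# Toy horizontal fragments: (price, safety, class label).
party_a = [("low", "high", "acc"), ("low", "low", "unacc")]
party_b = [("high", "low", "unacc"), ("low", "high", "acc")]
local = [local_counts(party_a), local_counts(party_b)]
keys = sorted(set(local[0]) | set(local[1]))
offsets = zero_sum_offsets(keys, 2, random.Random(42))
masked = [{k: local[p][k] + offsets[p][k] for k in keys} for p in range(2)]
total = aggregate(masked)
prediction = predict(total, ("low", "high"), ["acc", "unacc"], n_values=[2, 2])
```

In this sketch the third party only ever sees masked counts, yet the totals match what pooling the raw rows would give. How the parties agree on offsets without leaking them is outside the scope of the example.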
In recent years, P2P architecture has become established as a major basis for distributed data sharing systems, and has been widely used for file sharing (e.g. music, movies, etc.). Over the last few years, the database community has begun to exploit P2P paradigms for database applications, and a few prototypes of P2P database management systems have been proposed. In this paper, we provide an overview of the challenges in building P2P database management systems by presenting existing research related to P2P database management.
Cloud storage services are attracting increasing attention because they promise elastic capacity and high reliability at low cost. With such a service, users can store most of their files in an authenticated cloud storage service center without worrying about space being inadequate or wasted, because dynamically adjustable storage is the most important feature of cloud storage. In this paper, we present a solution for building a cloud storage service system on top of an open-source distributed database. It follows a layered design comprising a Web service front end, a transformation processing layer and a data storage layer. End users can access their own data in this system through three Web service interfaces. Moreover, a complete prototype system based on this architecture is demonstrated. (C) 2011 Published by Elsevier Ltd. Selection and/or peer-review under responsibility of Conference ESIAT2011 Organization Committee.
Distributed databases use client/server architecture to process information requests. Query optimization is a difficult enough task in a distributed environment [1]. In order to optimize queries accurately, sufficient...
ISBN: 9783642157660 (digital)
ISBN: 9783642157653 (print)
Privacy preservation in distributed databases has been an important area of research in recent years. In a typical scenario, multiple parties wish to collaborate to extract interesting global information, such as class labels, without revealing their respective data to each other. This is particularly useful in applications such as car-selling units and medical research. In the proposed work, we aim to develop a global classification model based on the Naive Bayes classification scheme. Naive Bayes classification is used because of its simplicity and high efficiency. To preserve the privacy of the data, the concept of a trusted third party with two offsets is used. The data is first anonymized at the local party's end, and the aggregation and global classification are then done at the trusted third party. The proposed algorithms address various fragmentation schemes, namely horizontal, vertical and arbitrary distribution.
ISBN: 9781920682958 (print)
Highly distributed data management platforms (e.g., PNUTS, Dynamo, Cassandra, and BigTable) are rapidly becoming the favorite choice for hosting modern web applications in the cloud. Among other features, these platforms rely on data partitioning, replication and relaxed consistency to achieve high levels of performance and scalability. However, these design choices often entail a trade-off between performance and data freshness. In this paper, in addition to performance SLAs, we treat an application's tolerance to data staleness as another requirement that determines end-user satisfaction. Our goal is to strike a fine balance between the quality of service (QoS) and the quality of data (QoD) perceived by the end user. To that end, we propose scheduling policies and mechanisms for efficiently allocating the resources at each replica node so as to meet the conflicting requirements of user queries and replica updates. Our experimental results show that employing our scheduling strategies for resource allocation can provide significant improvements in overall system utility compared with existing ones.
MassBank is the first public repository of mass spectra of small chemical compounds for life sciences (<3000 Da). The database contains 605 electron-ionization mass spectrometry (EI-MS), 137 fast atom bombardment MS and 9276 electrospray ionization (ESI)-MSn data of 2337 authentic compounds of metabolites, 11 545 EI-MS and 834 other-MS data of 10 286 volatile natural and synthetic compounds, and 3045 ESI-MS2 data of 679 synthetic drugs contributed by 16 research groups (January 2010). ESI-MS2 data were analyzed under nonstandardized, independent experimental conditions. MassBank is a distributed database. Each research group provides data from its own MassBank data servers distributed on the Internet. MassBank users can access either all of the MassBank data or a subset of the data by specifying one or more experimental conditions. In a spectral search to retrieve mass spectra similar to a query mass spectrum, the similarity score is calculated by a weighted cosine correlation in which the weighting exponents on peak intensity and mass-to-charge ratio are optimized for the ESI-MS2 data. MassBank also provides a merged spectrum for each compound, prepared by merging the analyzed ESI-MS2 data on an identical compound under different collision-induced dissociation conditions. Data merging has significantly improved the precision of the identification of a chemical compound by 21-23% at a similarity score of 0.6. Thus, MassBank is useful for the identification of chemical compounds and the publication of experimental data. Copyright (C) 2010 John Wiley & Sons, Ltd.
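The weighted cosine correlation mentioned above can be illustrated with a short sketch. Each peak is weighted by its intensity and mass-to-charge ratio raised to separate exponents, and the cosine of the two weighted peak vectors is taken. The function names, the default exponents (0.5 and 2.0) and the exact-match peak alignment are assumptions for illustration; they are not taken from the abstract, and real spectral search would match peaks within an m/z tolerance.

```python
import math

def peak_weights(spectrum, intensity_exp, mz_exp):
    """Map each m/z to intensity**intensity_exp * mz**mz_exp."""
    return {mz: (inten ** intensity_exp) * (mz ** mz_exp)
            for mz, inten in spectrum.items()}

def weighted_cosine(spec_a, spec_b, intensity_exp=0.5, mz_exp=2.0):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts.

    Peaks are matched only when their m/z keys are equal (a simplification).
    """
    wa = peak_weights(spec_a, intensity_exp, mz_exp)
    wb = peak_weights(spec_b, intensity_exp, mz_exp)
    dot = sum(w * wb[mz] for mz, w in wa.items() if mz in wb)
    norm_a = math.sqrt(sum(w * w for w in wa.values()))
    norm_b = math.sqrt(sum(w * w for w in wb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Hypothetical nominal-mass spectra.
query = {57: 100.0, 85: 40.0, 113: 10.0}
library_hit = {57: 90.0, 85: 45.0, 113: 5.0}
unrelated = {60: 100.0, 73: 80.0}
score_hit = weighted_cosine(query, library_hit)
score_unrelated = weighted_cosine(query, unrelated)
```

With these inputs, spectra sharing no peaks score 0.0 and a spectrum compared with itself scores 1.0, so a threshold such as the 0.6 quoted in the abstract would sit between the two extremes.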
ISBN: 9783642163357 (print)
Data distribution has a direct impact on the overall distributed database application system, on data availability, and on the efficiency and reliability of the distributed database. To better solve the data distribution problem, this paper presents a strategy based on a genetic algorithm that adopts an adaptive mutation operator to maintain the balance between population diversity and the search randomness of the algorithm. The paper improves the genetic algorithm and shows by experiment that the strategy comes close to the optimal solution.
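The abstract gives no details of the adaptive mutation operator, so the following is a minimal sketch under stated assumptions: the mutation rate rises as population diversity falls, and the problem is reduced to a toy fragment-to-site allocation with an independent cost per placement. The diversity measure, the linear rate rule, the cost matrix and all names are hypothetical and not the paper's formulation.

```python
import random

def diversity(population):
    """Fraction of distinct chromosomes (1.0 means all individuals differ)."""
    return len({tuple(c) for c in population}) / len(population)

def adaptive_mutation_rate(div, low=0.01, high=0.3):
    """Assumed rule: low diversity gives a rate near `high`, high diversity near `low`."""
    return high - (high - low) * div

def allocation_cost(chromosome, cost):
    """cost[f][s] is the cost of placing fragment f at site s."""
    return sum(cost[f][s] for f, s in enumerate(chromosome))

def evolve(cost, n_sites, pop_size=20, generations=50, seed=0):
    """Elitist GA: keep the better half, refill with mutated one-point crossovers."""
    rng = random.Random(seed)
    n_frag = len(cost)
    pop = [[rng.randrange(n_sites) for _ in range(n_frag)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda c: allocation_cost(c, cost))
        rate = adaptive_mutation_rate(diversity(pop))
        elite = pop[: pop_size // 2]
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            cut = rng.randrange(1, n_frag)
            child = a[:cut] + b[cut:]
            # Each gene is reassigned to a random site with probability `rate`.
            child = [rng.randrange(n_sites) if rng.random() < rate else g
                     for g in child]
            children.append(child)
        pop = elite + children
    return min(pop, key=lambda c: allocation_cost(c, cost))

# Hypothetical costs for 4 fragments over 3 sites.
cost = [[4, 2, 9], [7, 3, 1], [5, 8, 2], [1, 6, 4]]
best = evolve(cost, n_sites=3)
best_cost = allocation_cost(best, cost)
```

Because the better half of the population is always carried over, the best cost never gets worse from one generation to the next. The adaptive rate only controls how aggressively the new children explore.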