This paper proposes a novel query optimization method based on Hadoop-HANA hybrid data management *** method to establish the hybrid data management system based on distributed database middleware is provided at *** the Hadoop-HANA index and the hot-data methods are introduced to speed up query execution on the proposed system, which makes the processing of massive heterogeneous data more efficient.
Computer-based network teaching is a new teaching mode put forward in recent *** design and organization of the multimedia database is the foundation of the whole network construction and an important standard for network teaching quality *** network transmission bandwidth is limited, which places higher demands on database redundancy, query computation methods, and structure *** paper gives specific instructions for the system structure and construction method of a multimedia database applied to network teaching of university sports elective courses.
Query processing over uncertain data has gained growing attention because many real-life applications must deal with uncertain data. In this paper, we investigate skyline queries over uncertain data in distributed environments (the DSUD query), research on which is still at an early stage. The state-of-the-art algorithm for this query, called e-DSUD, has the desirable characteristics of progressiveness and minimum bandwidth consumption. However, it still needs to be improved in three respects. (1) Progressiveness: it returns at most one query result at a time. (2) Efficiency: a significant amount of redundant I/O cost and numerous iterations cause a long total query time. (3) Universality: it is restricted to the case where local skyline tuples are incomparable. To address these concerns, we first present a detailed analysis of the e-DSUD algorithm and then develop an improved framework for the DSUD query, namely IDSUD. Based on the new framework, we propose an adaptive algorithm, called ADSUD, for the DSUD query. In this algorithm, we redefine the approximate global skyline probability and adaptively choose local representative tuples according to the minimum probabilistic bounding rectangle. Furthermore, we design a progressive pruning method and apply a reuse mechanism to improve efficiency. The results of extensive experiments verify that our algorithm achieves better overall performance than the e-DSUD algorithm.
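As a rough illustration of the skyline probability that queries of this kind build on, the sketch below assumes the common tuple-level uncertainty model: each tuple exists independently with a given probability, and smaller attribute values are preferred. The model, the function names, and the data layout are assumptions made for illustration; the abstract does not specify them, and this is not the ADSUD algorithm itself.

```python
from math import prod

def dominates(a, b):
    """True if a is no worse than b in every dimension and strictly
    better in at least one (assumption: smaller values are better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_probability(target, tuples):
    """Probability that `target` exists and no tuple dominating it exists.

    target: (point, existence_probability)
    tuples: iterable of (point, existence_probability), assumed independent
    """
    point, p = target
    return p * prod(1.0 - q for other, q in tuples if dominates(other, point))

# Hypothetical data: three uncertain 2-D tuples.
data = [((1, 1), 0.5), ((2, 2), 0.8), ((0, 3), 0.9)]
print(skyline_probability(data[1], data))
```

In a distributed setting each site could compute this value locally, and a global value would have to account for dominating tuples held by other sites, which is where the communication cost arises.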
ISBN: (print) 9781509028467
For several years, there has been increasing interest in new services based on the analysis of data from online social networks. Such services can, for example, provide the e-reputation of a product or a company, or detect new trends in a commercial, social, or political context. The huge quantity of data is an opportunity in terms of representativeness but is also difficult to manage. On Twitter, for example, the huge stream of data is, most of the time, incompatible with flexible analysis unless substantial computing resources are available. The only practical solution is often to observe, in a static way, a limited portion of a phenomenon in a limited time slot. This paper studies the conditions necessary to strike a balance between the complexity of the computer architecture and the flexibility of the analysis.
This essay studies data distribution methods in the distributed *** the current methods of data distribution in distributed databases, problems such as limitations, complex cost formulas, and low algorithm running efficiency cause the calculated results to differ considerably from the optimal distribution ***, this essay proposes a method that applies a genetic algorithm to data distribution in the distributed *** also makes several improvements to the genetic algorithm: improved initialization of the population, a comprehensive mechanism combining fitness ratio and optimal-value reservation, and the use of a self-regulating crossover factor and a self-regulating mutation *** improvements further increase the accuracy of the data distribution and the calculation *** simulation experiment in this distribution method, the result has shown that the data distribution obtained in this essay is the closest to the best *** the performance is better than that of commonly used data distribution methods, such as those based on data segment access features.
ISBN: (print) 9781467388450
Protocols have been proposed to achieve causal consistency on top of a distributed data store that does not make safety guarantees. Such a protocol works with an unmodified data store if it is implemented as middleware or a shim layer, although it can also be implemented inside a data store. However, the middleware approach has required modifications to applications, which have to specify explicitly the data dependencies to be managed. In contrast, our Letting-It-Be protocol handles all the implicit dependencies that naturally result from data accesses, even though it is implemented as middleware. Our protocol does not require any modifications to either data stores or applications. It trades some performance for this merit: throughput declined relative to a bare data store by 21% in the best case and 78% in the worst case.
ISBN: (print) 9783319198576; 9783319198569
With the growth of data and their geographical distribution, Distributed Database Management Systems (DDBMS) have become a necessity for Information Systems (IS) users. Unfortunately, query optimization remains a weakness of existing DDBMS, given the high cost of the network traffic caused by accessing data distributed across geographically separate sites. To remedy this problem, we propose a new, effective approach to querying a distributed database (DDB), based on identifying the sites relevant to a query from knowledge of the fragmentation and/or duplication of the distributed data. This approach minimizes the volume of data transferred over the network and consequently reduces the query execution cost. The approach has been validated by implementing an "effective-query" layer on the Oracle DDBMS.
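The idea of restricting a query to its relevant sites can be sketched as follows. The abstract does not say how relevance is determined, so this example assumes horizontal fragmentation by inclusive value ranges on a single attribute and a flat catalog of fragments; the catalog layout and the function name `relevant_sites` are hypothetical.

```python
def relevant_sites(catalog, relation, lo, hi):
    """Return the sites to contact for a range query [lo, hi] on `relation`.

    catalog: list of (site, relation, frag_lo, frag_hi) entries, where each
    entry is a horizontal fragment holding the inclusive key range
    [frag_lo, frag_hi]. A fragment replicated on several sites appears once
    per site; only the first replica listed is contacted.
    """
    chosen = {}
    for site, rel, frag_lo, frag_hi in catalog:
        overlaps = frag_lo <= hi and lo <= frag_hi
        if rel == relation and overlaps:
            # Keep a single site per fragment so duplicates are not queried.
            chosen.setdefault((frag_lo, frag_hi), site)
    return sorted(set(chosen.values()))

# Hypothetical catalog: "orders" is split into two fragments, the second
# of which is duplicated on sites B and C.
catalog = [
    ("A", "orders", 0, 99),
    ("B", "orders", 100, 199),
    ("C", "orders", 100, 199),
    ("D", "customers", 0, 99),
]
print(relevant_sites(catalog, "orders", 50, 120))
```

Choosing the first replica is the simplest possible policy; a cost-based choice (for example, by network distance) would fit in the same place.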
ISBN: (print) 9788132227557; 9788132227533
We propose an advanced architecture that senses the real-time temperature of a particular location and transmits the data to a cloud database. Current data are analysed against previously recorded data. If any abnormal data are observed, the system sends an alert message to the concerned authorities. The analytical data guide users in solving real-time problems by revealing anomalies in the system.
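One simple way to compare a current reading with previously recorded data is a z-score test, sketched below. The abstract does not state which analysis the system uses, so the z-score rule, the threshold of 3, and the function name `is_abnormal` are assumptions for illustration only.

```python
from statistics import mean, stdev

def is_abnormal(history, reading, threshold=3.0):
    """Flag `reading` if it lies more than `threshold` sample standard
    deviations from the mean of previously recorded readings.

    history must contain at least two readings (required by stdev).
    """
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # All past readings are identical: any different value is abnormal.
        return reading != mu
    return abs(reading - mu) / sigma > threshold

# Hypothetical temperature history in degrees Celsius.
history = [20.0, 21.0, 19.5, 20.5, 20.0]
for value in (20.4, 35.0):
    if is_abnormal(history, value):
        print(f"ALERT: abnormal temperature {value}")
```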
ISBN: (print) 9781908320735
This research investigates the problem of providing partition tolerance in cloud-based applications while maintaining application data integrity. As a motivating example, the study develops a cloud application that tracks sales of admission tickets to a battlefield. Web browsers run inside the premises of the enterprise selling the tickets, while the rest of the application architecture is hosted in the cloud. The internet connection represents a single point of failure for the application. Humanities attractions such as battlefields are often located in rural areas, where internet redundancy is expensive at best. The need for a cloud application to tolerate network partitions should therefore be considered in the modeling phase of a cloud software project.
ISBN: (print) 9781479985531
Data mining is the automated extraction of interesting patterns, used to represent knowledge, from large data sets, but sometimes these data sets are divided among various parties. Association rule mining is a popular mining technique that identifies interesting correlations between database attributes. This paper proposes a protocol, Privacy Preserving Fast Distributed Mining (PPFDM), for association rule mining in horizontally distributed databases, based on the Fast Distributed Mining (FDM) algorithm. FDM is an unsecured distributed version of the Apriori algorithm designed to generate a small number of candidate sets and to considerably reduce the number of messages passed when mining association rules. PPFDM adopts two major ideas: one computes the union of the private subsets held by each of the interacting players, and the other evaluates the inclusion of an element held by one player in a subset held by another. An implementation of a PPDM algorithm is developed in Java, performance results are presented for synthetic data generation and association rules, and indexing is provided to the user. The protocol is simpler and significantly more efficient in terms of communication rounds, communication cost, and computational cost.
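For horizontally distributed data, the support of an itemset over the whole database is the sum of its local support counts, which is the quantity that FDM-style protocols aggregate. The sketch below shows only this unsecured aggregation, with no privacy protection; the function names and the sample transactions are hypothetical, and the secure union and inclusion steps of PPFDM are not modeled.

```python
def local_count(transactions, itemset):
    """Number of local transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def globally_frequent(sites, itemset, min_support):
    """True if `itemset` meets `min_support` (a fraction between 0 and 1)
    over the union of all sites' transactions.

    sites: list of per-site transaction lists; each transaction is a set.
    """
    total = sum(len(transactions) for transactions in sites)
    count = sum(local_count(transactions, itemset) for transactions in sites)
    return count >= min_support * total

# Hypothetical data: two sites holding different rows of the same table.
sites = [
    [{"a", "b"}, {"a"}, {"b", "c"}],
    [{"a", "b"}, {"a", "b", "c"}],
]
print(globally_frequent(sites, {"a", "b"}, 0.5))
print(globally_frequent(sites, {"c"}, 0.5))
```

In this unsecured form each site reveals its local counts; a privacy-preserving protocol replaces the plain sums with secure computations so that these counts stay hidden.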