The performance of online analytical processing (OLAP) is critical for meeting the growing requirements of massive-volume analytical applications. Typical techniques, such as in-memory processing, column storage, and join indexes, focus on high-performance storage media, efficient storage models, and reduced query processing. While they serve OLAP applications effectively, they share a vital limitation: main-memory database based OLAP (MMOLAP) cannot provide high performance for large data sets. In this paper, we propose a novel memory dimension table model in which the primary keys of the dimension table can be directly mapped to dimensional tuple addresses. To achieve higher performance of dimensional tuple access, we optimize our storage model for dimension tables based on OLAP query workload features. We present the directly dimensional tuple accessing (DDTA) based join (DDTA-JOIN), a technique that optimizes query processing on the memory dimension table through direct dimensional tuple access. We also propose an optimization of the predicate tree that shortens predicate evaluation by pruning useless predicate processing. Our experimental results show that the DDTA-JOIN algorithm outperforms both simulated row-store main-memory query processing and the open-source column-store main-memory database MonetDB, thanks to its reduced join cost and simple yet efficient query processing.
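The idea of mapping primary keys directly to tuple addresses can be illustrated with a small sketch. This is not the authors' DDTA-JOIN implementation: the tables, the column names, and the function `direct_access_join` are hypothetical, and the sketch assumes dense surrogate keys starting at 1 so that key `k` sits at array position `k - 1`.

```python
# Illustrative only: hypothetical star schema with dense surrogate keys (1, 2, 3, ...).
DATE_DIM = [
    {"year": 1997, "month": 1},   # d_key = 1
    {"year": 1997, "month": 2},   # d_key = 2
    {"year": 1998, "month": 1},   # d_key = 3
]
CUSTOMER_DIM = [
    {"region": "ASIA"},     # c_key = 1
    {"region": "EUROPE"},   # c_key = 2
]
FACT = [
    # (d_key, c_key, revenue)
    (1, 1, 100),
    (2, 2, 50),
    (3, 1, 70),
    (1, 2, 30),
    (3, 1, 25),
]


def direct_access_join(fact, date_dim, customer_dim, region):
    """Sum revenue per year for one region.

    Each foreign key is resolved by array offset (key - 1) rather than by
    probing a hash table or index, which is the assumption this sketch makes
    about how a key-to-address mapping could look.
    """
    totals = {}
    for d_key, c_key, revenue in fact:
        customer = customer_dim[c_key - 1]   # key -> position, no hash probe
        if customer["region"] != region:
            continue                         # predicate failed: skip the other lookup
        year = date_dim[d_key - 1]["year"]
        totals[year] = totals.get(year, 0) + revenue
    return totals


print(direct_access_join(FACT, DATE_DIM, CUSTOMER_DIM, "ASIA"))
```

Checking the selective predicate first means the remaining dimension lookups are skipped for fact rows that cannot qualify, which is loosely in the spirit of pruning useless predicate processing.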
Subjective logic provides a means to describe the trust relationships of the real world. However, the existing fusion operations it offers treat fused opinions equally, which makes it impossible to deal with weighted opinions effectively. A. Jøsang presents a solution to the problem that combines the discounting operator and the fusion operator to produce the consensus. In this paper, we prove that this approach is unsuitable for weighted opinions because it increases the uncertainty of the consensus. To address the problem, we propose two novel fusion operators that fuse opinions fairly according to their weights; one of their strengths is that they improve the trust expressiveness of subjective logic. Furthermore, we justify their definitions with the mapping between the evidence space and the opinion space. Comparisons between the existing operators and the ones we propose show the effectiveness of our new fusion operations.
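The claim that discounting followed by fusion raises uncertainty can be seen on a small numeric case. The sketch below does not reproduce the paper's new operators. It uses the commonly stated binomial forms of uncertainty-favouring discounting and cumulative fusion, with opinions written as (belief, disbelief, uncertainty) and the base rate omitted; the function names and the example opinions are assumptions made for illustration.

```python
from fractions import Fraction as F


def discount(trust, opinion):
    """Uncertainty-favouring discounting of `opinion` by a `trust` opinion (assumed form)."""
    tb, td, tu = trust
    b, d, u = opinion
    return (tb * b, tb * d, td + tu + tb * u)


def cumulative_fuse(x, y):
    """Cumulative fusion of two binomial opinions; both uncertainties must be nonzero."""
    bx, dx, ux = x
    by, dy, uy = y
    k = ux + uy - ux * uy
    return ((bx * uy + by * ux) / k, (dx * uy + dy * ux) / k, (ux * uy) / k)


# Invented example opinions (belief, disbelief, uncertainty); each sums to 1.
opinion_a = (F(1, 2), F(1, 4), F(1, 4))
opinion_b = (F(1, 4), F(1, 2), F(1, 4))
half_trust = (F(1, 2), F(0), F(1, 2))   # stands in for a "weight" on opinion_b

plain = cumulative_fuse(opinion_a, opinion_b)
weighted = cumulative_fuse(opinion_a, discount(half_trust, opinion_b))

print("uncertainty, plain fusion:        ", plain[2])
print("uncertainty, discount then fusion:", weighted[2])
```

In this example the discounted opinion carries more uncertainty into the fusion, so the consensus is less certain than plain fusion of the same two opinions.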
Unemployment rate prediction has become critically important because it can help governments make decisions and design policies. In recent years, forecasting the unemployment rate has attracted much attention from government...
To generate a large number of reports within a limited time window, four techniques were proposed: ROLAP&SQL, Shared Scanning, a Hadoop-based solution, and MOLAP&Cube Sharding. An algorithm that performs in-memory aggregation was designed for the second solution. The experimental results show that all techniques except ROLAP&SQL can meet the time window constraint, and that the Hadoop-based solution is a promising technique owing to its high scalability. Considering the maturity of the techniques and their performance, we put MOLAP&Cube Sharding into practice while keeping an eye on Hadoop for future adoption.
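A minimal sketch of shared scanning with in-memory aggregation follows. It assumes that "shared scanning" means feeding several report aggregations from a single pass over the fact rows; the abstract does not describe the actual algorithm, and the function `shared_scan`, the report definitions, and the sample rows are hypothetical.

```python
def shared_scan(rows, reports):
    """Aggregate several reports in one pass over `rows`.

    `reports` maps a report name to (group_by_columns, measure_column).
    Every report keeps its running sums in an in-memory dictionary, so the
    input is scanned once regardless of how many reports are requested.
    """
    results = {name: {} for name in reports}
    for row in rows:
        for name, (group_by, measure) in reports.items():
            key = tuple(row[col] for col in group_by)
            bucket = results[name]
            bucket[key] = bucket.get(key, 0) + row[measure]
    return results


# Invented sample data and report definitions.
rows = [
    {"region": "N", "product": "x", "sales": 10},
    {"region": "N", "product": "y", "sales": 5},
    {"region": "S", "product": "x", "sales": 7},
]
reports = {
    "by_region": (("region",), "sales"),
    "by_product": (("product",), "sales"),
}
print(shared_scan(rows, reports))
```

The trade-off is memory: all group-by buckets for all reports are held at once, which is presumably why the time window and data volume decide whether this approach is viable.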
ISBN: (print) 9781450305280
Trajectories representing the motion of moving objects are typically obtained via location sampling, e.g., using GPS or road-side sensors, at discrete time instants. Between consecutive samples, nothing is known about the whereabouts of a given moving object. Various models have been proposed (e.g., sheared cylinders; space-time prisms) to represent the uncertainty of moving objects both in unconstrained Euclidean space and on road networks. In this paper, we focus on representing the uncertainty of objects moving along road networks as time-dependent probability distribution functions, assuming that a maximal speed is available for each road segment. For these settings, we introduce a novel indexing mechanism, UTH (Uncertain Trajectories Hierarchy), based upon which efficient algorithms for processing spatio-temporal range queries are proposed. We also present experimental results that demonstrate the benefits of our proposed methodologies.
Visualization is a powerful technique used by science and technology intelligence analysis experts to identify groups of technical competitors. Common visualization methods tend to create graphs meeting the aesthet...
The trusted platform module (TPM) has little computational capability, which makes it the performance bottleneck of remote attestation. In the scenario where the server is the attestation-busy entity which answers attestation re...
LS2 is a logic for reasoning about the properties of trusted computing. However, it lacks the capability of modeling the isolation provided by virtualization, which is often involved in previous trusted computing systems. W...
Domain terms play a crucial role in many research areas, which has led to a rise in demand for automatic domain term extraction. In this paper, we present a two-level evaluation approach based on termhood and unit h...
In many areas, large amounts of data are modeled as graphs that are subject to uncertainty, such as molecular compounds and protein interaction networks. Many real applications, for example collaborative filtering, fraud detection, and link prediction in social networks, rely on efficiently answering k-nearest neighbor (kNN) queries, that is, computing the k nodes most "similar" to a given query node. To solve this problem, this paper proposes a novel method based on the SimRank similarity measure. However, because graphs evolve over time and are uncertain, solving the problem with existing SimRank algorithms can be very costly in practice. The paper therefore presents an optimization algorithm. By introducing a path threshold, which is applicable to both deterministic and uncertain graphs, the algorithm considers only the local neighborhood of a given query node instead of the whole graph, thereby pruning the search space. To further improve efficiency, the algorithm adopts sampling techniques on uncertain graphs. Theoretical analysis and experiments verify that the optimization algorithm is efficient and effective.
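The pruning idea can be sketched for the deterministic case. The code below is not the paper's algorithm: it assumes the path threshold can be read as a hop limit around the query node, runs the textbook iterative SimRank recurrence on that local subgraph only, and leaves out uncertain edges and sampling entirely. The names `local_nodes`, `simrank_knn`, `max_hops`, and `decay` are hypothetical.

```python
from collections import deque


def local_nodes(in_nbrs, out_nbrs, query, max_hops):
    """Nodes reachable from `query` within `max_hops` edges, ignoring direction."""
    seen = {query}
    frontier = deque([(query, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if dist == max_hops:
            continue
        for nxt in in_nbrs.get(node, []) + out_nbrs.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return seen


def simrank_knn(edges, query, k, max_hops=2, decay=0.8, iterations=5):
    """Top-k SimRank neighbours of `query`, computed on its local neighbourhood only."""
    in_nbrs, out_nbrs = {}, {}
    for src, dst in edges:
        out_nbrs.setdefault(src, []).append(dst)
        in_nbrs.setdefault(dst, []).append(src)

    nodes = local_nodes(in_nbrs, out_nbrs, query, max_hops)
    # Keep only in-neighbours that fall inside the local subgraph.
    local_in = {n: [m for m in in_nbrs.get(n, []) if m in nodes] for n in nodes}

    sim = {(a, b): 1.0 if a == b else 0.0 for a in nodes for b in nodes}
    for _ in range(iterations):
        nxt = {}
        for a in nodes:
            for b in nodes:
                if a == b:
                    nxt[(a, b)] = 1.0
                elif local_in[a] and local_in[b]:
                    total = sum(sim[(i, j)] for i in local_in[a] for j in local_in[b])
                    nxt[(a, b)] = decay * total / (len(local_in[a]) * len(local_in[b]))
                else:
                    nxt[(a, b)] = 0.0   # a node without in-neighbours has similarity 0
        sim = nxt

    ranked = sorted((n for n in nodes if n != query),
                    key=lambda n: (-sim[(query, n)], str(n)))
    return [(n, sim[(query, n)]) for n in ranked[:k]]


# Invented toy graph: "a" and "b" share the in-neighbour "u"; "v" -> "c" is far away.
edges = [("u", "a"), ("u", "b"), ("v", "c")]
print(simrank_knn(edges, "a", k=1))
```

Restricting the computation to the hop-limited subgraph makes the scores approximate, since in-neighbours outside the neighbourhood are ignored; the benefit is that the quadratic pair table covers only local nodes rather than the whole graph.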