To provide timely results for ‘Big Data Analytics’, it is crucial to satisfy deadline requirements for MapReduce jobs in production environments. In this paper, we propose a deadline-oriented task scheduling approac...
While researchers have concentrated on the optimization of joint redundancy and maintenance mechanism, maintenance in computing systems is quite different from that in traditional systems. Considering a routine monito...
Fingerprints have been widely used in a variety of biometric identification systems. However, with the rapid development of fingerprint identification systems, the amount of fingerprint information stored in systems ha...
Distributed storage systems can provide large-scale data storage and high data reliability by redundant schemes, such as replica and erasure codes. Redundant data may get lost due to frequent node failures in the syst...
ISBN (digital): 9783319111940
ISBN (print): 9783319111940; 9783319111933
The Kirchhoff pre-stack depth migration (KPSDM) algorithm, one of the most widely used migration algorithms, plays an important part in obtaining an accurate image of the earth. However, its high computational cost makes it time-consuming, which reduces working efficiency in the oil industry. General-purpose graphics processing units (GPUs) and the Compute Unified Device Architecture (CUDA) developed by NVIDIA offer a new solution to this problem. In this study, we propose a parallel KPSDM algorithm and an optimization strategy based on CUDA. Our experiments indicate that, for large data computations, the accelerated algorithm achieves a speedup of 8 to 15 times compared with NVIDIA GPU.
ISBN (print): 9783319111971; 9783319111964
Hybrid parallel file systems (PFSs), which consist of both HDD and SSD servers, provide a promising solution for data-intensive applications. In this study, we propose a performance-aware data placement (PADP) strategy to enable efficient data layout in hybrid PFSs. The basic idea of PADP is to distribute data across file servers using adaptive, varied-size file stripes based on each server's storage performance. Using a data access cost model and a linear programming optimization method, PADP determines the appropriate stripe size for each file server. We have implemented PADP within OrangeFS, a parallel file system widely used in the HPC domain. Experimental results on a representative benchmark show that PADP can significantly improve the I/O performance of hybrid PFSs.
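The idea of varied-size stripes can be illustrated with a small sketch. This is not the PADP cost model or its linear program, neither of which is given in the abstract. It is a simplified stand-in that sizes each server's stripe in proportion to an assumed bandwidth, so that all servers finish one striping round at about the same time. The function names, the bandwidth figures, and the 4 KB alignment unit are hypothetical.

```python
def proportional_stripe_sizes(bandwidths, round_kb, unit_kb=4):
    """Split round_kb of data into per-server stripes proportional to bandwidth.

    bandwidths: positive integers, one per file server (arbitrary units).
    round_kb:   amount of data written in one striping round, in KB.
    unit_kb:    stripes are rounded down to a multiple of this (assumed) unit.
    """
    if not bandwidths or any(b <= 0 for b in bandwidths):
        raise ValueError("bandwidths must be a non-empty list of positive values")
    total = sum(bandwidths)
    # Integer share per server, rounded down to a multiple of unit_kb.
    return [(round_kb * b // total) // unit_kb * unit_kb for b in bandwidths]


def transfer_times(stripes_kb, bandwidths):
    """Time each server needs for its stripe (stripe size / bandwidth)."""
    return [s / b for s, b in zip(stripes_kb, bandwidths)]


if __name__ == "__main__":
    # Two slow (HDD-like) and two fast (SSD-like) servers; values are made up.
    bw = [100, 100, 400, 400]
    varied = proportional_stripe_sizes(bw, round_kb=1000)
    equal = [250, 250, 250, 250]
    print(varied, max(transfer_times(varied, bw)))
    print(equal, max(transfer_times(equal, bw)))
```

With equal stripes, the slow servers dominate the time of each round. With proportional stripes, the per-server times are balanced. A real cost model would also account for factors such as network and startup costs, which this sketch ignores.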
ISBN (print): 9783319111971; 9783319111964
The main contribution of this paper is an implementation that performs an exhaustive search to verify the Collatz conjecture using a GPU. Consider the following operation on an arbitrary positive integer: if the number is even, divide it by two; if it is odd, triple it and add one. The Collatz conjecture asserts that, starting from any positive integer m, repeated iteration of this operation eventually produces the value 1. We have implemented the search on an NVIDIA GeForce GTX TITAN and evaluated its performance. The experimental results show that our GPU implementation can verify $5.01 \times 10^{11}$ 64-bit numbers per second, while a CPU implementation on an Intel Xeon X7460 can verify $1.80 \times 10^{9}$ 64-bit numbers per second. Thus, our GPU implementation attains a speed-up factor of 278 over the single-CPU implementation.
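The operation described in this abstract is simple enough to sketch directly. The following is a plain sequential Python illustration of the even/odd rule and of an exhaustive check over a range. It is not the authors' GPU implementation and uses none of its optimizations. The function names and the `max_steps` safety bound are assumptions made for the example.

```python
def collatz_steps(m, max_steps=100_000):
    """Count how many applications of the Collatz operation take m to 1.

    Returns None if 1 is not reached within max_steps (an assumed safety
    bound so that the loop always terminates).
    """
    if m < 1:
        raise ValueError("m must be a positive integer")
    steps = 0
    while m != 1:
        if steps >= max_steps:
            return None
        # Even: divide by two. Odd: triple and add one.
        m = m // 2 if m % 2 == 0 else 3 * m + 1
        steps += 1
    return steps


def verify_range(start, stop):
    """Exhaustively check that every m in [start, stop) reaches 1."""
    return all(collatz_steps(m) is not None for m in range(start, stop))


if __name__ == "__main__":
    print(collatz_steps(6))         # 6, 3, 10, 5, 16, 8, 4, 2, 1
    print(verify_range(1, 10_000))
```

Each starting value is checked independently of the others, which is the property that makes the search a natural fit for a massively parallel device. The reported rates are also consistent with the stated speed-up: $5.01 \times 10^{11} / 1.80 \times 10^{9} \approx 278$.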
ISBN (digital): 9783319111940
ISBN (print): 9783319111940; 9783319111933
Conventional software speculative parallel models face challenges due to the increasing number of processor cores and the diversification of applications. Speculation accuracy is one of the key factors in the performance of a software speculative parallel model. In this paper, we propose a novel value prediction mechanism named Inter-thread Fetching Value Prediction (IFVP). It allows a speculative thread to speculatively read the values of conflicting variables from another speculative thread. This method can remarkably reduce the misspeculation rate in a parallelized loop with cross-iteration dependencies. We show that IFVP improves speculation accuracy by about 19.1% on average and performance by about 37.1% on average, compared with conventional models without value prediction.
ISBN (digital): 9783319111940
ISBN (print): 9783319111940; 9783319111933
In this paper, embeddings of a family of 3D meshes in locally twisted cubes are studied. Let $LTQ_n(V, E)$ denote the n-dimensional locally twisted cube. We obtain two major results: (1) For any integer $n \ge 4$, two node-disjoint 3D meshes of size $2 \times 2 \times 2^{n-3}$ can be embedded into $LTQ_n$ with dilation 1 and expansion 2. (2) For any integer n = 6, four node-disjoint $4 \times 2 \times 2^{n-5}$ meshes can be embedded into $LTQ_n$ with dilation 1 and expansion 4. Further, an embedding algorithm can be constructed based on our embedding method. The obtained results are optimal in the sense that the dilations of the embeddings are 1.
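The expansion figures in both results can be checked by counting nodes. The calculation below is a sketch that assumes the usual definition of expansion as the ratio of the number of host nodes to the number of nodes in one guest mesh, and uses the fact that $LTQ_n$ has $2^n$ nodes.

```latex
\[
\text{(1)}\quad \frac{|V(LTQ_n)|}{2 \cdot 2 \cdot 2^{n-3}} = \frac{2^{n}}{2^{n-1}} = 2,
\qquad
\text{(2)}\quad \frac{|V(LTQ_n)|}{4 \cdot 2 \cdot 2^{n-5}} = \frac{2^{n}}{2^{n-2}} = 4 .
\]
```

Under the same assumption, the node-disjoint meshes together use every node of the host: $2 \cdot 2^{n-1} = 2^n$ in case (1) and $4 \cdot 2^{n-2} = 2^n$ in case (2).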
ISBN (print): 9783319111940; 9783319111933
There is no dedicated thread mapping method for Many Integrated Core (MIC) heterogeneous systems in the traditional multithreaded programming model. Unreasonable thread mapping prevents the promising computing power of the MIC coprocessor from being fully exploited. To fully exploit the computing potential of the MIC coprocessor, this paper discusses effective multi-thread mapping strategies by comparing computing performance and analyzing the performance differences between various mapping methods. Meanwhile, to further exploit the high computing power of MIC heterogeneous systems, specific program porting and performance optimization strategies are explored using the k-means application. Experimental results show that the proposed mapping and parallel optimization strategies are effective and can guide programmers in porting and optimizing applications for MIC heterogeneous parallel systems.