ISBN: (digital) 9783319111940
ISBN: (print) 9783319111940; 9783319111933
The Kirchhoff pre-stack depth migration (KPSDM) algorithm, one of the most widely used migration algorithms, plays an important part in obtaining a true image of the earth. However, it is time-consuming because of its high computational cost; hence the working efficiency of the oil industry is affected. The general-purpose graphics processing unit (GPU) and the Compute Unified Device Architecture (CUDA) developed by NVIDIA provide a new solution to this problem. In this study, we propose a parallel algorithm for Kirchhoff pre-stack depth migration and an optimization strategy based on CUDA. Our experiments indicate that, for large data computations, the accelerated algorithm achieves a speedup of 8 to 15 times compared with NVIDIA GPU.
ISBN: (print) 9783319111971; 9783319111964
Hybrid parallel file systems (PFSs), which consist of both HDD and SSD servers, provide a promising solution for data-intensive applications. In this study, we propose a performance-aware data placement (PADP) strategy to enable efficient data layout in hybrid PFSs. The basic idea of PADP is to distribute data across file servers using adaptive, varied-size file stripes based on each server's storage performance. Using a data access cost model and a linear programming optimization method, PADP determines the appropriate stripe size for each file server. We have implemented PADP within OrangeFS, a parallel file system widely used in the HPC domain. Experimental results on representative benchmarks show that PADP can significantly improve the I/O performance of hybrid PFSs.
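The idea of varied-size stripes can be illustrated with a deliberately simplified sketch. It does not reproduce the paper's access cost model or its linear program; it only assumes that each server's stripe should be proportional to its bandwidth, so that all servers finish one striping round at about the same time. The function name, the bandwidth figures, and the 4 KB rounding unit are hypothetical.

```python
def proportional_stripe_sizes(bandwidths_mb_s, round_kb, unit_kb=4):
    """Split one striping round of `round_kb` kilobytes across servers.

    Simplifying assumption (not the paper's cost model): a server's stripe
    is proportional to its bandwidth, so size / bandwidth is roughly equal
    for every server.
    """
    total_bw = sum(bandwidths_mb_s)
    # Proportional share, rounded down to a multiple of unit_kb.
    sizes = [
        int(round_kb * bw / total_bw) // unit_kb * unit_kb
        for bw in bandwidths_mb_s
    ]
    # Hand any rounding remainder to the fastest server.
    fastest = max(range(len(sizes)), key=lambda i: bandwidths_mb_s[i])
    sizes[fastest] += round_kb - sum(sizes)
    return sizes


# Hypothetical cluster: two HDD servers (100 MB/s) and two SSD servers (400 MB/s).
print(proportional_stripe_sizes([100, 100, 400, 400], 1000))
```

With these made-up numbers the SSD servers receive stripes four times larger than the HDD servers, whereas a fixed-size layout would leave the SSD servers waiting for the HDD servers on every round.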
ISBN: (print) 9783319111940; 9783319111933
Due to the exhaustion of IPv4 address resources, the transition from IPv4 to IPv6 is inevitable and fairly urgent. Numerous transition mechanisms have been proposed, notably tunnel schemes, which have recently been the focus of research efforts in the IETF and academia. However, because practical networks have diverse characteristics and transition requirements, and because applicability analysis is lacking, the selection and deployment of transition mechanisms face grand challenges. To address these challenges, this paper investigates the basic issues and key elements of IPv6 tunnel transition mechanisms and presents the first applicability index system for them. In particular, we analyze the applicability of existing tunnel techniques using this index system, which offers significant guidance for the practical deployment of IPv6 transition. Moreover, because security is a key factor in realistic working environments, this study also analyzes the security issues of tunnel transition schemes, which were seldom taken into account before.
ISBN: (digital) 9783319111940
ISBN: (print) 9783319111940; 9783319111933
In this paper, embeddings of a family of 3D meshes in locally twisted cubes are studied. Let $LTQ_n(V, E)$ denote the $n$-dimensional locally twisted cube. We obtain two major results: (1) For any integer $n \geq 4$, two node-disjoint 3D meshes of size $2 \times 2 \times 2^{n-3}$ can be embedded into $LTQ_n$ with dilation 1 and expansion 2. (2) For any integer $n \geq 6$, four node-disjoint $4 \times 2 \times 2^{n-5}$ meshes can be embedded into $LTQ_n$ with dilation 1 and expansion 4. Further, an embedding algorithm can be constructed based on our embedding method. The obtained results are optimal in the sense that the dilations of the embeddings are 1.
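The stated expansions can be checked with simple node counting, assuming the usual definition of expansion as the number of host nodes divided by the number of guest nodes, and using the fact that the $n$-dimensional locally twisted cube has $2^n$ nodes. The sketch below only verifies this arithmetic; it does not construct the embeddings, and the helper names are illustrative.

```python
def mesh_nodes(dims):
    """Number of nodes in a mesh with the given side lengths."""
    count = 1
    for d in dims:
        count *= d
    return count


def expansion(n, dims):
    """Host nodes (2**n for an n-dimensional cube) divided by mesh nodes."""
    host = 2 ** n
    guest = mesh_nodes(dims)
    if host % guest != 0:
        raise ValueError("mesh size does not divide the host size")
    return host // guest


for n in range(6, 10):
    # Result (1): each 2 x 2 x 2^(n-3) mesh has expansion 2.
    print(n, expansion(n, (2, 2, 2 ** (n - 3))))
    # Result (2): each 4 x 2 x 2^(n-5) mesh has expansion 4.
    print(n, expansion(n, (4, 2, 2 ** (n - 5))))
```

In both cases the node-disjoint copies together use every node of the host: two meshes of $2^{n-1}$ nodes, or four meshes of $2^{n-2}$ nodes, account for all $2^n$ nodes.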
ISBN: (print) 9783319111971; 9783319111964
The main contribution of this paper is an implementation that performs an exhaustive search to verify the Collatz conjecture using a GPU. Consider the following operation on an arbitrary positive number: if the number is even, divide it by two; if the number is odd, triple it and add one. The Collatz conjecture asserts that, starting from any positive number $m$, repeated iteration of this operation eventually produces the value 1. We have implemented the search on an NVIDIA GeForce GTX TITAN and evaluated its performance. The experimental results show that our GPU implementation can verify $5.01 \times 10^{11}$ 64-bit numbers per second, while the CPU implementation on an Intel Xeon X7460 can verify $1.80 \times 10^{9}$ 64-bit numbers per second. Thus, our GPU implementation attains a speed-up factor of 278 over the single-CPU implementation.
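The operation described above can be written down directly. The following is a minimal sequential sketch in Python, not the paper's GPU implementation: the function names and the `max_steps` safety cap are assumptions made for illustration, and none of the paper's optimizations are modeled.

```python
def collatz_steps(m, max_steps=100_000):
    """Count iterations of the Collatz operation needed to reach 1 from m.

    Returns None if 1 is not reached within max_steps (a hypothetical
    safety cap, not part of the conjecture).
    """
    steps = 0
    while m != 1:
        if steps >= max_steps:
            return None
        # Even: halve. Odd: triple and add one.
        m = m // 2 if m % 2 == 0 else 3 * m + 1
        steps += 1
    return steps


def verify_range(lo, hi):
    """Exhaustively check every m in [lo, hi)."""
    return all(collatz_steps(m) is not None for m in range(lo, hi))


print(collatz_steps(6))         # 6 -> 3 -> 10 -> 5 -> 16 -> 8 -> 4 -> 2 -> 1: 8 steps
print(verify_range(1, 10_000))  # True

# Ratio of the two throughputs reported in the abstract.
speedup = 5.01e11 / 1.80e9
print(int(speedup))             # 278
```

Each starting value is checked independently of the others, which is what makes the exhaustive search a natural fit for the massive parallelism of a GPU.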
ISBN: (digital) 9783319111940
ISBN: (print) 9783319111940; 9783319111933
Conventional software speculative parallel models face challenges due to the increasing number of processor cores and the diversification of applications. Speculation accuracy is one of the key factors in the performance of a software speculative parallel model. In this paper, we propose a novel value prediction mechanism named Inter-thread Fetching Value Prediction (IFVP). It allows a speculative thread to speculatively read the values of conflicting variables from another speculative thread. This method can remarkably reduce the misspeculation rate when parallelizing a loop with cross-iteration dependencies. Our results show that, compared with conventional models without value prediction, IFVP improves speculation accuracy by about 19.1% on average and performance by about 37.1% on average.
ISBN: (print) 9783319111940; 9783319111933
The traditional multithreaded programming model has no dedicated thread mapping method for Many Integrated Core (MIC) heterogeneous systems. Unreasonable thread mapping prevents the promising computing power of the MIC coprocessor from being fully exploited. To exploit that potential, this paper discusses effective multithread mapping strategies by comparing the computing performance of various mapping methods and analyzing the differences between them. To further exploit the high computing power of MIC heterogeneous systems, specific program porting and performance optimization strategies are explored using a k-means application. Experimental results show that the proposed mapping and parallel optimization strategies are effective and can guide programmers in porting and optimizing applications for MIC heterogeneous parallel systems.
ISBN: (digital) 9783319111940
ISBN: (print) 9783319111940; 9783319111933
Agent-based modeling (ABM) has been widely used in stock market simulation. However, traditional ABM simulations of stock markets on single computers are limited by computing capability, because breakthroughs in financial research require much larger numbers of agents. This paper introduces a platform for ABM stock market simulation that focuses on large-scale parallel agents in distributed computing environments such as clusters and MPP systems. With customized trading strategies inside the agents, the platform's runtime system automatically distributes the massive number of agents across multiple computing nodes during the simulation. Agents exchange information with each other and with the market through a uniform communication system. With this platform, financial researchers can design their own financial models without having to deal with the complexity of parallelization and related problems. A sample simulation on the platform is verified to be compatible with data from Euronext-NYSE, and the platform shows fair scalability and performance under different parallelism configurations.
ISBN: (print) 9783319111940; 9783319111933
RAID is now widely used, driven by increasing reliability requirements in storage systems and the fast development of cloud computing. Among the various levels and implementations of RAID, RAID-6 is one of the most significant categories because it can tolerate the concurrent failure of any two disks. However, the scalability of RAID-6 is a major challenge. Although many approaches have been proposed to accelerate the scaling process and reduce its overhead, how to efficiently remove disks from an existing array (the scale-down process) is still an open problem. To address the scalability problem, we propose an Advanced Data Redistribution (ADR) approach. The basic idea of ADR is to reorganize existing stripes in RAID-6 systems to achieve higher scalability. ADR is a stripe-level scheme and can be combined with other approaches such as SDM and MDS-Frame. It can minimize the overhead of data migration and parity modification. We have conducted a mathematical analysis comparing ADR across various popular RAID-6 codes. The results show that, compared with the typical Round-Robin approach, ADR reduces migration I/O operations by more than 52.1%, reduces migration time by up to 63.5%, and speeds up the scaling process by a factor of up to 1.91.
ISBN: (print) 9783319111940; 9783319111933
Because of their high storage efficiency, erasure codes have recently been used to provide high data reliability in distributed storage systems. When multiple data losses occur in a system, the regeneration time must be as short as possible to maintain data availability and reliability. The common approach is to repair the losses one by one, which prolongs the regeneration time. Tree-structured regeneration can reduce the regeneration time for a single node failure by relaying network traffic, and it has also been extended to regenerate multiple data losses. In this paper, building on regenerating codes, which achieve minimal network traffic during regeneration, we consider reducing regeneration time by using multiple max-min trees to regenerate multiple data losses in parallel. We propose a bandwidth-sharing max-min algorithm (BSM2RC) to construct multiple parallel max-min trees. It achieves efficient bandwidth utilization by maximizing the minimal bottleneck edge weight of the multiple regeneration trees, thereby improving regeneration efficiency. Our simulation experiments show that, compared with existing regeneration schemes, multiple parallel max-min trees significantly reduce the total regeneration time for multiple data losses and thus enhance system reliability.
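To make the notion of a max-min tree concrete, the sketch below builds a single tree whose slowest link is as fast as possible, using a Prim-style maximum spanning tree (a maximum spanning tree also maximizes the minimum edge weight). It is not BSM2RC: the bandwidth sharing among multiple parallel trees that the abstract describes is not modeled, and the node names and bandwidth values are hypothetical.

```python
import heapq


def max_min_tree(bandwidth, root):
    """Grow a spanning tree from `root`, always adding the highest-bandwidth
    edge that connects the tree to a new node (Prim's algorithm on negated
    weights). `bandwidth[u][v]` is the link bandwidth between u and v.
    """
    in_tree = {root}
    edges = []
    heap = [(-w, root, v) for v, w in bandwidth[root].items()]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(bandwidth):
        neg_w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue
        in_tree.add(v)
        edges.append((u, v, -neg_w))
        for nxt, w in bandwidth[v].items():
            if nxt not in in_tree:
                heapq.heappush(heap, (-w, v, nxt))
    return edges


def bottleneck(edges):
    """Bandwidth of the slowest link in the tree."""
    return min(w for _, _, w in edges)


# Hypothetical network: N is the newcomer node; A, B, C are providers.
links = {
    "N": {"A": 10, "B": 80, "C": 30},
    "A": {"N": 10, "B": 60, "C": 20},
    "B": {"N": 80, "A": 60, "C": 50},
    "C": {"N": 30, "A": 20, "B": 50},
}
tree = max_min_tree(links, "N")
print(tree)              # [('N', 'B', 80), ('B', 'A', 60), ('B', 'C', 50)]
print(bottleneck(tree))  # 50
```

In this made-up network, fetching directly from every provider (a star rooted at N) would be limited by the 10-unit link between N and A, whereas relaying through B raises the bottleneck to 50.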