ISBN: 3540664432 (print)
In this paper we present the design of a three-dimensional optoelectronic hardware approach to realizing a fixed-point processing unit. To this end, we outline the main ideas of the low-level algorithm. We introduce several concepts and evaluate them with regard to achieving the highest throughput. Finally, we focus on an application of our 3D approach, in particular an algorithm for volume rendering of medical image sets.
In terms of program verification, data-flow analysis (DFA) is commonly understood as the computation of the strongest postcondition for every program point with respect to a precondition which is assured to be valid at the entry of the program. Here, we consider DFA under the dual weakest precondition view of program verification. Based on this view we develop an approach for demand-driven DFA of explicitly parallel programs, which we exemplify for the large and practically most important class of bitvector problems. This approach can be used directly for the construction of online debugging tools. Moreover, it is tailored for parallelization. For bitvector problems, this allows us to compute the information provided by conventional, strongest postcondition-centered DFA at the cost of answering a data-flow query, which is usually much smaller. In practice, this can lead to a remarkable speed-up in analysing and optimizing explicitly parallel programs.
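To make the idea of a data-flow query concrete, the following is a minimal sketch for a single bit of a "may" bitvector problem (for example, whether one particular definition reaches a program point). It is not the algorithm of the paper: it handles only a sequential control-flow graph, and the function name `bit_holds_at_entry` and the `preds`/`gen`/`kill` representation are assumptions chosen for illustration. The query is propagated backward from the point of interest and stops as soon as a node decides it.

```python
def bit_holds_at_entry(node, preds, gen, kill, start, init=False):
    """Demand-driven query: does the bit hold at the entry of `node`?

    Hypothetical representation (not taken from the paper):
      preds : dict mapping a node to the list of its predecessors
      gen   : set of nodes that set the bit
      kill  : set of nodes that clear the bit
      start : the program entry node
      init  : value of the bit at the entry of `start`
    """
    seen = set()
    stack = [node]
    while stack:
        n = stack.pop()
        if n in seen:
            continue  # already asked this query; cycles contribute nothing new
        seen.add(n)
        if n == start and init:
            return True
        for p in preds.get(n, ()):
            if p in gen:
                return True  # p sets the bit, so it holds on leaving p
            if p not in kill:
                stack.append(p)  # p is transparent: ask the same query at p
    return False


# Small example graph: a -> b -> c -> d, with a loop edge c -> b.
preds = {"b": ["a", "c"], "c": ["b"], "d": ["c"]}
gen, kill = {"a"}, {"c"}
print(bit_holds_at_entry("c", preds, gen, kill, start="a"))
print(bit_holds_at_entry("d", preds, gen, kill, start="a"))
```

Only the nodes that can influence the queried point are visited, which is why answering one query is usually much cheaper than computing the information for every program point.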
Fast Poisson solvers based on FFT computations are among the fastest techniques for solving Poisson problems on uniform grids. In this paper, we present a parallel distributed implementation of a 3D fast Poisson solver in the context of the atmospheric simulation code Meso-NH [3]. This parallel implementation performs the data movement between computational steps, so that no elementary computational routine performs communication itself. Experimental results are given on a 128-node Cray T3E to illustrate the advantages of this method.
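As a rough illustration of the FFT-based technique, the sketch below solves a one-dimensional Poisson problem on a uniform periodic grid. It is a serial, simplified stand-in, not the 3D distributed solver of Meso-NH: the periodic boundary condition, the second-order finite-difference Laplacian, the power-of-two grid size, and the function names `fft` and `solve_poisson_periodic` are all assumptions made for the example.

```python
import cmath
import math


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    The inverse transform is returned unnormalized (no division by n).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def solve_poisson_periodic(f, h):
    """Solve (u[j-1] - 2*u[j] + u[j+1]) / h**2 = f[j] on a periodic grid.

    The solution is fixed to have zero mean; f is assumed to have zero mean.
    """
    n = len(f)
    f_hat = fft(f)
    u_hat = [0j] * n  # mode 0 stays zero: it only shifts u by a constant
    for k in range(1, n):
        # Eigenvalue of the discrete Laplacian for Fourier mode k.
        eig = (2.0 * math.cos(2.0 * math.pi * k / n) - 2.0) / (h * h)
        u_hat[k] = f_hat[k] / eig
    u = fft(u_hat, inverse=True)
    return [v.real / n for v in u]
```

The solve is a forward transform, a pointwise division by the eigenvalues, and an inverse transform. In a multi-dimensional distributed setting, the transforms along each direction are the elementary computational steps between which data would have to be redistributed.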
The aim of this work is to report on a parallel implementation of methods for tolerance analysis in the framework of a microelectronics design center. The methods were designed to run in parallel on different platforms...
A fully adaptive router with hybrid buffers at the input and output channels was designed, which improves the throughput of its input-buffer counterpart by up to 40% and has only 10% higher base latency. An in-depth analysis of different router buffer organizations was carried out for a toroidal network that uses either a deterministic (DOR) or a fully adaptive routing scheme. Each proposal is described in VHDL and evaluated with the Synopsys synthesis tool. The technological restrictions obtained were used to evaluate network performance under both synthetic loads and real applications.
We have designed an algorithm which allows the OpenGL geometry transformations to be processed on a multiprocessor system. We have integrated it in Mesa, a 3D graphics library with an API which is very similar to that...
The aim of the High Performance Banking (HYPERBANK) project is to provide the banking sector with the requisite toolset for an increased understanding of existing and prospective customers. The approach exploits and integrates three areas: business knowledge modelling, data warehousing and data mining, together with parallel computing. Business knowledge modelling formally describes the enterprise in terms of roles, goals and rules. A generic customer-profiling model has been produced and has been instrumental in informing and guiding data mining experiments performed on the banks' data. Parallel computing is required to manipulate and analyse to maximum effect the vast amounts of data collected by banks. A parallel data warehousing tool has been produced and work is ongoing to integrate the customer profiling model with this tool. In this paper, we present work done in the development and implementation of a variety of parallel data mining techniques.
In this paper we introduce a student exercise devoted to comparing two parallel languages, namely MPI (Message Passing Interface) and BSP (Bulk Synchronous Parallel language). The work to accomplish is integrate...
We present a new parallelizable preconditioner that is used as the local component of a two-level preconditioner similar to BPS. On 2D model problems that exhibit either high anisotropy or discontinuity, we demonstrat...
The Modified Gram-Schmidt (MGS) orthogonalization process, used for example in the Arnoldi algorithm, is often the bottleneck that limits parallel efficiency. Indeed, a number of communications proportional to the square of the problem size is required to compute the dot-products. A block formulation is attractive but suffers from potential numerical instability. In this paper, we address this issue and propose a simple procedure that allows the use of a Block Gram-Schmidt algorithm while guaranteeing a numerical accuracy similar to MGS. The main idea is to determine the size of the blocks dynamically. The main advantages of this dynamic procedure are twofold: first, high-performance matrix-vector multiplications can be used to decrease the execution time; second, in a parallel environment, the number of communications is reduced. Performance comparisons with the alternative Iterated CGS also show an improvement for a moderate number of processors.
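For reference, a minimal serial sketch of the plain MGS process is given below; it is the baseline the abstract starts from, not the dynamic block procedure proposed in the paper. The function name `modified_gram_schmidt` and the tolerance `tol` are assumptions made for the example. Each new vector is projected against every previously computed basis vector in turn, so the number of dot-products grows quadratically with the number of vectors.

```python
import math


def modified_gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize `vectors` (a list of equal-length lists) with MGS."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            # One dot-product per earlier basis vector. With distributed
            # vectors, each of these would require a global reduction.
            r = sum(qi * wi for qi, wi in zip(q, w))
            # Update w immediately, so the next projection uses the
            # already-corrected vector (this is what distinguishes MGS
            # from classical Gram-Schmidt).
            w = [wi - r * qi for qi, wi in zip(q, w)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm <= tol:
            raise ValueError("input vectors are numerically linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis
```

Because each projection depends on the result of the previous one, the dot-products cannot be grouped into a single communication step, which is the motivation for block formulations.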