ISBN: 9783642552243 (print)
We present a method and an accompanying algorithm for scalable parallel generation of sparse matrices intended primarily for benchmarking purposes, namely for evaluating the performance and scalability of generic massively parallel algorithms that involve sparse matrices. The proposed method is based on the enlargement of small input matrices, which are supposed to be obtained from public sparse matrix collections containing numerous matrices that arise in different application domains and thus have different structural and numerical properties. The resulting matrices are distributed among the processors of a parallel computer system. The enlargement process is designed so that its users may easily control the structural and numerical properties of the resulting matrices as well as the distribution of their nonzero elements to particular processors.
ISBN: 9783642551956 (digital)
ISBN: 9783642551956 (print)
The efforts of the research community and the software industry to make the art of parallel programming easier continue. Measuring the usability of contemporary parallel programming languages and libraries through empirical studies is the key to understanding how programmers think about, design, code, and debug parallel programs. In this paper we break down the empirical experiments of recent years into their component ingredients. By analyzing each component separately we can better understand what is missing in these experiments and thereby improve the outcome of future studies. The result of this work is a set of recommendations that aims to make usability studies more convincing so that parallel language designers will take them seriously.
ISBN: 9783642551956 (digital)
ISBN: 9783642551956 (print)
The goal of this paper is to propose and test a new memetic algorithm for the capacitated vehicle routing problem in a parallel computing environment. We consider a simple variation of the vehicle routing problem in which the only parameter is the capacity of the vehicle and each client needs only one package. We analyze the efficiency of the algorithm using the hierarchical Parallel Random Access Machine (PRAM) model and run experiments with code written in CUDA.
ISBN: 9789897580024 (print)
Images with high visual quality are often generated by a ray tracing algorithm. Despite its conceptual simplicity, designing an efficient mapping of ray tracing computations to massively parallel hardware architectures is a challenging task. In this paper we investigate the performance of state-of-the-art ray traversal algorithms for bounding volume hierarchies on GPUs and discuss their potential and limitations. Based on this analysis, a novel ray traversal scheme called batch tracing is proposed. It decomposes the task into multiple kernels, each of which is designed for efficient parallel execution. Our algorithm achieves performance comparable to currently prevailing approaches and represents a promising avenue for future research.
ISBN: 9783642552243 (print)
We reconsider the familiar problem of executing a perfectly parallel workload consisting of N independent tasks on a parallel computer with P << N processors. We show that there are memory-bound problems for which the runtime can be reduced by the forced parallelization of individual tasks across a small number of cores. Specific examples include solving differential equations, performing sparse matrix-vector multiplications, and sorting integer keys.
ISBN: 9781479965694 (print)
The amount of space debris has increased tremendously in the last decade, arousing the interest of experts in the field. Space surveillance is a first step in monitoring the traffic of floating objects and has several applications, such as the correction of orbit coordinates for satellites or collision avoidance. An improved and flexible framework for real-time detection of satellites using a cheap optical surveillance system is proposed in this paper. The detection method is based on the Radon transform. The satellite candidates that result from processing the Radon space are validated by imposing constraints on the satellites' length and brightness, and on the stereo matching. We additionally propose a parallel approach to the Radon transform on the GPU in order to fulfill the real-time constraints. We test our method on a large and varied data set containing satellites from different orbit ranges, namely medium and high orbits. An average accuracy of over 95% was obtained for real-time satellite detection, with minimal false positives.
ISBN: 9783642551956 (digital)
ISBN: 9783642551956 (print)
In this paper we present a parallel preconditioner for the standard Finite Volume (FV) discretization of elliptic problems, using the standard continuous piecewise linear Finite Element (FE) function space. The proposed preconditioner is constructed using an abstract framework of the Additive Schwarz Method, and is fully parallel. The convergence rate of the Generalized Minimal Residual (GMRES) method with this preconditioner is shown to be almost optimal, i.e., it depends poly-logarithmically on the mesh sizes.
Since silicon technology entered the many-core era, new computing platforms are exploiting higher and higher levels of parallelism. Thanks to scalable, clustered architectures, embedded systems and high-performanc...
Author: Jeljeli, Hamza
Univ Lorraine, CARAMEL Project Team, LORIA, INRIA, CNRS, Campus Sci, BP 239, F-54506 Vandoeuvre Les Nancy, France
ISBN: 9783319098739 (digital)
ISBN: 9783319098739; 9783319098722 (print)
In cryptanalysis, solving the discrete logarithm problem (DLP) is key to assessing the security of many public-key cryptosystems. The index-calculus methods, which attack the DLP in multiplicative subgroups of finite fields, require solving large sparse systems of linear equations modulo large primes. This article deals with how this computation can be run on GPU- and multi-core-based clusters featuring InfiniBand networking. More specifically, we present the sparse linear algebra algorithms that are proposed in the literature, in particular the block Wiedemann algorithm. We discuss the parallelization of the central matrix-vector product operation from both algorithmic and practical points of view, and illustrate how our approach has contributed to the recent record-sized DLP computation in GF(2^809).
ISBN: 9783642551956 (digital)
ISBN: 9783642551956 (print)
This paper concerns a new approach to evaluating option price sensitivities using Monte Carlo simulation, based on the parallel GPU architecture and Automatic Differentiation methods. Interval arithmetic is used to study rounding errors. The considerations are based on two implementations of the algorithm: a sequential one and a parallel one. For efficient differentiation, the Adjoint method is employed. Computational experiments include an analysis of performance, uncertainty error, and rounding error, and consider the Black-Scholes and Heston models.