ISBN: 9783642131356 (print)
Many recent studies have shown that Optical Transpose Interconnection Systems (OTIS) are promising candidates for future high-performance parallel computers. In this paper, we present and evaluate a general method for algorithm development on the OTIS-Arrangement network (OTIS-AN) as an example of an OTIS network. The proposed method can be used and customized for any other OTIS network. Furthermore, it allows efficient mapping of a wide class of algorithms onto the OTIS-AN. The method is based on grids, a popular structure that supports a vast body of parallel applications including linear algebra, divide-and-conquer algorithms, sorting, and FFT computation. This study confirms the viability of the OTIS-AN as an attractive alternative for large-scale parallel architectures.
Image processing is often considered a good candidate for the application of parallel processing because of the large volumes of data and the complex algorithms commonly encountered. This paper presents a tutorial introduction to the field of parallel image processing. After the classes of parallel processing are introduced, a brief review of architectures for parallel image processing is presented. Software design for low-level image processing and parallelism in high-level image processing are discussed, and an application of parallel processing to handwritten postcode recognition is described. The paper concludes with a look at future technology and market trends.
ISBN: 9783642131189 (print)
In this paper, MPI is used on a PC cluster to compute all the eigenvalues of Hermitian Toeplitz matrices. The parallel algorithms presented were implemented in C++ with MPI functions inserted and run on a cluster of Lenovo ThinkCentre machines running RedHat Linux. The two methods, MAHT-P, which is embarrassingly parallel, and MPEAHT, which uses a master/slave scheme, are compared for performance, and results are presented. For both parallel schemes, computation time is reduced and the speedup factor increases with the number of computers used. Load balancing becomes an issue as the number of computers in a cluster is increased, and a solution is provided for this case.
Window-based parallel architectures are considered here as target structures for the computation of low- and medium-level image processing algorithms. Their definition stems from a general reformulation of algorithms, ...
ISBN: 9781467375894 (print)
Image processing algorithms are widely used in the automotive field for ADAS (Advanced Driver Assistance System) purposes. To embed these algorithms, semiconductor companies offer heterogeneous architectures composed of different processing units, often including a massively parallel computing unit. However, embedding complex algorithms on these SoCs (Systems on Chip) remains a difficult task: because of this heterogeneity, it is not easy to decide how to allocate parts of a given algorithm to the processing units of a given SoC. To help the automotive industry embed algorithms on heterogeneous architectures, we propose a novel approach to predict the performance of image processing algorithms on the different computing units of a given heterogeneous SoC. Our methodology predicts an execution-time interval, of varying width, with an associated degree of confidence, using only a high-level description of the algorithms to embed and a few characteristics of the computing units.
ISBN: 9783642131189 (print)
The Pattern Matching with Swaps problem is a variation of the classical pattern matching problem in which a match is allowed to include disjoint local swaps. In 2009, Cantone and Faro devised a new dynamic programming algorithm for this problem that runs in time O(nm), where n is the length of the text and m is the length of the pattern. In this paper, we first present an improved dynamic programming formulation of the approach of Cantone and Faro. Then we present an optimal parallelization of our algorithm, based on a linear array model, that runs in time O(m^2) using [n/m-1] processors.
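To make the problem statement concrete, here is a minimal Python sketch of a naive swap-matching check. It is not the dynamic programming formulation of Cantone and Faro or the parallel algorithm of the paper; it only illustrates the definition, under the common assumption that a swap exchanges two adjacent, unequal pattern characters and that swaps do not overlap. The function name `swap_matches` is hypothetical.

```python
def swap_matches(text: str, pattern: str) -> list[int]:
    """Return every start index j where pattern swap-matches text[j:j+m].

    Assumption: a swap exchanges two adjacent, different pattern
    characters, and each pattern position takes part in at most one swap.
    """
    n, m = len(text), len(pattern)
    hits = []
    for j in range(n - m + 1):
        # ok[i] is True when pattern[:i] can be aligned with text[j:j+i]
        # using only direct matches and disjoint adjacent swaps.
        ok = [False] * (m + 1)
        ok[0] = True
        for i in range(m):
            if not ok[i]:
                continue
            # Case 1: position i matches the text directly.
            if pattern[i] == text[j + i]:
                ok[i + 1] = True
            # Case 2: positions i and i+1 are swapped.
            if (i + 1 < m
                    and pattern[i] != pattern[i + 1]
                    and pattern[i] == text[j + i + 1]
                    and pattern[i + 1] == text[j + i]):
                ok[i + 2] = True
        if ok[m]:
            hits.append(j)
    return hits
```

Each alignment costs O(m) work, so this baseline takes O(nm) time sequentially. Because the alignments are independent of one another, they are a natural unit of work to distribute across processors, although the linear array scheme described in the abstract may partition the text differently.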
ISBN: 9783642131356 (print)
Multishift QR algorithms are efficient for solving the symmetric tridiagonal eigenvalue problem on a parallel computer. In this paper, we focus on three variants of the multishift QR algorithm, namely, the conventional multishift QR algorithm, the deferred shift QR algorithm, and the fully pipelined multishift QR algorithm, and construct performance models for them. Our models are designed for shared-memory parallel machines and, given the basic performance characteristics of the target machine and the problem size, predict the execution time of these algorithms. Experimental results show that our models can predict the relative performance of these algorithms to an accuracy of 10% in many cases. Thus our models are useful for choosing the best algorithm to solve a given problem in a specified computational environment, as well as for finding the best values of the performance parameters.
ISBN: 9783642131356 (print)
We consider two sorted arrays A = (a_1, a_2, ..., a_n) and B = (b_1, b_2, ..., b_n) of records such that (1) the n records are sorted according to one field, called the key, and (2) the values of the keys are serial numbers. Merging data records has many applications in computer science, especially in databases. We develop an algorithm that runs in O(log n) time on an EREW PRAM to merge two sorted arrays of records using n/log n processors, even when the keys of the data records are repeated. The algorithm is cost-optimal, deterministic, and stable, and uses a linear amount of space.
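As an illustration of how a stable merge with repeated keys can be decomposed into independent steps, the following Python sketch uses the classic rank-based merge: each record's output position is its own index plus its rank in the other array. This is a swapped-in textbook technique, not the algorithm of the paper. It is simulated sequentially here, and a direct parallel version would need concurrent reads during the binary searches, so it does not meet the EREW and n/log n processor bounds stated in the abstract. The name `merge_by_rank` and the `key` parameter are hypothetical.

```python
from bisect import bisect_left, bisect_right


def merge_by_rank(a, b, key=lambda r: r):
    """Stably merge two sorted lists of records by computing final positions.

    Every loop iteration is independent of the others, so in a parallel
    setting each one could be handled by a separate processor.
    """
    a_keys = [key(r) for r in a]
    b_keys = [key(r) for r in b]
    out = [None] * (len(a) + len(b))
    # A record from `a` goes after all strictly smaller records of `b`,
    # so records of `a` precede equal-key records of `b`.
    for i, rec in enumerate(a):
        out[i + bisect_left(b_keys, a_keys[i])] = rec
    # A record from `b` goes after all records of `a` with key <= its key.
    for j, rec in enumerate(b):
        out[j + bisect_right(a_keys, b_keys[j])] = rec
    return out
```

Using `bisect_left` for one array and `bisect_right` for the other is what breaks ties consistently: no two records are assigned the same slot, and equal keys keep their original relative order.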
A parallel processing system is being developed. Unlike conventional parallel architectures, the parallelism is at the level of an algorithm. A sorting algorithm for this machine is presented. The architecture of the parallel processing system consists of an elevated single instruction multiple data (SIMD) model that works at the level of a basic algorithm. Viewed from another angle, the architecture is of the multiple instruction multiple data (MIMD) type, since synchronization is not at the instruction level. In reality, it is midway between SIMD and MIMD, and the author calls it an SAMD (same algorithm multiple data) architecture. The philosophy is to divide the problem into a number of algorithms.
ISBN: 0818654007 (print)
A variety of parallel approaches have been used to support database processing across a spectrum of machine architectures. In this talk, we begin by describing areas where parallelism is potentially important in dealing with very large databases, including loading, query/update, and database administration. We then discuss hardware tradeoffs, including multicomputers versus multiprocessors, distributed versus centralized memory, and specialized versus general-purpose architectures. At the software level, we cover a number of approaches, including running multiple transactions in parallel, decomposing queries into parallel subqueries, executing low-level query operations in parallel, running multiple instances of the DBMS, and partitioning data over disks. We characterize the impact of these approaches on performance, scalability, and ease of use, for both decision support and transaction processing. Finally, the approaches taken in several commercial DBMSs will be described, as well as extensions such as the Kendall Square Query Decomposer.