ISBN: 0818656026 (print)
Segmentation and other image processing operations rely on convolution calculations with heavy computational and memory access demands. This paper presents an analysis of a texture segmentation application containing a 96x96 convolution. Sequential execution required several hours on single-processor systems, with over 99% of the time spent performing the large convolution. Between 70% and 75% of execution time is attributable to cache misses within the convolution. We implemented the same application on CM-5, iPSC/860 and PVM distributed memory multicomputers, tailoring the parallel algorithms to each machine's architecture. Parallelization significantly reduced execution time, to 49 seconds on a 512-node CM-5 and 6.5 minutes on a 32-node iPSC/860.
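As an illustration only, the sketch below shows one common way to decompose a direct 2D convolution across workers: each worker computes a contiguous band of output rows and needs only the input rows that band overlaps. This is an assumed row-band decomposition, not the algorithms of the paper, and the function names (`convolve_band`, `split_rows`, `convolve_partitioned`) are hypothetical. The workers are simulated by a sequential loop.

```python
def convolve_band(image, kernel, row_start, row_end):
    """Direct 'valid' 2D convolution for output rows [row_start, row_end)."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    band = []
    for y in range(row_start, row_end):
        row = []
        for x in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Kernel is flipped: true convolution, not correlation.
                    acc += image[y + i][x + j] * kernel[kh - 1 - i][kw - 1 - j]
            row.append(acc)
        band.append(row)
    return band


def split_rows(n_rows, n_workers):
    """Split n_rows into n_workers contiguous, nearly equal (start, end) ranges."""
    base, extra = divmod(n_rows, n_workers)
    bounds, start = [], 0
    for w in range(n_workers):
        end = start + base + (1 if w < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def convolve_partitioned(image, kernel, n_workers):
    """Hypothetical driver: each band could run on a separate node."""
    out_h = len(image) - len(kernel) + 1
    result = []
    for start, end in split_rows(out_h, n_workers):
        # A worker computing this band only reads input rows
        # start .. end + len(kernel) - 2 (the band plus a halo).
        result.extend(convolve_band(image, kernel, start, end))
    return result
```

With a 96x96 kernel, each output pixel costs 9216 multiply-adds, which is why the convolution dominates run time and why the amount of input each worker must hold (band plus halo) matters on distributed memory machines.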
ISBN: 3540664432 (print)
One of the most important problems in data mining is the discovery of association rules in large databases. We previously proposed parallel algorithms for mining generalized association rules with a classification hierarchy. In this paper, we implement the proposed algorithms on a large-scale PC cluster consisting of one hundred PCs interconnected by an ATM switch, and analyze their performance using a large transaction dataset. Performance evaluations show that our parallel algorithms handle skew effectively on such large-scale parallel systems.
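The abstract does not describe the algorithms themselves. As a minimal sketch of the underlying idea, assuming a simple count-distribution scheme, each node can extend its local transactions with the ancestors of every item in the classification hierarchy, count candidate itemsets locally, and then merge the counts. The names `parent`, `local_counts` and `global_counts` are hypothetical, and the usual pruning of itemsets that contain both an item and its ancestor is omitted.

```python
from collections import Counter
from itertools import combinations


def ancestors(item, parent):
    """Walk up the classification hierarchy; parent maps child -> parent."""
    result = []
    while item in parent:
        item = parent[item]
        result.append(item)
    return result


def extend(transaction, parent):
    """Add every ancestor of every item to the transaction."""
    extended = set(transaction)
    for item in transaction:
        extended.update(ancestors(item, parent))
    return extended


def local_counts(transactions, parent, k):
    """Support counts of all k-itemsets in one node's partition."""
    counts = Counter()
    for t in transactions:
        for itemset in combinations(sorted(extend(t, parent)), k):
            counts[itemset] += 1
    return counts


def global_counts(partitions, parent, k):
    """Merge per-node counts (the nodes are simulated by a loop here)."""
    total = Counter()
    for part in partitions:
        total.update(local_counts(part, parent, k))
    return total
```

Because higher-level items such as a category appear in many more extended transactions than individual leaf items, the counting load per candidate is uneven, which is one plausible source of the skew the abstract mentions.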
This paper discusses the effect of processor failures on computation performed on two-dimensional VLSI processor arrays. Previously established properties of catastrophic fault patterns are used to study inherent limi...
In data analytics applications, join is a general and time-consuming operation. Optimizing join algorithms can significantly benefit query processing. The emergence of GPUs provides a massive parallelism solution f...
The paper focuses on the problem of multi-spectral image segmentation, which leads, through the data fusion of several mono-spectral images, to reliable and robust vision systems for military or industrial purposes. The proposed approach does not fit the classical taxonomy of image data fusion methods: data fusion is performed during the segmentation, in parallel, of the different images. The presented algorithm has been implemented on the Connection Machine CM5 with the data programming style.
ISBN: 9781538663929 (print)
Raman spectrometry is a technique for detecting chemical products through a number of representative peaks found in a spectrum image or a numeric data series. The Raman spectrometer generates a CSV file or an image of a curve that represents the result of the product's diagnosis. Analyzing the spectrum peaks makes it possible to detect the chemical origin of the product concerned. Scientists perform this operation manually, which makes it difficult and time-consuming. Graphics Processing Units (GPUs) make the processing faster and more efficient thanks to their multicore architecture. The aim of the present paper is to propose a new GPU-based approach to automate the molecule detection operation using image-processing techniques with a parallel OpenCL implementation. We propose two parallel solutions and compare them with each other. We apply the approach to the analysis of ionic liquid and biomaterial samples.
ISBN: 9783030602451; 9783030602444 (print)
Lattice sieving is currently the leading class of algorithms for solving the shortest vector problem over lattices. The computational difficulty of this problem is the basis for constructing secure post-quantum public-key cryptosystems based on lattices. In this paper, we present a novel massively parallel approach for solving the shortest vector problem using lattice sieving and hardware acceleration. We combine previously reported algorithms with a proper caching strategy and develop a hardware architecture. The main advantage of the proposed approach is that it eliminates the overhead of data transfer between a CPU and a hardware accelerator. The authors believe that this is the first such architecture reported in the literature to date and predict that it will achieve up to 8 times higher throughput than a multi-core high-performance CPU. The presented methods can be adapted for other sieving algorithms that are hard to implement in FPGAs due to communication and memory bottlenecks.
ISBN: 078037889X (print)
In this paper, we propose a high-throughput implementation method for a single-chip 4096-point complex FFT. To increase transform speed, a parallel FFT architecture is used. The FFT chip contains eight parallel basic processing modules, which can work simultaneously and independently. The proposed structure can compute a 4096-point complex forward or inverse FFT in real time at sampling frequencies of up to 320 MHz, and will be widely applicable in high-speed signal processing.
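The abstract does not say how the work is divided among the eight modules. One standard possibility, shown here purely as an assumption, is a decimation-in-time split: the input is divided into eight interleaved subsequences, each sub-FFT is computed independently, and the results are combined with twiddle factors. The function names are hypothetical, and `dft_naive` is included only as a reference for checking.

```python
import cmath


def dft_naive(x):
    """O(n^2) reference DFT, used only to check the result."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]


def fft_radix2(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def fft_split(x, modules=8):
    """FFT via `modules` independent sub-FFTs plus a combining stage.

    Assumes len(x) is a power of two and divisible by `modules`.
    """
    n = len(x)
    sub_len = n // modules
    # Each sub-FFT depends only on its own interleaved slice, so the
    # eight calls could run at the same time on separate hardware modules.
    subs = [fft_radix2(x[r::modules]) for r in range(modules)]
    out = []
    for k in range(n):
        acc = 0j
        for r in range(modules):
            acc += cmath.exp(-2j * cmath.pi * r * k / n) * subs[r][k % sub_len]
        out.append(acc)
    return out
```

For a 4096-point transform this split gives eight 512-point sub-FFTs; the combining stage is the only part that needs data from all modules.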
ISBN: 9780769530895 (print)
This paper examines the scalable parallel implementation of the QR factorization of a general matrix, targeting SMP and multi-core architectures. Two implementations of algorithms-by-blocks are presented. Each implementation views a block of a matrix as the fundamental unit of data and, likewise, operations over these blocks as the primary unit of computation. The first is a conventional blocked algorithm similar to those included in libFLAME and LAPACK, but expressed in a way that allows operations on the so-called critical path of execution to be computed as soon as their dependencies are satisfied. The second algorithm captures a higher degree of parallelism with an approach based on Givens rotations while preserving the performance benefits of algorithms based on blocked Householder transformations. We show that the implementation effort is greatly simplified by expressing the algorithms in code with the FLAME/FLASH API, which allows matrices stored by blocks to be viewed and managed as matrices of matrix blocks. The SuperMatrix run-time system utilizes FLASH to assemble and represent matrices but also provides out-of-order scheduling of operations that is transparent to the programmer. Scalability of the solution is demonstrated on a ccNUMA platform with 16 processors and an SMP architecture with 16 cores.
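For readers unfamiliar with the rotation-based approach, the following is a minimal scalar sketch of QR factorization by Givens rotations. It operates on individual matrix elements, not on blocks, and does not use the FLAME/FLASH API or SuperMatrix; it is not the paper's algorithm-by-blocks. The function names are hypothetical.

```python
import math


def givens(a, b):
    """Return (c, s) so that [[c, s], [-s, c]] maps (a, b) to (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r


def qr_givens(A):
    """QR factorization of an m x n list-of-lists matrix; returns (Q, R)."""
    m, n = len(A), len(A[0])
    R = [[float(v) for v in row] for row in A]
    Q = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero column j below the diagonal, working from the bottom up.
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            # Apply the rotation to rows i-1 and i of R.
            for k in range(n):
                u, v = R[i - 1][k], R[i][k]
                R[i - 1][k] = c * u + s * v
                R[i][k] = -s * u + c * v
            # Accumulate the transposed rotation into columns i-1 and i of Q.
            for k in range(m):
                u, v = Q[k][i - 1], Q[k][i]
                Q[k][i - 1] = c * u + s * v
                Q[k][i] = -s * u + c * v
    return Q, R
```

Each rotation touches only two rows, so rotations acting on disjoint row pairs are independent of one another. That independence is the general property that makes rotation-based schemes attractive for exposing parallelism.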
ISBN: 9783540484776 (digital)
ISBN: 9783540581840 (print)
This volume presents the proceedings of the 5th International Conference on Parallel Architectures and Languages Europe (PARLE '94), held in Athens, Greece in July 1994. PARLE is the main Europe-based event on parallel processing. Parallel processing is now well established within high-performance computing technology and is of strategic importance not only to the computer industry, but also for a wide range of applications affecting the whole economy. The 60 full papers and 24 poster presentations accepted for these proceedings were selected from some 200 submissions by the international program committee; they cover the whole field and give a timely state-of-the-art report on research and advanced applications in parallel computing.