Author:
Lookin, N. A. (Urals Fed Univ, Inst Radioelect & Informat Technol, 32 Mira Str, Ekaterinburg 620002, Russia; RAS, Inst Engn Sci, Ural Branch, Ekaterinburg 620049, Russia)
ISBN: (print) 9789663354125
The continuing growth in complexity of data processing algorithms for space and air vehicles has led to special-purpose processors embedded at different levels of the on-board hardware. Their architectures and instruction sets are adapted for efficient implementation of the most complicated computational algorithms. This matters for real-time digital signal and image processing, where computer performance must reach about 10^12 op/sec. These processors are intended for real-time data-flow computation over large arrays of signals or pixels and are considered as VLSI devices. Some problems in the development of function-oriented processors with homogeneous architecture are discussed in the paper.
ISBN: (print) 9781880843970
Separable 2-D transforms (such as the 2-D Fourier transform) are widely used in fields such as data analysis and image processing. Fast processors and divide-and-conquer algorithms have made these 2-D transforms accessible on desktop computers. Nonetheless, the widespread use of multi-core architectures makes significant efficiency improvements possible. In the past, parallel processing in C++ was restricted to external libraries. The recent release of C++11, however, introduces concurrency constructs into the language itself, with clear benefits for software development, optimization, and portability. This paper examines the high-level concurrency interface in C++11 and demonstrates that significant efficiency gains are achievable for the 2-D Fourier transform on standard multi-core processors. This approach is readily extensible to other separable 2-D transforms such as wavelet transforms and the discrete cosine transform, and is equally applicable to other separable 2-D operations such as convolution and correlation. It promises to scale well to future multi-core processors as additional CPUs become available. Copyright ISCA, CAINE 2014.
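The row-column structure that makes a separable 2-D transform easy to parallelize can be sketched as follows. This is a minimal illustration in Python, not the paper's C++11 code: the function names (`dft_1d`, `dft_2d_separable`, `dft_2d_direct`) and the `workers` parameter are assumptions made for this example, the 1-D transform is a naive O(n^2) DFT rather than an FFT, and Python threads are used only to show how independent rows and columns are dispatched as tasks (they are not expected to give the multi-core speedups the abstract reports).

```python
import cmath
from concurrent.futures import ThreadPoolExecutor


def dft_1d(x):
    """Naive O(n^2) 1-D discrete Fourier transform of a sequence."""
    n = len(x)
    return [
        sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def transpose(m):
    """Swap rows and columns of a list-of-lists matrix."""
    return [list(col) for col in zip(*m)]


def dft_2d_separable(image, workers=4):
    """2-D DFT as 1-D DFTs over rows, then over columns.

    Each row (and later each column) is an independent task, which is
    what makes the transform straightforward to run concurrently.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        rows = list(pool.map(dft_1d, image))            # pass 1: every row
        cols = list(pool.map(dft_1d, transpose(rows)))  # pass 2: every column
    return transpose(cols)


def dft_2d_direct(image):
    """Reference 2-D DFT computed straight from the definition."""
    h, w = len(image), len(image[0])
    return [
        [
            sum(
                image[y][x] * cmath.exp(-2j * cmath.pi * (u * y / h + v * x / w))
                for y in range(h)
                for x in range(w)
            )
            for v in range(w)
        ]
        for u in range(h)
    ]
```

The same two-pass pattern applies to any separable operation: only `dft_1d` would need to be replaced by the corresponding 1-D kernel.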
ISBN: (digital) 9783642551956
ISBN: (print) 9783642551956
Graph/hypergraph partitioning models and methods have been successfully used to minimize the communication among processors in several parallel computing applications. Parallel sparse matrix-vector multiplication (SpMxV) is one of the representative applications that renders these models and methods indispensable in many scientific computing contexts. We investigate the interplay of the partitioning metrics and execution times of SpMxV implementations in three libraries: Trilinos, PETSc, and an in-house one. We carry out experiments with up to 512 processors and investigate the results with regression analysis. Our experiments show that the partitioning metrics greatly influence performance in a distributed-memory setting. The regression analyses demonstrate which metric is the most influential for the execution time of each library.
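The abstract does not list the partitioning metrics it studies. As a hedged illustration of what such a metric looks like, the sketch below computes the total communication volume of a row-wise partitioned SpMxV, which is one commonly used metric and is assumed here for the example. The function name `rowwise_comm_volume`, the coordinate-list input, and the assumption that the matrix is square with `x[j]` owned by the same processor as row `j` are all choices made for this sketch, not details from the paper.

```python
from collections import defaultdict


def rowwise_comm_volume(nonzeros, part):
    """Communication volume of y = A x under a row-wise partition.

    nonzeros -- iterable of (i, j) positions of nonzero entries of A
    part     -- part[i] is the processor owning row i and entry x[i]
                (assumes a square matrix and a conformal vector partition)

    Returns (total, sent), where sent[p] counts the x entries that
    processor p must send to other processors.
    """
    # For each column j, collect the processors that hold a nonzero in it;
    # each of them needs x[j] to compute its rows of y.
    needed_by = defaultdict(set)
    for i, j in nonzeros:
        needed_by[j].add(part[i])

    sent = defaultdict(int)
    for j, parts in needed_by.items():
        owner = part[j]
        # The owner already has x[j]; every other processor receives one copy.
        sent[owner] += len(parts - {owner})

    return sum(sent.values()), dict(sent)
```

For a 4x4 matrix with nonzeros on the diagonal plus positions (0,2), (1,2), and (3,0), split as `part = [0, 0, 1, 1]`, processor 1 sends `x[2]` once and processor 0 sends `x[0]` once, so the total volume is 2.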
ISBN: (print) 9781479973941
In this paper, we investigate multi-user interference (MUI) cancellation schemes for deployment at roadside units in vehicular ad hoc networks. Generally, MUI schemes can be divided into linear and nonlinear groups. In linear MUI schemes, successive and parallel interference cancellation schemes are widely used. In successive interference cancellation (SIC) schemes, the receiver detects the users' data on a per-user basis and immediately cancels the interference of the detected user before the next detection. In contrast, parallel interference cancellation (PIC) schemes detect a group of users' data simultaneously, then cancel the interference of all users in the next round of operation. Both successive and parallel interference cancellation schemes suffer from error propagation. To address this problem, an ordered successive interference cancellation scheme has been proposed to improve the bit error rate (BER) performance: the receiver detects the user with the highest instantaneous signal-to-interference-plus-noise ratio (SINR) and cancels that user's interference before detecting the next one. The process repeats until all users' data are detected. In addition, iterative processing techniques are introduced to further improve the system BER performance. In this paper, we study and compare the interference cancellation schemes for uplink transmission in vehicular ad hoc networks. Simulation results show that the BER performance is much better with ordered successive interference cancellation than with both the SIC and PIC schemes, especially in high signal-to-noise ratio regions where the multi-user interference becomes dominant. The study also shows that the ordering procedure can efficiently avoid the error propagation problem in vehicular networks.
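The ordered SIC loop described above (pick the user with the highest SINR, detect it, subtract its contribution, repeat) can be sketched as follows. This is a simplified illustration under assumptions that are not taken from the paper: real-valued BPSK symbols, a matched-filter detector, a known channel vector per user, and a positive noise variance. The names `ordered_sic`, `channels`, and `noise_var` are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length real vectors."""
    return sum(x * y for x, y in zip(a, b))


def ordered_sic(y, channels, noise_var):
    """Detect BPSK symbols by ordered successive interference cancellation.

    y         -- received vector (list of floats)
    channels  -- channels[k] is the (assumed known) channel vector of user k
    noise_var -- noise variance used in the SINR estimate, assumed > 0
    Returns (symbols, order): detected +/-1 symbol per user, detection order.
    """
    residual = list(y)
    remaining = set(range(len(channels)))
    symbols = [0] * len(channels)
    order = []

    while remaining:
        def sinr(k):
            # Matched-filter SINR of user k against the not-yet-cancelled users.
            h = channels[k]
            signal = dot(h, h) ** 2
            interference = sum(
                dot(h, channels[j]) ** 2 for j in remaining if j != k
            )
            return signal / (interference + noise_var * dot(h, h))

        best = max(remaining, key=sinr)  # strongest user goes first
        symbols[best] = 1 if dot(channels[best], residual) >= 0 else -1
        # Cancel the detected user's contribution before the next detection.
        residual = [
            r - symbols[best] * h for r, h in zip(residual, channels[best])
        ]
        remaining.remove(best)
        order.append(best)

    return symbols, order
```

Plain SIC corresponds to replacing the `max(...)` selection with a fixed user order; the SINR-based ordering is what reduces the chance that an early wrong decision propagates to later users.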
ISBN: (print) 9781577356608
Planners need to become faster as we seek to tackle increasingly complicated problems. Much of the recent improvement in computer speed is due to multi-core processors. For planners to take advantage of these types of architectures, we must adapt algorithms for parallel processing. There are a number of planning domains where state expansions are slow. One example is robot motion planning, where most of the time is devoted to collision checking. In this work, we present PA*SE, a novel, parallel version of A* (and weighted A*) which parallelizes state expansions by taking advantage of this property. While getting close to a linear speedup in the number of cores, we still preserve completeness and optimality of A* (bounded sub-optimality of weighted A*). PA*SE applies to any planning problem in which significant time is spent on generating successor states and computing transition costs. We present experimental results on a robot navigation domain (x, y, heading) which requires expensive 3D collision checking for the PR2 robot. We also provide an in-depth analysis of the algorithm's performance on a 2D navigation problem as we vary the number of cores (up to 32) as well as the time it takes to collision check successors during state expansions.
ISBN: (digital) 9783642551956
ISBN: (print) 9783642551956
In this paper we design a two-level additive Schwarz method for a non-symmetric elliptic problem in two dimensions, discretized by the local discontinuous Galerkin (LDG) method. To construct the preconditioner, we use the domain decomposition method. We also present the results of numerical tests of this preconditioner. The condition number of the preconditioned system does not depend on the fine mesh size h, but only on the ratio of the coarse mesh size H to the overlap measure delta.
ISBN: (print) 9781479949236
Image resizing algorithms are a classic case of algorithms involving local operations over a region of pixels in an image. The objective is to produce a reduced or enlarged image while maintaining the original information content or minimizing the mean square error between corresponding pixels of the original and resized images. Most resizing algorithms rely on pixel values within a pre-defined neighborhood of a pixel in the original image to compute pixel values in the target image. High-frequency or high-energy pixel regions in an image are more prone to distortions and errors in the resized image. Content-aware algorithms minimize this impact at the cost of higher computational complexity. Parallel/distributed implementations of such algorithms require an efficient methodology of image data partitioning to minimize the interdependency of the processing units and/or memory storage and so avoid the shared-memory access bottleneck. A restricted shared memory model is described herein that is well tailored to most of the computational techniques used in image resizing algorithms. Implementation results are described for some well-known algorithms that demonstrate the suitability of the model and its scalability to large image sizes.
ISBN: (digital) 9783642551956
ISBN: (print) 9783642551956
Finding similarities between protein structures is a crucial task in molecular biology. Many tools exist for finding an optimal alignment between two proteins. These tools, however, only find one alignment even when multiple similar regions exist. We propose a new parallel heuristic-based approach to structural similarity detection between proteins that discovers multiple pairs of similar regions. We prove that returned alignments have RMSDc and RMSDd lower than a given threshold. Computational complexity is addressed by taking advantage of both fine- and coarse-grain parallelism.
ISBN: (digital) 9783642551956
ISBN: (print) 9783642551956
Since magnetic materials are often composed of magnetically isolated chains, their magnetic properties can be described by the one-dimensional quantum Heisenberg model. The quantum transfer matrix (QTM) method based on a checkerboard structure has been applied to quantum alternating spin chains. To increase the length of the transfer matrix in the Trotter direction, we apply the density-matrix renormalization technique and check the efficiency of parallelization for one part of the code: the construction of the transfer matrix. Next, using the Matrix Product State representation, the time evolution of the ground-state magnetization is computed after a sudden change in the applied field.
ISBN: (print) 9783642552243
One of the key unresolved issues of image processing is the lack of methods for searching for images similar to a reference image. This paper focuses on the objects present in images and presents a method to compare the objects and search for images that contain objects belonging to the same classes. Because local keypoints of images provide a very good basis for further image processing, we use them for object comparison. More precisely, the comparison of images is based on histograms that are generated from the keypoints of the objects contained in the images. We present results of experiments conducted for various classes of objects and histograms generated using the proposed method.