ISBN: 0780397371 (print)
To address the dimension limit of the serial feature fusion method and the quantity limit of the parallel complex vector feature fusion method, an evolution of parallel vector feature fusion method based on quater
ISBN: 0769526365 (print)
This paper discusses fast parallel algorithms for evaluating several centrality indices frequently used in complex network analysis. These algorithms have been optimized to exploit properties typically observed in real-world large-scale networks, such as low average distance, high local density, and heavy-tailed power-law degree distributions. We test our implementations on real datasets such as the web graph, protein-interaction networks, movie-actor networks and citation networks, and report impressive parallel performance for the evaluation of the computationally intensive centrality metrics (betweenness and closeness centrality) on high-end shared-memory symmetric multiprocessor and multithreaded architectures. To our knowledge, these are the first parallel implementations of these widely used social network analysis metrics. We demonstrate that it is possible to rigorously analyze networks three orders of magnitude larger than the instances that can be handled by existing social network analysis (SNA) software packages. For instance, we compute the exact betweenness centrality value for each vertex in a large US patent citation network (3 million patents, 16 million citations) in 42 minutes on 16 processors, utilizing 20 GB of RAM on the IBM p5 570. Current SNA packages, on the other hand, cannot handle graphs with more than a hundred thousand edges.
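For readers unfamiliar with the metric, the following is a minimal sequential sketch of exact betweenness centrality on an unweighted graph, using the well-known Brandes accumulation scheme. It is not the parallel implementation described in the abstract; the function name and the adjacency-dictionary input are assumptions made for illustration. The outer loop over source vertices is independent per source, which is one natural place to introduce coarse-grained parallelism.

```python
from collections import deque


def betweenness_centrality(adj):
    """Exact betweenness for an unweighted graph given as {vertex: [neighbors]}.

    Scores count ordered (source, target) pairs, so halve them for an
    undirected graph if unordered pairs are wanted.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma)
        # and recording shortest-path predecessors.
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        preds = {v: [] for v in adj}
        sigma[s], dist[s] = 1, 0
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate pair dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc
```

On a three-vertex path a-b-c stored with both edge directions, only the middle vertex b receives a nonzero score.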
ISBN: 9780780397361 (print)
The wavelet packet transform provides an accurate method for image fusion. However, its computation and time costs increase with the size of the image, which makes real-time fusion difficult to achieve. After analysing the wavelet packet based image fusion algorithm on a single computer, and taking into account the time complexity and the parallel character of the wavelet packet algorithm, we propose a wavelet packet based parallel image fusion algorithm for distributed storage. Exploiting the data locality of the wavelet packet transform, we implement the algorithm on an MPI (Message Passing Interface) parallel computing platform composed of Pentium PCs and 1000 Mbps Ethernet. The efficiency of the parallel computation is studied for different image sizes and different cluster sizes. The experimental results show that the algorithm parallelizes well.
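The idea of fusing in a transform domain and distributing independent pieces of the image can be illustrated with a much simpler stand-in. The sketch below uses a one-level, one-dimensional Haar transform per image row instead of a wavelet packet transform, a thread pool instead of MPI ranks, and an assumed fusion rule (average the approximation coefficients, keep the detail coefficient with the larger magnitude). None of these choices come from the abstract; all function names are hypothetical, and rows are assumed to have even length.

```python
from concurrent.futures import ThreadPoolExecutor


def haar_forward(row):
    """One-level Haar transform of an even-length row."""
    half = len(row) // 2
    approx = [(row[2 * i] + row[2 * i + 1]) / 2 for i in range(half)]
    detail = [(row[2 * i] - row[2 * i + 1]) / 2 for i in range(half)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert haar_forward."""
    row = []
    for a, d in zip(approx, detail):
        row += [a + d, a - d]
    return row


def fuse_rows(pair):
    """Fuse one row from each image in the Haar domain."""
    row_a, row_b = pair
    approx_a, detail_a = haar_forward(row_a)
    approx_b, detail_b = haar_forward(row_b)
    # Assumed rule: average approximations, keep the stronger detail.
    approx = [(p + q) / 2 for p, q in zip(approx_a, approx_b)]
    detail = [p if abs(p) >= abs(q) else q for p, q in zip(detail_a, detail_b)]
    return haar_inverse(approx, detail)


def fuse_images(img_a, img_b, workers=2):
    """Fuse two equally sized images (lists of rows), one task per row."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fuse_rows, zip(img_a, img_b)))
```

Because each row is processed independently, the work partitions cleanly; in a distributed-memory setting the same partitioning would correspond to assigning blocks of rows to different processes.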
ISBN: 0780397371 (print)
This paper introduces the design of commonly used floating-point (FP) arithmetic units based on FPGA, including the conversion between FP data and fixed-point data, FP addition, subtraction, multiplication and *** of them are pipeline architectures and specified in VHDL, are fully synthesizable with performance comparable to other available high speed *** emphasis is put on the application of FP data in *** an example, the FP operation modules are used in quadrature sampling of an Intermediate Frequency (IF) signal, to show that a much higher performance can be obtained.
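The conversion between floating-point and fixed-point data mentioned above can be sketched in software as follows. This is only an illustration of the arithmetic, not the VHDL design from the paper; the signed 16-bit format with 8 fractional bits, the saturation behaviour and the function names are all assumptions.

```python
def float_to_fixed(x, frac_bits=8, total_bits=16):
    """Quantize x to a signed fixed-point integer, saturating on overflow."""
    scaled = round(x * (1 << frac_bits))      # scale by 2**frac_bits
    lo = -(1 << (total_bits - 1))             # most negative code
    hi = (1 << (total_bits - 1)) - 1          # most positive code
    return max(lo, min(hi, scaled))


def fixed_to_float(q, frac_bits=8):
    """Convert a signed fixed-point integer back to a float."""
    return q / (1 << frac_bits)
```

With these defaults, 1.5 maps to the integer 384 and converts back to 1.5 exactly, while values outside the representable range are clamped rather than wrapped.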
ISBN: 9780780397361 (print)
This paper presents an improved word-level sequential scheme and a parallel architecture for the bit-plane coding of EBCOT used in JPEG 2000. The bit-plane coding adopted by EBCOT is divided into two stages, coding pass prediction and context formation, which work in parallel and are pipelined. Word-level sequential bit-plane coding is achieved by modelling the coefficient bits of different bit planes concurrently and by completing all three coding passes of each bit plane in one scan. Experimental results demonstrate that the proposed architecture efficiently reduces hardware complexity compared with up-to-date designs.
ISBN: 9780780397361 (print)
Combining existing MIMO schemes with multiuser detectors for the uplink suffers from high computational complexity and from channel coherency. In this paper we therefore propose a MIMO multiuser detection (MUD) scheme that considerably reduces the computational complexity of the system. The proposed algorithm uses the inverse channel matrix for MIMO decoding, which is not sensitive to the coherency of the channels. Because of the scattering characteristic of the MIMO channel, the inverse channel matrices are always nonsingular, which ensures that the receivers obtain a stable spatial diversity gain. The MUD algorithms can be realized using a parallel modular architecture. MUD is based on a Minimum Mean Square Error (MMSE) algorithm. Simulation results show that our MIMO-MUD performs much better than existing MIMO-MUD schemes of the same order of complexity, even though the MIMO CDMA system has only two antennas at each BS and two antennas at each mobile station.
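As a reminder of what a linear MMSE detector computes, the sketch below evaluates the textbook estimate (H^H H + noise_var * I)^(-1) H^H y for a 2x2 channel, matching the two-antenna configuration mentioned above. It is a generic illustration, not the proposed MIMO-MUD scheme; unit-power transmitted symbols are assumed and the function name is hypothetical.

```python
def mmse_detect_2x2(H, y, noise_var):
    """Linear MMSE estimate of the two transmitted symbols.

    H is a 2x2 channel matrix (list of rows), y the received vector,
    noise_var the noise variance; symbols are assumed to have unit power.
    """
    (a, b), (c, d) = [[complex(v) for v in row] for row in H]
    y1, y2 = complex(y[0]), complex(y[1])
    # Hermitian transpose of H.
    ah, bh, ch, dh = a.conjugate(), b.conjugate(), c.conjugate(), d.conjugate()
    # Regularized Gram matrix G = H^H H + noise_var * I.
    g11 = ah * a + ch * c + noise_var
    g12 = ah * b + ch * d
    g21 = bh * a + dh * c
    g22 = bh * b + dh * d + noise_var
    # Matched-filter output z = H^H y.
    z1 = ah * y1 + ch * y2
    z2 = bh * y1 + dh * y2
    # Solve G x = z with the closed-form 2x2 inverse.
    det = g11 * g22 - g12 * g21
    return [(g22 * z1 - g12 * z2) / det, (g11 * z2 - g21 * z1) / det]
```

With `noise_var` set to zero and an invertible channel, the estimate reduces to applying the inverse channel matrix, so the transmitted symbols are recovered up to rounding error.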
ISBN: 9780889866386 (print)
Methods for an efficient mapping of algorithms to parallel architectures are of utmost importance because many state-of-the-art embedded digital systems deploy parallelism to increase their computational power. This paper deals with the mapping of loop programs onto processor arrays implemented in an FPGA or available as (reconfigurable) coarse-grained processor architectures. Most existing work is closely related to approaches from the DSP domain and is not able to exploit the full parallelism of a given algorithm and the computational potential of a typical 2-dimensional array. In contrast, we present a mapping methodology which incorporates many important parameters of the target architecture in one approach. These are: the number of processing elements, the resources of the data path and memory within a processing element, and the interconnection within the processor array. Based on these parameters, we formulate an optimization problem whose solution specifies an efficient mapping of an algorithm to the target architecture. We can optimize for the speed of the algorithm and/or the hardware cost caused by the communication and computation resources of the architecture.
ISBN: 9781424403431 (print)
A language for semi-structured documents, XML has emerged as the core of the web services architecture, and is playing crucial roles in messaging systems, databases, and document processing. However, the processing of XML documents has a reputation for poor performance, and a number of optimizations have been developed to address this performance problem from different perspectives, none of which has been entirely satisfactory. In this paper, we present a seemingly quixotic, but novel approach: parallel XML parsing. Parallel XML parsing leverages the growing prevalence of multicore architectures in all sectors of the computer market, and yields significant performance improvements. This paper presents our design and implementation of parallel XML parsing. Our design consists of an initial preparsing phase to determine the structure of the XML document, followed by a full, parallel parse. The results of the preparsing phase are used to help partition the XML document for data-parallel processing. Our parallel parsing phase is a modification of the libxml2 [1] XML parser, which shows that our approach applies to real-world, production-quality parsers. Our empirical study shows that our parallel XML parsing algorithm can improve XML parsing performance significantly and scales well.
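The two-phase structure described above (preparse to find partition points, then parse the partitions concurrently) can be sketched with the Python standard library. This is not the libxml2-based implementation from the paper: here the preparse is done with expat, which is itself a full parse, so the sketch shows the structure of the approach rather than its speedup. It also assumes UTF-8 input, no namespace prefixes, and no non-whitespace text directly under the root element; both function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor
from xml.parsers import expat


def preparse(data):
    """Return (start, end) byte ranges of the root element's direct children."""
    starts, depth, root_end = [], 0, None
    parser = expat.ParserCreate()

    def on_start(name, attrs):
        nonlocal depth
        if depth == 1:
            # Byte offset of the '<' that opens this child element.
            starts.append(parser.CurrentByteIndex)
        depth += 1

    def on_end(name):
        nonlocal depth, root_end
        depth -= 1
        if depth == 0:
            # Byte offset of the root's closing tag.
            root_end = parser.CurrentByteIndex

    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.Parse(data, True)
    bounds = starts + [root_end]
    return list(zip(bounds[:-1], bounds[1:]))


def parallel_parse(data, workers=4):
    """Parse each top-level child of the document in a separate task."""
    ranges = preparse(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda r: ET.fromstring(data[r[0]:r[1]]), ranges))
```

In CPython, threads will not speed up CPU-bound parsing because of the global interpreter lock; a process pool or a native parser would be needed for real gains, which is consistent with the paper building on a C library.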
ISBN: 0780395840 (print)
Based on Horner's rule and the Baugh-Wooley algorithm, this paper presents two novel bit-level parallel array algorithms for 2's complement multiplication, which are mapped to systolic arrays using linear mapping techniques. We propose two efficient systolic arrays for the multiply and accumulate (MAC) operation, and we also describe how vector-vector and matrix-matrix multiplication can be implemented efficiently using the MAC arrays. The two systolic arrays have high performance (low time complexity, space complexity and latency) and consume less gate area than other architectures. They are suitable for VLSI implementation because of their regularity and modularity.
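The Baugh-Wooley idea can be checked in software with a short bit-level model. The sketch below follows the commonly taught modified Baugh-Wooley formulation, in which partial products involving exactly one sign bit are complemented and two correction constants are added so that every term is simply summed. It models only the arithmetic, not the systolic arrays or the Horner's rule formulation from the paper; the function names are hypothetical.

```python
def bits(x, n):
    """Two's complement bits of x, least significant first."""
    return [(x >> i) & 1 for i in range(n)]


def baugh_wooley_multiply(a, b, n=8):
    """Multiply two n-bit two's complement integers by summing partial products."""
    x, y = bits(a, n), bits(b, n)
    total = 0
    for i in range(n):
        for j in range(n):
            p = x[i] & y[j]
            # A partial product with exactly one sign bit has negative weight;
            # complementing it lets the array add it like any other term.
            if (i == n - 1) != (j == n - 1):
                p ^= 1
            total += p << (i + j)
    # Correction constants that compensate for the complemented terms.
    total += (1 << n) + (1 << (2 * n - 1))
    total &= (1 << (2 * n)) - 1               # keep the 2n result bits
    # Reinterpret the 2n-bit pattern as a signed integer.
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total
```

For example, `baugh_wooley_multiply(-3, 5, n=4)` returns -15. Because every partial product is added with positive weight, the structure is regular, which is the property that makes the scheme attractive for array implementations.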
An efficient parallel processing method for deblocking filter design in the H.264 video coding standard is presented in this paper. In order to reduce the memory reference and make the intermediate data reused as soon as ...