ISBN: 9781479904945; 9781479904938 (print)
Complex networks are a technique for modeling and analyzing large data sets in many scientific and engineering disciplines. Because of their size, conventional algorithms and single-core processors struggle with the efficient processing of such networks. Multi-core graphics processing units (GPUs) could provide sufficient processing power for analyzing them. However, commonly designed algorithms cannot exploit this massively parallel processing power. In this paper, we present the Multi Layer Network Decomposition (MLND) approach, a general approach to parallel network analysis on multi-core processors through efficient partitioning and mapping of networks onto GPU architectures. Evaluation on a 336-core GPU graphics card demonstrated a 16x speed-up in complex network analysis relative to a CPU-based approach.
Many hardware-efficient algorithms exist for hardware signal processing architectures. Among these algorithms is a set of shift-add algorithms collectively known as CORDIC (COordinate Rotation DIgital Computer) for...
ISBN: 9780769548982; 9781467345668 (print)
Duplication has proved to be a vital technique for scheduling task graphs on a network of unrelated parallel machines. Few attempts have been made to model duplication in a Mixed Integer Linear Program (MILP) to reduce schedule length. Other known optimal MILPs duplicate a job on all the available processing elements, which increases their complexity. This paper proposes a new REStricted Duplication (RESDMILP) approach to model duplication in a MILP. The complexity of this model increases with the amount of duplication. Experiments show that RESDMILP achieves better runtimes when the problem instance is solved optimally, and provides a better lower bound and percentage gap when it is run for a fixed amount of time. The percentage gap is defined as (UB - LB)/UB, where UB and LB are the upper and lower bounds achieved by the MILPs, respectively.
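The percentage gap defined in the abstract above can be computed directly. The following is a minimal sketch; the function name and the sample bounds are hypothetical and are not taken from the paper.

```python
def percentage_gap(ub: float, lb: float) -> float:
    """Return the gap (UB - LB) / UB between an upper and a lower bound."""
    if ub <= 0:
        # Assumption: schedule lengths are positive, so UB must be positive.
        raise ValueError("upper bound must be positive")
    return (ub - lb) / ub


# Hypothetical bounds, for illustration only.
gap = percentage_gap(ub=120.0, lb=90.0)
print(f"gap = {gap:.2%}")
```

A smaller gap means the solver's best schedule found so far (UB) is closer to the proven lower bound (LB).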
ISBN: 9780769532875 (print)
A novel fast scheme for the Discrete Wavelet Transform (DWT) was recently introduced under the name of the lifting scheme [4, 10]. This new scheme has many advantages over the convolution-based approach [10, 11]; for instance, it is well suited to parallelization. In this paper we present two new FPGA-based parallel implementations of the lifting-based DWT. The first implementation uses pipelining, parallel processing and data reuse to increase the speed-up of the algorithm. In the second architecture, a controller dynamically deploys a suitable number of clones according to the hardware resources available in the targeted environment. Both architectures are capable of processing large incoming images or multi-frame images in real time. Simulations run on a Xilinx Virtex-5 FPGA environment have demonstrated the practical efficiency of our contribution: the first architecture achieved an operating frequency of 289 MHz, and the second demonstrated the controller's ability to determine the resources actually available for successfully deploying independent clones on a targeted FPGA environment and processing the task in parallel.
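To illustrate the split, predict and update steps that make up a lifting-based DWT, here is a minimal software sketch. It assumes the integer Haar wavelet, chosen only for brevity; the abstract does not say which wavelet filter the FPGA designs implement, and the function names are hypothetical.

```python
def haar_lifting_forward(signal):
    """One level of the integer Haar transform via lifting.

    Returns (approx, detail). The input length must be even.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    even = signal[0::2]  # split: even-indexed samples
    odd = signal[1::2]   # split: odd-indexed samples
    # Predict: each odd sample is predicted by its even neighbour.
    detail = [o - e for e, o in zip(even, odd)]
    # Update: adjust the even samples so they carry the pair average.
    approx = [e + d // 2 for e, d in zip(even, detail)]
    return approx, detail


def haar_lifting_inverse(approx, detail):
    """Undo haar_lifting_forward by running the steps in reverse."""
    even = [s - d // 2 for s, d in zip(approx, detail)]
    odd = [d + e for e, d in zip(even, detail)]
    signal = []
    for e, o in zip(even, odd):
        signal.extend((e, o))
    return signal


samples = [5, 7, 3, 1, 8, 8]
approx, detail = haar_lifting_forward(samples)
restored = haar_lifting_inverse(approx, detail)
```

Each output pair depends only on one input pair, so the pairs can be processed independently. That independence is the property that makes lifting attractive for pipelined and parallel hardware.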
Intrusion detection systems need to be both accurate and fast. Speed is important especially when operating at the network level. Additionally, many intrusion detection systems rely on signature based detection approa...
ISBN: 3540663630 (print)
One of the problems with using parallel paradigms to program parallel architectures is choosing the paradigm best suited to the characteristics of the program to be developed or parallelized and to the target architecture, in terms of the performance of the parallel implementation. Another problem, arising with the parallelization of legacy codes, is minimizing the effort needed for program comprehension, and thus achieving the minimum restructuring of the sequential code when producing the parallel version. In this paper we address these issues for the Divide and Conquer class of algorithms/programs.
In this paper we present a novel approach for designing highly reliable and optimal fault-tolerant systolic array architectures. In our approach, fault-tolerant algorithms are designed by introducing redundant com...
ISBN: 9781728136134 (print)
Work-efficient task-parallel algorithms enforce ordering between tasks using queuing primitives. Such algorithms offer limited parallelism because queuing constraints cause data movement and synchronization bottlenecks. Speculatively relaxing the order of tasks across cores using the Galois framework shows promise, as false dependencies generated by strict queuing constraints are mitigated to unlock task parallelism. However, relaxed ordering results in redundant work, for which Galois relies on static measures to improve work-efficiency. This paper proposes a dynamic multi-level parent-child task dependency checking mechanism in Galois that prunes redundant work by exploiting monotonic properties of shared data values. Evaluation on a 40-core Intel Xeon multicore shows an average 2x performance improvement over state-of-the-art ordered and relaxed-order graph algorithms.
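The idea of pruning redundant work through a monotonic property can be sketched with a sequential, single-level toy example. This is not the Galois API and not the paper's multi-level mechanism. It assumes a shortest-path workload with non-negative edge weights, where tentative distances only ever decrease, and every name in it is hypothetical.

```python
from collections import deque


def relaxed_sssp(adj, source):
    """Unordered worklist shortest paths with a parent-child staleness check.

    adj maps each vertex to a list of (neighbour, weight) pairs.
    Returns (dist, executed, pruned).
    """
    dist = {v: float("inf") for v in adj}
    dist[source] = 0
    # A task records the parent's distance at the time the task was created.
    tasks = deque((source, 0, nbr, w) for nbr, w in adj[source])
    executed = pruned = 0
    while tasks:
        parent, parent_dist, child, candidate = tasks.popleft()
        # Distances decrease monotonically. If the parent has improved since
        # this task was created, a newer task from the same parent supersedes
        # it. If the child already has a value at least as good, the task
        # cannot help either.
        if dist[parent] < parent_dist or candidate >= dist[child]:
            pruned += 1
            continue
        dist[child] = candidate
        executed += 1
        for nbr, w in adj[child]:
            tasks.append((child, candidate, nbr, candidate + w))
    return dist, executed, pruned


# Hypothetical graph: vertex "p" is first reached by a long path, then a short one.
graph = {
    "s": [("a", 1), ("b", 1)],
    "a": [("p", 10)],
    "b": [("p", 1)],
    "p": [("c", 1)],
    "c": [],
}
dist, executed, pruned = relaxed_sssp(graph, "s")
```

In this graph the task created when "p" had distance 11 is discarded by the parent check once "p" improves to 2, so "c" is never assigned the stale distance 12.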
ISBN: 9783319654829; 9783319654812 (print)
Processing large scale-free graphs on parallel architectures offers high parallelization opportunities but incurs substantial overhead. Because of the skewed degree distribution, each thread receives a different amount of computational workload. In this paper we present a method that addresses this challenge by modifying the CSR data structure and redistributing work across threads. The method was implemented in breadth-first search and single-source shortest path algorithms for the GPU architecture.
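The load imbalance described above, and one generic way to even it out, can be sketched on the CPU as follows. The abstract does not specify how the CSR structure is modified, so this example swaps in a standard edge-balanced split of the frontier using prefix sums over vertex degrees. The workers are simulated sequentially, and all function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def build_csr(num_vertices, edges):
    """Build CSR arrays (row_offsets, col_indices) from directed edge pairs."""
    degree = [0] * num_vertices
    for src, _ in edges:
        degree[src] += 1
    row_offsets = [0] + list(accumulate(degree))
    col_indices = [0] * len(edges)
    cursor = row_offsets[:-1]  # next free slot per vertex (slice is a copy)
    for src, dst in edges:
        col_indices[cursor[src]] = dst
        cursor[src] += 1
    return row_offsets, col_indices


def edge_balanced_chunks(row_offsets, frontier, num_workers):
    """Split the frontier's edges into near-equal chunks, one per worker.

    Each chunk is a list of (vertex, start, end) slices into col_indices,
    so a high-degree vertex may be shared by several workers.
    """
    degrees = [row_offsets[v + 1] - row_offsets[v] for v in frontier]
    prefix = [0] + list(accumulate(degrees))
    total = prefix[-1]
    chunks = []
    for w in range(num_workers):
        lo = total * w // num_workers
        hi = total * (w + 1) // num_workers
        pieces = []
        i = bisect_right(prefix, lo) - 1  # frontier entry owning edge lo
        pos = lo
        while pos < hi:
            v = frontier[i]
            start = row_offsets[v] + (pos - prefix[i])
            end = min(row_offsets[v + 1], start + (hi - pos))
            if end > start:
                pieces.append((v, start, end))
            pos += end - start
            i += 1
        chunks.append(pieces)
    return chunks


def bfs_levels(row_offsets, col_indices, source, num_workers=4):
    """Level-synchronous BFS; workers are simulated one after another."""
    level = [-1] * (len(row_offsets) - 1)
    level[source] = 0
    frontier, depth = [source], 0
    while frontier:
        depth += 1
        next_frontier = []
        for chunk in edge_balanced_chunks(row_offsets, frontier, num_workers):
            for _, start, end in chunk:
                for dst in col_indices[start:end]:
                    if level[dst] == -1:
                        level[dst] = depth
                        next_frontier.append(dst)
        frontier = next_frontier
    return level


# Hypothetical star graph: hub 0 has eight neighbours, and vertex 1 has one.
edges = [(0, i) for i in range(1, 9)] + [(1, 9)]
row_offsets, col_indices = build_csr(10, edges)
levels = bfs_levels(row_offsets, col_indices, source=0)
```

With one vertex per thread, the thread that owns the hub would process all eight edges while the others idle. The edge-balanced split gives each of four workers two of the hub's edges instead.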
Simultaneous Localization and Mapping plays an integral role in the field of autonomous robotics for ensuring proper navigation and exploration in unknown environments. Applications of SLAM range from surveillance and...