ISBN: 9789604741670 (print)
In this paper, we extend the NOD sorting algorithm, which is implemented on the Diamond architecture. This algorithm, which we name ENOD (Extended Neighborhood sort On Diamond), sorts n data elements with 7/4n processors. However, most popular environments provide little explicit support for parallelism, leading to the common view that "concurrency is hard". Architectures and sorting algorithms already exist, but we always endeavor to find new and better ones. The algorithm on the Diamond architecture sorts data elements using 7/4n processors with a running time of O(log2 n). This architecture is heterogeneous and uses more cheap processors than expensive ones. Thus, this architecture with the proposed algorithm makes a tradeoff between the number of processors and their cost. The ENOD algorithm is simpler and more intuitive than the available algorithms.
ISBN: 9783642131189 (print)
With the rapid development of Bluetooth networks and wireless communications, mobile devices share information with each other more easily than ever before. However, this convenient communication technology brings privacy and security issues. Currently, Bluetooth adopts peer-to-peer and Frequency Hopping Spread Spectrum (FHSS) mechanisms to avoid data disclosure, but a malicious attacker can collect the data transmitted by a relay station over a long period of time and then break into the system. In this study, we treat a Piconet as a cube and transform a Scatternet into a cluster (N-cube) structure. Subsequently, this study exploits the Elliptic Curve Diffie-Hellman (ECDH) [1] and Conference Key (CK) schemes to perform session key agreement and secure data transmission. The proposed scheme needs only a small 160-bit key to achieve a security level comparable to 1024-bit Diffie-Hellman (DH) [2], and each node uses little CPU, memory and bandwidth to complete security operations. As a result, the proposed fault-tolerant routing algorithm with secure data transmission performs rapidly and efficiently, and is well suited to Bluetooth networks with limited resources.
ISBN: 9783642143892 (print)
The contribution deals with the development of a 3-D finite-element package called GEM and its aspirations in demanding mathematical modelling and simulations arising in geosciences. Against the background of two complex applications from currently running projects, formulated as linear elasticity and thermo-elasticity problems, the most important characteristics, especially those of the solvers, are presented. Features related to high performance computing, including parallel processing, are emphasized.
This work presents a new method to numerically calculate the signal to jitter distortion ratio (SDjR) for any modulated input which is bandpass sampled in the presence of jitter. The numerical method matches the der...
ISBN: 9783642143892 (print)
With the advent of multi-core processors, the problem of designing applications that can efficiently utilize their performance becomes more and more important. Moreover, developing programs for these processors requires from programmers some additional, specific knowledge about the processor architecture. In multi-core systems, efficient program execution is the main issue. Switching from sequential to parallel computation can even lead to a decrease in performance. The paper gives a short description of SliCer, a hardware-independent tool that automatically parallelizes serial programs according to the number of available processing units by creating the proper number of threads, which can later be executed in parallel.
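The idea of sizing the thread count to the number of available processing units can be sketched as follows. This is only an illustration, not SliCer itself: the function name `parallel_sum_of_squares` and the manual chunking are assumptions, and the work is split by hand rather than derived automatically from a serial program.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_sum_of_squares(values, workers=None):
    """Sum v*v over `values`, using one chunk of work per worker thread."""
    # Default to the number of processing units reported by the OS.
    workers = workers or os.cpu_count() or 1
    # Ceiling division so that every element lands in exactly one chunk.
    chunk = max(1, -(-len(values) // workers))
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each thread reduces its own chunk; the partial sums are combined here.
        partials = pool.map(lambda part: sum(v * v for v in part), chunks)
        return sum(partials)


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10)), workers=3))
```

In CPython, the global interpreter lock means CPU-bound threads like these typically bring no speedup, and the thread overhead can make the parallel version slower than the serial loop, which is one concrete instance of the performance decrease mentioned in the abstract.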
ISBN: 9783642143892 (print)
The paper describes results of a minimax tree searching algorithm implemented on the CUDA platform. The problem concerns move choice strategy in the game of Reversi. The parallelization scheme and performance aspects are discussed, focusing mainly on the warp divergence problem and data transfer size. Moreover, a method of minimizing warp divergence and performance degradation is described. The paper contains results of tests performed on both multiple CPUs and GPUs. Additionally, it discusses a parallel alpha-beta pruning implementation.
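For orientation, minimax search with alpha-beta pruning can be sketched sequentially as below. This is a minimal CPU sketch under stated assumptions, not the paper's CUDA implementation: the game tree is a hypothetical nested list whose leaves are precomputed scores, whereas a Reversi engine would generate moves and evaluate board positions.

```python
import math


def minimax(node, maximizing):
    """Plain minimax without pruning, used as a reference."""
    if not isinstance(node, list):          # leaf: static evaluation score
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of `node`, skipping branches that cannot change it."""
    if not isinstance(node, list):          # leaf: static evaluation score
        return node
    if maximizing:
        best = -math.inf
        for child in node:
            best = max(best, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, best)
            if alpha >= beta:               # minimizer already has a better option
                break
        return best
    best = math.inf
    for child in node:
        best = min(best, alphabeta(child, True, alpha, beta))
        beta = min(beta, best)
        if alpha >= beta:                   # maximizer already has a better option
            break
    return best


if __name__ == "__main__":
    tree = [[3, 5], [2, 9], [0, 7]]         # hypothetical two-ply game tree
    print(alphabeta(tree, True), minimax(tree, True))
```

The early `break` makes the amount of work depend on the data in each subtree. On a GPU, threads of one warp that search different subtrees would then follow different control paths, which is the kind of warp divergence the abstract refers to.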
Efficient utilization of the inherent parallelism of multi-core architectures is a grand challenge in the field of electronic design automation (EDA). One EDA algorithm associated with a high computational cost is automatic test pattern generation (ATPG). We present the ATPG tool TIGUAN based on a thread-parallel SAT solver. Due to a tight integration of the SAT engine into the ATPG algorithm and a carefully chosen mix of various optimization techniques, multi-million-gate industrial circuits are handled without aborts. TIGUAN supports both conventional single-stuck-at faults and sophisticated conditional multiple stuck-at faults, which allows patterns to be generated for non-standard fault models. We demonstrate how TIGUAN can be combined with conventional structural ATPG to extract the full benefit of the intrinsic strengths of both approaches.
ISBN: 9783642131189 (print)
Finite-Difference Time-Domain (FDTD) has proven to be a very useful computational electromagnetic algorithm. However, implementations on traditional general purpose processors can be computationally prohibitive and require thousands of CPU hours, which hinders the large-scale application of FDTD. With rapid progress in GPU hardware capability and programmability, we propose in this paper a novel scheme in which a GPU is applied to accelerate three-dimensional FDTD with UPML absorbing boundary conditions. This GPU-based scheme can reduce the computation time significantly while obtaining high accuracy compared with the CPU-based scheme. With only one AMD ATI HD4850 GPU, when the computational domain is up to (180x80x180), our implementation of the GPU-based FDTD performs approximately 93 times faster than the one running on an Intel E2180 dual-core CPU.
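The core of FDTD is a leapfrog update in which the electric and magnetic fields are advanced alternately from each other's spatial differences. The sketch below is a deliberately reduced illustration, assuming a one-dimensional vacuum grid in normalized units with reflecting ends and a soft Gaussian source. It is not the three-dimensional UPML scheme of the paper, and the function name and parameters are hypothetical.

```python
import math


def fdtd_1d(n_cells=200, n_steps=100, source_pos=100, courant=0.5):
    """Leapfrog update of Ez and Hy on a 1-D Yee grid (normalized units)."""
    ez = [0.0] * n_cells   # electric field at integer grid points
    hy = [0.0] * n_cells   # magnetic field at half-integer grid points
    for t in range(n_steps):
        # Advance H from the spatial difference of E.
        for k in range(n_cells - 1):
            hy[k] += courant * (ez[k + 1] - ez[k])
        # Advance E from the spatial difference of H; both end cells stay
        # at zero, so the grid ends act as reflecting walls.
        for k in range(1, n_cells - 1):
            ez[k] += courant * (hy[k] - hy[k - 1])
        # Soft source: add a Gaussian pulse in time at a single cell.
        ez[source_pos] += math.exp(-((t - 30) / 10.0) ** 2)
    return ez


if __name__ == "__main__":
    field = fdtd_1d()
    print(max(abs(v) for v in field))
```

Within one half-step, every cell is updated independently of its neighbours' new values, so each of the two inner loops can be mapped to one thread per cell. That data-parallel structure is what makes the method a natural fit for GPU acceleration.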
ISBN: 9783642143892 (print)
The eigenvalues and eigenvectors of a symmetric matrix are of interest in a myriad of applications. One of the fastest and most accurate numerical techniques for the eigendecomposition is the Algorithm of Multiple Relatively Robust Representations (MRRR), the first stable algorithm that computes the eigenvalues and eigenvectors of a tridiagonal symmetric matrix in O(n^2) arithmetic operations. In this paper we present a parallelization of the MRRR algorithm for data-parallel coprocessors using the CUDA programming environment. The results demonstrate the potential of data-parallel coprocessors for scientific computations: compared to routine sstemr, LAPACK's implementation of MRRR, our parallel algorithm provides 10-fold speedups.
This paper describes a Field Programmable Gate Array hardware based Deep Packet Inspection Engine that uses regular expression matchers to simultaneously categorize and look for malicious signatures in Ethernet packet...