Traditionally, the block-based medial axis transform (BB-MAT) and the chessboard distance transform (CDT) have been viewed as two completely different image computation problems, especially in three-dimensional (3D) space. In fact, the two share some equivalent properties, and this paper is the first to derive and prove the relationship between them. One significant property is that the CDT of a 3D binary image V is equal to the BB-MAT of the image V', where V' denotes the inverse image of V. In a parallel algorithm, the cost is defined as the product of the time complexity and the number of processors used. The main contribution of this work is to reduce the costs of the 3D BB-MAT and 3D CDT algorithms proposed by Wang [65]. Based on the reverse-dominance technique, which is redefined from the dominance concept, we compute the 3D CDT by first running the 3D BB-MAT algorithm. For a 3D binary image of size $N^3$, our parallel algorithm solves both the 3D BB-MAT and 3D CDT problems in $O(\log N)$ time using $N^3$ processors on the concurrent read exclusive write (CREW) parallel random access machine (PRAM) model. These costs are lower than those of Wang's algorithms and, to the best of our knowledge, are the lowest known for 3D BB-MAT and 3D CDT algorithms. In parallel algorithms, the running time can be divided into computation time and communication time. The running, communication, and computation times for different problem sizes were measured on an HP Superdome with SMP/CC-NUMA (symmetric multiprocessor/cache coherent non-uniform memory access) architecture. We conclude that a parallel computer (i.e., an SMP/CC-NUMA architecture or a cluster system) is more suitable for solving problems with large inputs. (C) 2010 Elsevier B.V. All rights reserved.
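For readers unfamiliar with the chessboard distance transform, the following is a minimal sequential sketch, not the parallel CREW PRAM algorithm of the paper. It assumes the common convention that each foreground voxel (value 1) receives its chessboard (Chebyshev) distance to the nearest background voxel (value 0); the function name and the nested-list layout `volume[z][y][x]` are illustrative choices, and the paper's exact definition may differ.

```python
from itertools import product


def chessboard_distance_transform(volume):
    """Brute-force 3D chessboard distance transform (illustrative only).

    volume[z][y][x] is 0 (background) or 1 (foreground). Each foreground
    voxel is mapped to max(|dz|, |dy|, |dx|) to its nearest background
    voxel; background voxels are mapped to 0.
    """
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    coords = list(product(range(nz), range(ny), range(nx)))
    background = [(z, y, x) for z, y, x in coords if volume[z][y][x] == 0]
    if not background:
        raise ValueError("volume must contain at least one background voxel")

    result = [[[0] * nx for _ in range(ny)] for _ in range(nz)]
    for z, y, x in coords:
        if volume[z][y][x] == 1:
            result[z][y][x] = min(
                max(abs(z - bz), abs(y - by), abs(x - bx))
                for bz, by, bx in background
            )
    return result
```

This exhaustive version compares every voxel with every background voxel, so it is only practical for very small volumes; it is meant to pin down what is being computed rather than how to compute it efficiently.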
Restricted solid-on-solid surface growth models can be mapped onto binary lattice gases. We show that efficient simulation algorithms can be realized on GPUs either by CUDA or by OpenCL programming. We consider a deposition/evaporation model following Kardar-Parisi-Zhang growth in 1 + 1 dimensions, related to the Asymmetric Simple Exclusion Process, and show that for sizes that fit into the shared memory of GPUs one can achieve the maximum parallelization speedup (about 100 times for a Quadro FX 5800 graphics card with respect to a single CPU of 2.67 GHz). This permits us to study the effect of quenched columnar disorder, which requires extremely long simulation times. We compare the CUDA realization with an OpenCL implementation designed for processor clusters via MPI. A two-lane traffic model with randomized turning points is also realized, and its dynamical behavior has been investigated. (C) 2011 Elsevier B.V. All rights reserved.
The independent spanning trees (ISTs) problem attempts to construct a set of pairwise independent spanning trees, and it has numerous applications in networks, such as data broadcasting, scattering, and reliable communication protocols. The well-known ISTs conjecture, the Vertex/Edge Conjecture, states that any n-connected/n-edge-connected graph has n vertex-ISTs/edge-ISTs rooted at an arbitrary vertex r. It has been shown that the Vertex Conjecture implies the Edge Conjecture. In this paper, we consider the independent spanning trees problem on the n-dimensional locally twisted cube LTQ(n). The very recent algorithm proposed by Hsieh and Tu (2009) [12] is designed to construct n edge-ISTs rooted at vertex 0 for LTQ(n). However, we find that LTQ(n) is not vertex-transitive when n >= 4; therefore, Hsieh and Tu's result does not solve the Edge Conjecture for LTQ(n). In this paper, we propose an algorithm for constructing n vertex-ISTs for LTQ(n); consequently, we confirm the Vertex Conjecture (and hence also the Edge Conjecture) for LTQ(n). (C) 2011 Elsevier B.V. All rights reserved.
Mission-driven sensor networks usually have special lifetime requirements. However, the density of the sensors may not be large enough to satisfy the coverage requirement while meeting the lifetime constraint at the same time. Sometimes, coverage has to be traded for network lifetime. In this paper, we study how to schedule sensors to maximize their coverage during a specified network lifetime. Unlike sensor deployment, where the goal is to maximize the spatial coverage, our objective is to maximize the spatial-temporal coverage by scheduling sensors' activity after they have been deployed. Since the optimization problem is NP-hard, we first present a centralized heuristic whose approximation factor is proved to be 1/2, and then propose a distributed parallel optimization protocol (POP). In POP, nodes optimize their schedules on their own but converge to local optimality without conflict with one another. Theoretical and simulation results show that POP substantially outperforms other schemes in terms of network lifetime, coverage redundancy, convergence time, and event detection probability.
ISBN: 9781424455690 (print)
As processing power becomes cheaper and more available through clusters of computers, the need for parallel algorithms that can harness this computing potential is increasing. Automatic database normalization is one application of parallel algorithms. Normalization is the most widely used technique for the analysis of relational databases. It aims at creating a set of relational tables with minimum data redundancy that preserve consistency and facilitate correct insertion, deletion, and modification. Existing sequential algorithms are usually very time-consuming, especially the process of transforming relations into 3NF, so in this paper we propose parallel algorithms for automatic database normalization. The proposed algorithms have been examined with MPI, and the implementation results on EDM showed that the parallel approach reduces the running time efficiently. Exploiting p processors reduces the time of automatic database normalization to $n^2 m/p + c$, where c is the communication overhead between the processors, m is the number of simple keys, and n is the number of determinant keys.
All pairs shortest path (APSP) is a classical problem with diverse applications. Traditional algorithms are not suitable for real-time applications, so it is necessary to investigate parallel algorithms. This paper presents an improved matrix multiplication method to solve the APSP problem. The pulse coupled neural network (PCNN) is then employed to realize the parallel computation. The time complexity of our strategy is only O(log(2) n), where n stands for the number of nodes. It is faster than the traditional PCNN, MOPCNN, and MPCNN methods. (C) 2011 Elsevier Inc. All rights reserved.
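The matrix multiplication view of APSP can be sketched with the textbook min-plus (tropical) product and repeated squaring. This is a plain sequential illustration under the assumption that the weight matrix has 0 on the diagonal and `math.inf` for missing edges; it is not the improved method or the PCNN realization described in the abstract, and the function names are illustrative.

```python
import math


def min_plus_product(a, b):
    """Min-plus product: c[i][j] = min over k of a[i][k] + b[k][j]."""
    n = len(a)
    return [
        [min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def all_pairs_shortest_paths(weights):
    """APSP by repeated min-plus squaring of the weight matrix.

    weights[i][j] is the edge weight from i to j, 0 when i == j, and
    math.inf when there is no edge. After each squaring, the matrix
    covers shortest paths with twice as many edges as before.
    """
    n = len(weights)
    dist = [row[:] for row in weights]
    covered_edges = 1  # dist currently covers paths of at most 1 edge
    while covered_edges < n - 1:
        dist = min_plus_product(dist, dist)
        covered_edges *= 2
    return dist
```

Only about log2(n) squarings are needed because a shortest path has at most n - 1 edges, which is why the number of sequential rounds is logarithmic when each product is evaluated in parallel.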
The availability of a new carry-less multiplication instruction in the latest Intel desktop processors significantly accelerates multiplication in binary fields and hence presents the opportunity for reevaluating algorithms for binary field arithmetic and scalar multiplication over elliptic curves. We describe how to best employ this instruction in field multiplication and the effect on performance of doubling and halving operations. Alternate strategies for implementing inversion and half-trace are examined to restore most of their competitiveness relative to the new multiplier. These improvements in field arithmetic are complemented by a study on serial and parallel approaches for Koblitz and random curves, where parallelization strategies are implemented and compared. The contributions are illustrated with experimental results improving the state-of-the-art performance of halving and doubling-based scalar multiplication on NIST curves at the 112- and 192-bit security levels and a new speed record for side-channel-resistant scalar multiplication in a random curve at the 128-bit security level. The algorithms presented in this work were implemented on Westmere and Sandy Bridge processors, the latest generation Intel microarchitectures.
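As a rough illustration of what carry-less multiplication computes, the sketch below emulates it in software on Python integers, treating each integer as a polynomial over GF(2), and then reduces the product modulo an irreducible polynomial to obtain a binary field multiplication. The function names are illustrative, and the small field GF(2^8) with modulus 0x11B used in the usage note is chosen only because it is easy to check by hand; the curves in the abstract use much larger fields and the hardware instruction rather than this loop.

```python
def clmul(a, b):
    """Carry-less product of two non-negative integers.

    Bits are coefficients of polynomials over GF(2), so partial
    products are combined with XOR instead of addition with carries.
    """
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        b >>= 1
    return result


def binary_field_mul(a, b, modulus):
    """Multiply a and b in GF(2^m), where modulus encodes an
    irreducible polynomial of degree m (its top set bit is bit m)."""
    product = clmul(a, b)
    m = modulus.bit_length() - 1
    # Cancel the leading term repeatedly until the degree is below m.
    while product.bit_length() > m:
        shift = product.bit_length() - modulus.bit_length()
        product ^= modulus << shift
    return product
```

For example, `clmul(0b11, 0b11)` gives `0b101`, since (x + 1)^2 = x^2 + 1 over GF(2), and `binary_field_mul(0x57, 0x83, 0x11B)` gives `0xC1`.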
The Orbit problem is defined as follows: given a matrix $A \in \mathbb{Q}^{n \times n}$ and vectors $x, y \in \mathbb{Q}^{n}$, does there exist a non-negative integer $i$ such that $A^{i} x = y$? This problem was shown to be in deterministic polynomial time by Kannan and Lipton (J. ACM 33(4): 808-821, 1986). In this paper we place the problem in the logspace counting hierarchy GapLH. We also show that the problem is hard for $C_{=}L$ with respect to logspace many-one reductions.
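To make the problem statement concrete, here is a small sketch that searches for an exponent i by direct iteration with exact rational arithmetic. It is only a bounded search: the `max_steps` cutoff is an assumption introduced for illustration, so a `None` result does not prove that no such i exists, and this is neither the Kannan-Lipton algorithm nor the GapLH construction of the paper.

```python
from fractions import Fraction


def mat_vec(matrix, vector):
    """Exact matrix-vector product over the rationals."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, vector)) for row in matrix]


def orbit_reaches(a, x, y, max_steps):
    """Return the smallest i in [0, max_steps] with A^i x == y, else None."""
    a = [[Fraction(entry) for entry in row] for row in a]
    current = [Fraction(c) for c in x]
    target = [Fraction(c) for c in y]
    for i in range(max_steps + 1):
        if current == target:
            return i
        current = mat_vec(a, current)  # advance from A^i x to A^(i+1) x
    return None
```

With `a = [[1, 1], [0, 1]]` and `x = [0, 1]`, the orbit is (0, 1), (1, 1), (2, 1), (3, 1), and so on, so `orbit_reaches(a, x, [3, 1], 10)` returns 3.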
A new VLSI algorithm and its associated systolic array architecture for a prime-length type IV discrete cosine transform (DCT) are presented. They form the basis of an efficient design approach for deriving a linear systolic array architecture for the type IV DCT. The proposed algorithm uses a regular computational structure, called a pseudoband correlation structure, that is appropriate for a VLSI implementation. The proposed algorithm is then mapped onto a linear systolic array with a small number of I/O channels and low I/O bandwidth. The proposed architecture can be unified with that obtained for the type IV DST because of a similar kernel. A highly efficient VLSI chip can thus be obtained, with good performance in the architectural topology, computing parallelism, processing speed, hardware complexity, and I/O costs, similar to those obtained for circular correlation and cyclic convolution computational structures.
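For reference, the type IV DCT itself can be written directly from its standard unnormalized definition, X[k] = sum over j of x[j] cos(pi/N (j + 1/2)(k + 1/2)). The sketch below is this direct O(N^2) evaluation, assuming no scaling factor; it does not reproduce the pseudoband correlation restructuring or the systolic array mapping of the paper, and the scaling convention used there may differ.

```python
import math


def dct_iv(x):
    """Direct, unnormalized type IV DCT of a real sequence."""
    n = len(x)
    return [
        sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5)) for j in range(n))
        for k in range(n)
    ]
```

With this convention the transform is its own inverse up to a factor of N/2, so applying `dct_iv` twice to a length-5 input returns the input scaled by 2.5, which gives a quick sanity check.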
Based on the full domain partition, a parallel finite element algorithm for the stationary Stokes equations is proposed and analyzed. In this algorithm, each subproblem is defined on the entire domain, with the majority of the degrees of freedom associated with the relevant subdomain. Therefore, each subproblem can be solved in parallel with the other subproblems using an existing sequential solver without extensive recoding. This allows the algorithm to be implemented easily with low communication costs. Numerical results are given showing the high efficiency of the parallel algorithm.