ISBN (print): 0769511538
We present parallel algorithms to find cut vertices, bridges, and Hamiltonian paths in bounded interval tolerance graphs. For a graph with n vertices, the algorithms require O(log n) time and use O(n) processors on the Concurrent Read Exclusive Write parallel RAM (CREW PRAM) model of computation. Our approach transforms the original graph problem into a problem in computational geometry. The total work done by the parallel algorithms is comparable to the work done by the best known sequential algorithms for the more restricted classes of graphs, namely interval graphs and permutation graphs. In this sense, our algorithms have optimal complexity.
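For orientation, the bridge-finding subproblem mentioned above has a textbook O(n + m) sequential solution on general undirected graphs via a DFS low-link computation; the sketch below shows only that generic baseline and has nothing to do with the paper's geometric, tolerance-graph-specific parallel construction. Function and variable names are ours.

import sys

def find_bridges(n, adj):
    # Textbook DFS low-link bridge finder for a simple undirected graph given
    # as adjacency lists adj[0..n-1]. Sequential reference only; not the
    # paper's parallel, geometry-based algorithm.
    sys.setrecursionlimit(max(1000, 2 * n + 10))
    disc = [-1] * n              # DFS discovery times
    low = [-1] * n               # lowest discovery time reachable from the subtree
    bridges = []
    timer = [0]

    def dfs(u, parent):
        disc[u] = low[u] = timer[0]
        timer[0] += 1
        for v in adj[u]:
            if disc[v] == -1:                # tree edge
                dfs(v, u)
                low[u] = min(low[u], low[v])
                if low[v] > disc[u]:         # nothing below v reaches above u
                    bridges.append((u, v))
            elif v != parent:                # back edge
                low[u] = min(low[u], disc[v])

    for s in range(n):
        if disc[s] == -1:
            dfs(s, -1)
    return bridges

# Example: a triangle 0-1-2 attached to vertex 3 by the single edge 2-3.
print(find_bridges(4, [[1, 2], [0, 2], [0, 1, 3], [2]]))   # [(2, 3)]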
ISBN (print): 0769517315
In this paper, by means of an abstract model of the SIMD type with vertical data processing (the STAR-machine), we present a simple associative parallel algorithm for finding tree paths in undirected graphs. We study applications of this algorithm to updating minimum spanning trees in undirected graphs, to determining maximum flow values in a multiterminal network, and to finding a fundamental set of circuits with respect to a given spanning tree. These algorithms are given as the corresponding STAR procedures, whose correctness is proved and whose time complexity is evaluated.
ISBN (print): 3540626166
For a given ordered graph (G, <), we consider the smallest (strongly) chordal graph G' containing G with < as a (strongly) perfect elimination ordering. We call (G, <) a compact representation of G'. We show that a depth-first search tree and a breadth-first search tree can be computed in parallel in polylogarithmic time using a number of processors linear in the size of the compact representation. We also consider the problem of finding a maximum clique, and we develop a data structure extension that supports adjacency queries in polylogarithmic time.
ISBN (print): 9781479984909
The Hierarchical Memory Machine (HMM) is a theoretical parallel computing model that captures the essence of CUDA-enabled GPU architecture. It has multiple streaming multiprocessors, each with a shared memory, and a global memory that can be accessed by all threads. The HMM has several parameters: the number d of streaming multiprocessors, the number p of threads per streaming multiprocessor, the number w of memory banks of each shared memory and of the global memory, the shared memory latency l, and the global memory latency L. The main purpose of this paper is to discuss the optimality of fundamental parallel algorithms running on the HMM. We first show that image convolution for an image with n x n pixels using a filter of size (2v+1) x (2v+1) can be done in O(n^2/w + n^2*L/(dp) + n^2*v^2/(dw) + n^2*v^2*l/(dp)) time units on the HMM. Further, we show that this parallel implementation is time optimal by proving a matching lower bound on the running time. We then go on to show that the product of two n x n matrices can be computed in O(n^3/(mw) + n^3*L/(mdp) + n^3/(dw) + n^3*l/(dp)) time units on the HMM if the capacity of the shared memory in each streaming multiprocessor is O(m^2). This implementation is also proved to be time optimal. We further clarify the conditions under which image convolution and matrix multiplication hide the memory access latency overhead and maximize the global memory throughput and the parallelism. Finally, we provide experimental results on a GeForce GTX Titan to support our theoretical analysis.
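As a rough aid for reading these bounds, the small sketch below just plugs sample parameter values into the convolution bound as reconstructed above and prints the contribution of each term (global-memory throughput and latency, shared-memory throughput and latency). The numerical parameter values are made up for illustration and are not taken from the paper.

def hmm_convolution_time_units(n, v, d, p, w, l, L):
    # Evaluate the four terms of the reconstructed bound
    # O(n^2/w + n^2*L/(d*p) + n^2*v^2/(d*w) + n^2*v^2*l/(d*p)).
    terms = {
        "global throughput n^2/w":            n**2 / w,
        "global latency    n^2*L/(d*p)":      n**2 * L / (d * p),
        "shared throughput n^2*v^2/(d*w)":    n**2 * v**2 / (d * w),
        "shared latency    n^2*v^2*l/(d*p)":  n**2 * v**2 * l / (d * p),
    }
    return sum(terms.values()), terms

# Hypothetical parameters: 4096 x 4096 image, 5 x 5 filter (v = 2),
# d = 16 streaming multiprocessors, p = 1024 threads each, w = 32 banks,
# shared memory latency l = 1, global memory latency L = 400.
total, terms = hmm_convolution_time_units(4096, 2, 16, 1024, 32, 1, 400)
for name, value in terms.items():
    print(f"{name}: {value:,.0f}")
print(f"sum of terms: {total:,.0f}")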
ISBN (print): 9781538610428
The complete Voronoi map of a binary image with black and white pixels is a matrix of the same size in which each element is the closest black pixel to the corresponding pixel. The complete Voronoi map visualizes the influence region of each black pixel. However, each region may not be connected due to exclave pixels. The connected Voronoi map is a modification of the complete Voronoi map in which all regions are connected. The Euclidean distance map of a binary image is a matrix in which each element is the distance to the closest black pixel. It has many applications in image processing, such as dilation, erosion, blurring effects, skeletonization, and matching. The main contribution of this paper is to present simple and fast parallel algorithms for computing the complete/connected Voronoi maps and the Euclidean distance map, and to implement them on the GPU. Our parallel algorithm first computes the mixed Voronoi map, which is a mixture of the complete and connected Voronoi maps, and then converts it into the complete/connected Voronoi map by exposing/hiding all exclave pixels. After that, the complete Voronoi map is converted into the Euclidean distance map by computing the distance to the closest black pixel for every pixel in an obvious way. The experimental results on a GeForce GTX 1080 GPU show that the computing time for these conversions is relatively small. The throughput of our GPU implementation for computing the Euclidean distance maps of 2K x 2K binary images is up to 2.08 times higher than that of the previously published best GPU implementation, and up to 172 times higher than that of a CPU implementation using an Intel Core i7-4790.
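For readers unfamiliar with these maps, the brute-force sketch below spells out the definitions only: for every pixel it records the closest black pixel (complete Voronoi map) and the distance to it (Euclidean distance map). It is O(n^4) for an n x n image and deliberately ignores the mixed-Voronoi-map machinery and GPU optimizations that the paper is actually about.

import math

def complete_voronoi_and_edm(image):
    # image: list of rows, 1 = black pixel, 0 = white pixel.
    # Returns (voronoi, edm): for each pixel, the coordinates of the closest
    # black pixel and the Euclidean distance to it. Definition-level
    # reference code only; far slower than the paper's GPU algorithms.
    h, w = len(image), len(image[0])
    blacks = [(y, x) for y in range(h) for x in range(w) if image[y][x] == 1]
    voronoi = [[None] * w for _ in range(h)]
    edm = [[math.inf] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for (by, bx) in blacks:
                dist = math.hypot(y - by, x - bx)
                if dist < edm[y][x]:
                    edm[y][x] = dist
                    voronoi[y][x] = (by, bx)
    return voronoi, edm

img = [[0, 0, 1],
       [0, 0, 0],
       [1, 0, 0]]
vor, edm = complete_voronoi_and_edm(img)
print(vor[1][1], round(edm[1][1], 3))   # closest black pixel to the centre pixel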
ISBN (print): 354065142X
We are interested in the fringe analysis of synchronized parallel insertion algorithms on 2-3 trees, namely the algorithm of W. Paul, U. Vishkin and H. Wagener (PVW). This algorithm inserts k keys into a tree of size n in parallel time O(log n + log k). Fringe analysis studies the distribution of the bottom subtrees, and it is still an open problem for parallel algorithms on search trees. To tackle this problem we introduce a new kind of algorithm whose two extreme cases seem to give upper and lower bounds on the performance of the PVW algorithm. We extend fringe analysis to parallel algorithms and obtain a rich mathematical structure that gives new interpretations even in the sequential case. The process of insertions is modeled by a Markov chain, and the coefficients of the transition matrix are related to the expected local behavior of our algorithm. Finally, we show that this matrix has a power expansion in (n+1)^(-1) whose coefficients are the binomial transform of the expected local behavior. This expansion shows that the parallel case can be approximated by iterating the sequential case.
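For reference, the binomial transform referred to above is the standard one; with a_i standing for the expected local-behavior coefficients (our notation, since the abstract fixes none), one common convention and the assumed shape of the resulting expansion are:

\[
  b_k \;=\; \sum_{i=0}^{k} (-1)^{i} \binom{k}{i} a_i,
  \qquad
  T(n) \;\approx\; \sum_{k \ge 0} \frac{B_k}{(n+1)^{k}},
\]

where the matrices B_k collect the binomial-transformed coefficients; truncating the series after its first terms is the sense in which, per the abstract, the parallel case can be approximated by iterating the sequential case.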
ISBN (print): 0769520464
Frequent itemset mining is a classic problem in data mining. It is an unsupervised process concerned with finding frequent patterns (or itemsets) hidden in large volumes of data in order to produce compact summaries or models of the database. These models are typically used to generate association rules, but recently they have also been used in far-reaching domains such as e-commerce and bioinformatics. Because databases are growing in both dimension (number of attributes) and size (number of records), one of the main issues for a frequent itemset mining algorithm is the ability to analyze very large databases. Sequential algorithms do not have this ability, especially in terms of run-time performance, for such very large databases. Therefore, we must rely on high-performance parallel and distributed computing. We present new parallel algorithms for frequent itemset mining. Their efficiency is demonstrated through a series of experiments on different parallel environments, ranging from shared-memory multiprocessor machines to a set of SMP clusters connected through a high-speed network. We also briefly discuss an application of our algorithms to the analysis of large databases collected by a Brazilian web portal.
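To pin down the terminology: an itemset is frequent when its support (the number of transactions containing it) reaches a given threshold. The tiny sequential sketch below only illustrates that definition on a toy database; it enumerates all itemsets and is unrelated to, and hopeless at the scale targeted by, the paper's parallel algorithms.

from itertools import combinations

def frequent_itemsets(transactions, min_support):
    # Enumerate every itemset over the items occurring in the database and
    # keep those whose support meets min_support. Exponential-time toy code
    # to illustrate the definition only.
    items = sorted({item for t in transactions for item in t})
    result = {}
    for k in range(1, len(items) + 1):
        for candidate in combinations(items, k):
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                result[candidate] = support
    return result

db = [{"bread", "milk"}, {"bread", "beer"}, {"bread", "milk", "beer"}, {"milk"}]
print(frequent_itemsets(db, min_support=2))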
ISBN (print): 0769517609
In a cellular network, the base stations are not necessarily uniformly distributed, and their corresponding cell sizes are not necessarily the same. For example, a cell in a well-populated city is usually smaller than a cell in a rural area. To study a cellular network with non-uniform cell sizes, one approach is to use a virtual cellular network with a uniform cell size such that each virtual cell contains at most one base station. This paper proposes parallel algorithms for meshes with multiple broadcasting to construct virtual mesh and honeycomb cellular networks for non-uniformly distributed base stations. The constructed virtual cellular networks are optimal in the sense that their uniform cell size is the largest possible. The algorithms run in O(log n) time on a mesh with multiple broadcasting of size n x n to construct optimal virtual mesh and honeycomb cellular networks for n non-uniformly distributed base stations. Furthermore, these algorithms are time-optimal.
The multidimensional assignment problem (MAP) is a combinatorial optimization problem arising in diverse applications such as computer vision and motion tracking. In the MAP, the objective is to match tuples of objects with minimum total cost. Randomized parallel algorithms are proposed to solve MAPs appearing in multi-sensor multi-target applications. A parallel construction heuristic and several variations of it are described, as well as a parallel local search heuristic. Experimental results using the proposed algorithms are discussed.
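To make the "match tuples with minimum total cost" objective concrete, the three-dimensional (axial) special case of the MAP can be stated as the standard 0-1 program below; the paper addresses the general multidimensional case, and this formulation is included only as background.

\[
\begin{aligned}
\min \;& \sum_{i=1}^{n}\sum_{j=1}^{n}\sum_{k=1}^{n} c_{ijk}\, x_{ijk} \\
\text{s.t.}\;& \sum_{j=1}^{n}\sum_{k=1}^{n} x_{ijk} = 1 \quad (i = 1,\dots,n), \\
& \sum_{i=1}^{n}\sum_{k=1}^{n} x_{ijk} = 1 \quad (j = 1,\dots,n), \\
& \sum_{i=1}^{n}\sum_{j=1}^{n} x_{ijk} = 1 \quad (k = 1,\dots,n), \\
& x_{ijk} \in \{0,1\},
\end{aligned}
\]

so each chosen x_{ijk} = 1 matches one object from each of the three sets into a tuple of cost c_{ijk}.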
ISBN (print): 0769509967; 0769509975
In this paper, we propose a new parallel algorithm for computing path expressions, named Parallel Cascade Semijoin (PCSJ). Moreover, a new scheduling strategy called the right-deep zigzag tree is designed to further improve the performance of the PCSJ algorithm. The experiments were carried out in a NOW-based distributed and parallel environment. The results show that the PCSJ algorithm outperforms two other parallel algorithms (the parallel version of the forward pointer chasing algorithm (PFPC) and the index splitting parallel algorithm (IndexSplit)) when computing path expressions with restrictive predicates, and that the right-deep zigzag tree scheduling strategy performs better than the right-deep tree scheduling strategy.
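As background on the primitive the PCSJ algorithm is built from: a semijoin R ⋉ S keeps exactly those tuples of R whose join attribute matches some tuple of S, and a path expression can be evaluated by cascading such semijoins along the path. The sketch below illustrates that idea sequentially on hypothetical relations; it is not the parallel PCSJ algorithm, nor its right-deep zigzag tree scheduling.

def semijoin(r, s, r_attr, s_attr):
    # Keep the tuples of r whose r_attr value occurs among s's s_attr values.
    keys = {t[s_attr] for t in s}
    return [t for t in r if t[r_attr] in keys]

# Hypothetical path expression Dept -> Emp -> Project: "departments that have
# an employee who works on some project", evaluated as a cascade of semijoins.
dept = [{"dno": 1, "name": "R&D"}, {"dno": 2, "name": "Sales"}]
emp  = [{"eno": 10, "dno": 1}, {"eno": 11, "dno": 2}]
proj = [{"pno": 100, "eno": 10}]

emp_on_proj = semijoin(emp, proj, "eno", "eno")      # first semijoin
print(semijoin(dept, emp_on_proj, "dno", "dno"))     # [{'dno': 1, 'name': 'R&D'}]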