ISBN (print): 9781538655207
This paper develops techniques for reasoning about graph functional dependencies (GFDs). We study the satisfiability problem, to decide whether a given set of GFDs has a model, and the implication problem, to decide whether a set of GFDs entails another GFD. While these fundamental problems are important in practice, they are coNP-complete and NP-complete, respectively. We establish a small model property for satisfiability, showing that if a set Σ of GFDs is satisfiable, then it has a model of a size bounded by the size |Σ| of Σ; similarly, we prove a small model property for implication. Based on these properties, we develop algorithms for checking the satisfiability and implication of GFDs. Moreover, we provide parallel algorithms that are guaranteed to reduce running time when more processors are used, despite the intractability of the problems. We experimentally verify the efficiency and scalability of the algorithms.
ISBN (print): 9781538678794
We consider the problem of tensor factorization in the cases where one of the factors is constrained to have orthonormal columns. We adopt the alternating optimization framework and derive an efficient algorithm that is also suitable for parallel implementation. We describe in detail a distributed memory implementation of the algorithm on a three-dimensional processor grid. The speedup attained by a message-passing implementation of the algorithm is significant, indicating that it is a competitive candidate for the solution of very large tensor factorization problems with orthogonality constraints.
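To make the constrained alternating-optimization idea concrete, the following is a minimal NumPy sketch of one sweep of a third-order CP-style factorization in which the factor C is kept column-orthonormal via the SVD-based orthogonal Procrustes update. The unfolding convention, factor names, and update order are illustrative assumptions, not the paper's exact (distributed, processor-grid) algorithm.

```python
import numpy as np

def khatri_rao(A, B):
    # column-wise Kronecker product: row index (i, j) with j varying fastest
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def unfold(X, mode):
    # mode-n unfolding consistent with NumPy's row-major reshape
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def als_sweep(X, A, B, C):
    """One alternating-optimization sweep for X ~ [[A, B, C]] (CP model),
    with the factor C constrained to have orthonormal columns."""
    # unconstrained least-squares updates for A and B
    A = unfold(X, 0) @ khatri_rao(B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
    B = unfold(X, 1) @ khatri_rao(A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
    # orthogonal Procrustes update for C: maximize trace(C.T @ M) s.t. C.T @ C = I
    M = unfold(X, 2) @ khatri_rao(A, B)
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    C = U @ Vt
    return A, B, C
```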
The block Lanczos method for huge sparse linear systems over large prime finite fields is accelerated on GPU. The GPU is used for the operations on the dense matrices and blocks. The achieved acceleration of the block operations significantly increases the available parallelism of the entire method, extending the region of near-linear scalability. The numerical experiments were carried out on the Lomonosov and Lomonosov-2 supercomputers.
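As an illustration of the kind of dense block operation that gets offloaded to the GPU, here is a minimal CPU-side sketch of a dense block product modulo a large prime, using Python integers for arbitrary precision; the paper's CUDA kernels and the surrounding block Lanczos iteration are not reproduced.

```python
def block_mul_mod(A, B, p):
    """Dense block product C = A * B (mod p), with blocks given as lists of lists.
    Python integers stand in for the multi-word residues a GPU kernel would use."""
    n, k, m = len(A), len(B), len(B[0])
    assert len(A[0]) == k
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        Ai = A[i]
        for j in range(m):
            # accumulate the full integer dot product, reduce once at the end
            C[i][j] = sum(Ai[t] * B[t][j] for t in range(k)) % p
    return C
```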
ISBN (print): 9781509060177
Recently, clustering in graphs and networks has been widely investigated in Computer Science, Artificial Intelligence, and the Social and Biological sciences. Newman's modularity concept is now widely used in an increasing number of applications. However, recent developments in Modularity Density Maximization have shown improved, better-than-expected clustering results. In this paper, we investigate and report results on the solution of Li's modularity clustering problem using Anytime A* search, which offers a novel strategy to tackle this type of problem. To do so, we propose two versions of the Anytime A* method, a sequential and a parallel one. In the parallel approach, a pair of connected nodes is used as a constraint to divide the search space among the threads. Our results indicate that Anytime A* is a promising search method in this context, leading to small gaps to the best-known solutions. Further, the proposed methods add flexibility to the use of parallel strategies that effectively speed up the search.
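The anytime behavior can be illustrated with a generic skeleton: a best-first search that keeps an incumbent solution, improves it whenever a better complete solution is found, and prunes states whose optimistic bound cannot beat the incumbent. The sketch below is a hypothetical maximization skeleton; the actual state encoding, the modularity-density bound, and the pair-of-connected-nodes constraints used to split the search among threads are abstracted behind the expand/heuristic/value callbacks.

```python
import heapq

def anytime_astar(start, expand, is_goal, heuristic, value):
    """Generic Anytime A* skeleton for maximization (illustrative only).

    expand(state)    -> iterable of successor states
    is_goal(state)   -> True when the state is a complete solution
    heuristic(state) -> optimistic bound on the best value reachable from state
    value(state)     -> objective value of a complete state
    """
    best_state, best_value = None, float('-inf')
    frontier = [(-heuristic(start), 0, start)]   # max-heap via negated bounds
    tie = 1
    while frontier:
        bound, _, state = heapq.heappop(frontier)
        if -bound <= best_value:                 # cannot beat the incumbent: prune
            continue
        if is_goal(state):
            if value(state) > best_value:        # new incumbent, usable at any time
                best_state, best_value = state, value(state)
            continue
        for succ in expand(state):
            if heuristic(succ) > best_value:
                heapq.heappush(frontier, (-heuristic(succ), tie, succ))
                tie += 1
    return best_state, best_value
```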
ISBN (digital): 9781728108582
ISBN (print): 9781728108599
Finding connected components in an undirected graph has many practical applications. For example, in a graph representing a social network, a connected component represents a group of related individuals with common interests. Finding connected components also forms the basis for other clustering algorithms. In this paper, we present a parallel algorithm that uses the well-known sequential algorithm as the basis for finding connected components in an undirected graph. The algorithm can be adapted to run on a single computer with multiple cores or on MapReduce. It is robust in the sense that it honors memory limits, which is important in today's containerized environments, and it balances the workload even in the presence of data skew. For the best-known algorithm running in MapReduce, the number of iterations is the square of the logarithm of the number of vertices in the graph. For our algorithm, we prove that the number of iterations is bounded by the logarithm of the maximum size of a connected component. In each iteration, the amount of data read from or written to the file system is bounded by four times the number of edges in the graph.
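For intuition, a minimal map/reduce-style sketch of iterative minimum-label propagation is shown below. Note that this simple variant needs a number of rounds proportional to the component diameter, whereas the paper's algorithm is more refined and achieves the logarithmic bound in the maximum component size stated above.

```python
def connected_components(edges, vertices):
    """Iterative min-label propagation, written in a map/reduce style:
    the 'map' step emits candidate labels per edge, the 'reduce' step keeps
    the minimum label per vertex, and rounds repeat until nothing changes."""
    label = {v: v for v in vertices}          # start: every vertex labels itself
    changed = True
    while changed:
        changed = False
        proposals = {}
        for u, v in edges:                    # map: propose each endpoint's label
            for a, b in ((u, v), (v, u)):
                proposals[a] = min(proposals.get(a, label[a]), label[b])
        for v, lab in proposals.items():      # reduce: keep the minimum proposal
            if lab < label[v]:
                label[v] = lab
                changed = True
    return label

# Example: two components {1, 2, 3} and {4, 5}
print(connected_components([(1, 2), (2, 3), (4, 5)], [1, 2, 3, 4, 5]))
```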
We present the first conditional hardness results for massively parallel algorithms for some central graph problems, including (approximating) maximum matching, vertex cover, maximal independent set, and coloring. In some cases, these hardness results match or come close to the state-of-the-art algorithms. Our hardness results are conditioned on a widely believed conjecture in massively parallel computation about the complexity of the connectivity problem. We also note that an unconditional variant of such hardness results is known to be somewhat out of reach for now, as it would lead to considerably improved circuit complexity lower bounds and would concretely imply that NC^1 is a proper subset of P. We obtain our conditional hardness results via a general method that lifts unconditional lower bounds from the well-studied LOCAL model of distributed computing to the massively parallel computation setting.
ISBN (print): 9781538678398
We are given a test suite TS and a set of mutants that can be derived from a specification FSM S with respect to an assumed type of faults. Mutant elimination deals with killing each mutant of the fault domain whose output behavior differs from that of S on some test case of TS. This process is time consuming, especially when the number of mutants is on the order of millions. Thus, we present and assess two parallel implementations for this problem, based on OpenMP and on GPU with CUDA. Experiments are conducted to assess the speedup (and execution time) of the proposed implementations using both random and real machines; the CUDA implementation is shown to be scalable. Experiments are also conducted to determine experimental setup attributes such as the TS length, the number of test cases, the number of OpenMP threads, and the number of inputs of a test case applied to the mutants in each GPU invocation.
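A minimal sketch of the mutant-elimination check, parallelized over mutants with Python's multiprocessing as a stand-in for the OpenMP version, is given below. The Mealy-style FSM encoding (a dict mapping (state, input) to (next state, output)) and the function names are assumptions made for illustration.

```python
from multiprocessing import Pool

def run_fsm(fsm, start, test_case):
    """Output sequence produced by a deterministic Mealy FSM on one input sequence."""
    state, outputs = start, []
    for x in test_case:
        state, y = fsm[(state, x)]
        outputs.append(y)
    return outputs

def is_killed(args):
    spec, mutant, start, test_suite = args
    # a mutant is killed if some test case distinguishes it from the specification
    return any(run_fsm(mutant, start, tc) != run_fsm(spec, start, tc)
               for tc in test_suite)

def eliminate_mutants(spec, mutants, start, test_suite, workers=4):
    """Check all mutants in parallel and return only the surviving ones."""
    with Pool(workers) as pool:
        killed = pool.map(is_killed,
                          [(spec, m, start, test_suite) for m in mutants])
    return [m for m, k in zip(mutants, killed) if not k]
```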
ISBN (print): 9781728126289
The Quadratic Assignment Problem (QAP) is one of the most difficult combinatorial problems in the literature and has a diverse field of applications. This paper presents the results of experiments on the impact of parallelizing a sequential GA using the island model. Both genetic algorithms are applied to the QAP. For the island-model parallel GA, we systematically change the number of islands and investigate the effects of dividing the same global population into a number of subpopulations. The number of islands is gradually increased to observe the effects on solution quality and on the speedup in total execution time using different problem instances. The results clearly indicate that, while the parallelized version outperforms its sequential counterpart in both solution quality and total execution time, an increasing number of subpopulations also positively affects the results up to a critical point at which each subpopulation still has enough individuals to evolve independently. Beyond that point, the performance of the algorithm begins to decrease.
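The island-model structure can be sketched as follows: several subpopulations evolve independently and periodically exchange their best individuals. The operator choices below (mutation-only selection, ring migration) and parameter names are simplifying assumptions for illustration, not necessarily those used in the paper.

```python
import random

def cost(p, F, D):
    """QAP objective for flow matrix F and distance matrix D under permutation p."""
    n = len(p)
    return sum(F[i][j] * D[p[i]][p[j]] for i in range(n) for j in range(n))

def mutate(p):
    q = p[:]
    i, j = random.sample(range(len(q)), 2)
    q[i], q[j] = q[j], q[i]                      # swap two facility assignments
    return q

def island_ga(F, D, islands=4, pop_size=20, generations=200, migrate_every=25):
    n = len(F)
    pops = [[random.sample(range(n), n) for _ in range(pop_size)]
            for _ in range(islands)]
    for g in range(generations):
        for pop in pops:                         # each island evolves independently
            pop.sort(key=lambda p: cost(p, F, D))
            survivors = pop[:pop_size // 2]
            pop[:] = survivors + [mutate(random.choice(survivors))
                                  for _ in range(pop_size - len(survivors))]
        if (g + 1) % migrate_every == 0:         # ring migration of each island's best
            bests = [min(pop, key=lambda p: cost(p, F, D)) for pop in pops]
            for k, pop in enumerate(pops):
                pop[-1] = bests[(k - 1) % islands]
    return min((p for pop in pops for p in pop), key=lambda p: cost(p, F, D))
```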
ISBN (digital): 9781728137780
ISBN (print): 9781728137797
This paper introduces a novel parallel matrix multiplication algorithm (SMPMM) that divides the problem of matrix multiplication into smaller independent tasks, where each processor in the parallel environment executes a single task at a time and, once done, receives another task to process. As opposed to previous algorithms such as Cannon, Fox, PUMMA, SUMMA, DIMMA, and HSUMMA, where the decomposition is carried out on the data, i.e., the multiplied matrices are decomposed into small blocks and each processor multiplies some blocks and sends the result to neighboring processors, SMPMM does not involve any data decomposition. In addition, SMPMM differs from previous algorithms in that there is no data exchange and no communication among the processors in the parallel environment. One more important advantage is that SMPMM multiplies non-square matrices in parallel, which is not supported by previous parallel matrix multiplication algorithms.
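In the same spirit, a task-based multiplication can be sketched with a worker pool in which each task computes one output row and workers never exchange intermediate results with one another. The per-row task granularity is an assumption made for illustration; the sketch does not reproduce SMPMM itself.

```python
from concurrent.futures import ProcessPoolExecutor

def row_task(args):
    """One independent task: compute row i of C = A @ B.
    Works for non-square shapes: A is n x k, B is k x m."""
    i, A_row, B = args
    m, k = len(B[0]), len(B)
    return i, [sum(A_row[t] * B[t][j] for t in range(k)) for j in range(m)]

def task_based_multiply(A, B, workers=4):
    """Workers pull row tasks as they finish; results are assembled by the caller."""
    C = [None] * len(A)
    with ProcessPoolExecutor(max_workers=workers) as pool:
        for i, row in pool.map(row_task, [(i, A[i], B) for i in range(len(A))]):
            C[i] = row
    return C

if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6]]           # 2 x 3
    B = [[7, 8], [9, 10], [11, 12]]       # 3 x 2
    print(task_based_multiply(A, B))      # [[58, 64], [139, 154]]
```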
ISBN (digital): 9781728156866
ISBN (print): 9781728156873
The computation of Gaussian convolutions is required in several scientific fields and, to this aim, efficient approximation methods based on Recursive Filters (RFs) have been developed recently. Among them, Gaussian Recursive Filters are designed to approximate the Gaussian convolution in a very efficient way. As is well known, the accuracy of these methods can be improved by the so-called K-iterated Gaussian recursive filters, that is, by the repeated application of the basic RF; such K-iterated versions are also considered here. Since it is often necessary to handle large one-dimensional input signals, a parallel approach becomes mandatory. Recently, we proposed a parallel algorithm for the implementation of the K-iterated first-order Gaussian RF on multicore architectures. Here, using a similar parallelization strategy, based on a domain decomposition with overlapping, we propose a new implementation that exploits, in terms of both accuracy and performance, the capabilities of the GPU (Graphics Processing Unit) in the CUDA environment. Tests and experiments confirm the reliability and the efficiency of the proposed implementation.
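For reference, a sequential sketch of the K-iterated first-order Gaussian RF is given below. The smoothing coefficient follows the usual variance-matching rule 2*K*alpha/(1-alpha)^2 = sigma^2, boundary conditions are simplified to zero, and the overlapping domain decomposition and GPU offloading described above are omitted.

```python
import numpy as np

def k_iterated_gaussian_rf(s, sigma, K):
    """Approximate Gaussian smoothing of a 1-D signal by K iterations of a
    first-order recursive filter (a causal pass followed by an anti-causal pass)."""
    E = K / sigma**2
    alpha = 1.0 + E - np.sqrt(E * (E + 2.0))   # variance-matching coefficient
    beta = 1.0 - alpha
    y = np.asarray(s, dtype=float).copy()
    n = y.size
    for _ in range(K):
        p = np.empty(n)
        p[0] = beta * y[0]
        for i in range(1, n):                  # forward (causal) pass
            p[i] = beta * y[i] + alpha * p[i - 1]
        q = np.empty(n)
        q[-1] = beta * p[-1]
        for i in range(n - 2, -1, -1):         # backward (anti-causal) pass
            q[i] = beta * p[i] + alpha * q[i + 1]
        y = q
    return y

# Example: filtering a unit impulse yields an approximate Gaussian of std sigma
print(k_iterated_gaussian_rf(np.eye(1, 101, 50).ravel(), sigma=5.0, K=3))
```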