In the maximum satisfiability problem (MAX-SAT) we are given a propositional formula in conjunctive normal form and must find an assignment that satisfies as many clauses as possible. We study the parallel parameterized complexity of various versions of MAX-SAT and provide the first constant-time algorithms parameterized either by the solution size or by the allowed excess relative to some guarantee. For the dual parameterized version, where the parameter is the number of clauses we are allowed to leave unsatisfied, we present the first parallel algorithm for MAX-2SAT (known as ALMOST-2SAT). The difficulty in solving ALMOST-2SAT in parallel stems from the fact that the iterative compression method, originally developed to prove that the problem is fixed-parameter tractable at all, is inherently sequential. We observe that a graph flow whose value is a parameter can be computed in parallel, and we develop a parallel algorithm for the vertex cover problem parameterized above the size of a given matching. Finally, we study the parallel complexity of MAX-SAT parameterized by the vertex cover number, the treedepth, the feedback vertex set number, and the treewidth of the input's incidence graph. While MAX-SAT is fixed-parameter tractable for all of these parameters, we show that they allow different degrees of possible parallelization. For all four we develop dedicated parallel algorithms that are constructive, meaning that they output an optimal assignment, in contrast to results obtainable from parallel meta-theorems, which often solve only the decision version.
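For concreteness, the MAX-SAT objective can be stated as a tiny brute-force solver over DIMACS-style signed literals. This is only an illustration of the problem definition, not any of the parameterized algorithms from the abstract:

```python
from itertools import product

def max_sat(clauses, num_vars):
    """Exhaustively find an assignment satisfying the most clauses.

    `clauses` uses DIMACS-style signed literals: 3 means x3 is true,
    -3 means x3 is false.  Exponential in num_vars; illustration only.
    """
    best_count, best_assignment = -1, None
    for bits in product([False, True], repeat=num_vars):
        satisfied = sum(
            any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
            for clause in clauses
        )
        if satisfied > best_count:
            best_count, best_assignment = satisfied, bits
    return best_count, best_assignment

# (x1 or x2), (not x1 or x2), (x1 or not x2), (not x1 or not x2):
# no assignment satisfies all four, so the optimum is 3.
clauses = [[1, 2], [-1, 2], [1, -2], [-1, -2]]
count, assignment = max_sat(clauses, 2)
print(count)  # 3
```

The dual parameterization in the abstract asks the complementary question: can we leave at most k clauses (here k = 1) unsatisfied?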
The advent of high-performance computing via many-core processors and distributed processing highlights the potential of exhaustive search by multiple search agents. Despite the existence of elegant algorithms for solving complex problems, exhaustive search has retained its significance, since many real-life problems exhibit no regular structure and exhaustive search is the only viable approach. Here we analyze the performance of exhaustive search when it is conducted by multiple search agents. Several strategies for joint search with parallel agents are evaluated. We find that search performance improves as the level of mutual help between agents increases. The same search performance can be achieved with homogeneous and heterogeneous search agents, provided that the lengths of the subregions allocated to individual agents match the differences in the speeds of the heterogeneous agents. We also demonstrate how to achieve optimum search performance by increasing the dimensions of the search region.
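The proportional-allocation rule for heterogeneous agents can be sketched as follows; the 1-D region length and agent speeds are made up for illustration:

```python
def allocate_subregions(total_length, speeds):
    """Split a 1-D search region among agents in proportion to their
    speeds, so every agent finishes its subregion at the same time."""
    total_speed = sum(speeds)
    # Agent i gets length_i = L * speed_i / sum(speeds), hence its
    # completion time length_i / speed_i = L / sum(speeds) for all i.
    return [total_length * s / total_speed for s in speeds]

lengths = allocate_subregions(120.0, [1.0, 2.0, 3.0])
print(lengths)                                            # [20.0, 40.0, 60.0]
print([l / s for l, s in zip(lengths, [1.0, 2.0, 3.0])])  # all 20.0
```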
Algorithms for parallel unconstrained minimization of molecular systems are examined. The overall framework of minimization is the same except for the choice of directions for updating the quasi-Newton Hessian. Ideally these directions are chosen so that the updated Hessian yields steps that are the same as those of the Newton method. Three approaches to determining the update directions are presented: the straightforward approach of simply cycling through the Cartesian unit vectors (finite differences), a concurrent set of minimizations, and the Lanczos method. We show the importance of using preconditioning and a multiple secant update in these approaches. For the Lanczos algorithm, an initial set of directions is required to start the method, and a number of possibilities are explored. To test the methods we used the standard 50-dimensional analytic Rosenbrock function. Results are also reported for the histidine dipeptide, the isoleucine tripeptide, and cyclic adenosine monophosphate. All of these systems show a significant speed-up with the number of processors, up to about eight processors. (C) 2010 American Institute of Physics. [doi:10.1063/1.3455719]
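The first approach, cycling through the Cartesian unit vectors, amounts to finite-differencing the gradient one coordinate direction at a time, and the resulting columns can be computed on separate processors. A minimal serial sketch on the 2-D Rosenbrock function (the paper uses the 50-dimensional version):

```python
def rosen_grad(x, y):
    """Analytic gradient of the 2-D Rosenbrock function
    f(x, y) = 100*(y - x**2)**2 + (1 - x)**2."""
    return (-400 * x * (y - x * x) - 2 * (1 - x),
            200 * (y - x * x))

def fd_hessian(grad, point, h=1e-5):
    """Approximate the Hessian by finite-differencing the gradient
    along the Cartesian unit vectors (one direction per column).
    Each column is independent, hence trivially parallelizable."""
    n = len(point)
    H = [[0.0] * n for _ in range(n)]
    for j in range(n):
        plus = list(point); plus[j] += h
        minus = list(point); minus[j] -= h
        gp, gm = grad(*plus), grad(*minus)
        for i in range(n):
            H[i][j] = (gp[i] - gm[i]) / (2 * h)
    return H

H = fd_hessian(rosen_grad, [1.0, 1.0])
print(H)  # close to the analytic Hessian [[802, -400], [-400, 200]]
```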
Most parallel algorithms have lately been analysed using the Parallel Random Access Machine (PRAM) model. Owing to the popularity of the PRAM, several basic PRAM techniques were formulated to facilitate the design and analysis of algorithms for most problems on lists, trees, and graphs. Prefix computation, list ranking, and tree contraction are three of the most basic of these techniques. In this paper we survey the existing algorithms and applications of these three basic PRAM techniques. We also show that list ranking can be reduced to tree contraction and vice versa. To make this reduction result useful, a new tree contraction algorithm that does not use list ranking as a subprocedure is introduced. Then, we combine some of these basic techniques to form another, higher-level technique called parallel tree-structured computation. Included in tree-structured computation are three types of tree computations, namely: adjacency list computation, bottom-up computation-tree evaluation, and top-down computation-tree evaluation. Applications of this higher-level technique are also surveyed.
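List ranking, one of the three basic techniques above, is classically solved on a PRAM by pointer jumping (Wyllie's algorithm). A sketch in which the O(log n) synchronous parallel rounds are simulated serially:

```python
def list_rank(successor):
    """Rank each node of a linked list by pointer jumping.  successor[i]
    is the next node, or i itself at the tail.  Each round halves the
    remaining distance to the tail, so a PRAM finishes in O(log n)
    synchronous rounds; here the rounds are simulated one by one."""
    n = len(successor)
    rank = [0 if successor[i] == i else 1 for i in range(n)]
    succ = list(successor)
    changed = True
    while changed:
        changed = False
        new_rank, new_succ = list(rank), list(succ)
        for i in range(n):          # conceptually: all i in parallel
            if succ[i] != succ[succ[i]]:
                changed = True
            new_rank[i] = rank[i] + rank[succ[i]]
            new_succ[i] = succ[succ[i]]
        rank, succ = new_rank, new_succ
    return rank

# List 3 -> 0 -> 4 -> 1 -> 2, where tail node 2 points to itself.
print(list_rank([4, 2, 2, 0, 1]))  # [3, 1, 0, 4, 2]
```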
This paper introduces a new parallel algorithm based on the Gram-Schmidt orthogonalization method. The algorithm efficiently finds near-exact solutions of tridiagonal linear systems of equations. The system of equations is partitioned in proportion to the number of processors, and each partition is solved by a processor with minimal requests for the other partitions' data. The considerable reduction in data communication between processors yields a notable speedup. The coupling between partitions almost disappears if some columns are switched. Hence, the speed of computation increases and the computational cost decreases. Consequently, the obtained results show that the suggested algorithm is considerably scalable. In addition, this method of partitioning can significantly decrease the computational cost on a single processor and makes it possible to solve larger systems of equations. To evaluate the performance of the parallel algorithm, speedup and efficiency are presented. The results reveal that the proposed algorithm is practical and efficient.
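As background for the building block involved, a serial Gram-Schmidt QR solve of a small tridiagonal system looks like this. It is a toy sketch of the underlying orthogonalization-based solve, not the paper's partitioned parallel scheme:

```python
def gram_schmidt_solve(A, b):
    """Solve A x = b via classical Gram-Schmidt QR factorisation.
    Pure-Python, dense, and serial; fine for small well-conditioned
    systems such as the tridiagonal example below."""
    n = len(A)
    cols = [[A[i][j] for i in range(n)] for j in range(n)]
    Q, R = [], [[0.0] * n for _ in range(n)]
    for j in range(n):
        v = list(cols[j])
        for i in range(j):
            # Project column j onto the already-orthonormal q_i.
            R[i][j] = sum(Q[i][k] * cols[j][k] for k in range(n))
            v = [v[k] - R[i][j] * Q[i][k] for k in range(n)]
        R[j][j] = sum(t * t for t in v) ** 0.5
        Q.append([t / R[j][j] for t in v])
    # Back-substitute the upper-triangular system R x = Q^T b.
    qtb = [sum(Q[j][k] * b[k] for k in range(n)) for j in range(n)]
    x = [0.0] * n
    for j in range(n - 1, -1, -1):
        x[j] = (qtb[j] - sum(R[j][k] * x[k] for k in range(j + 1, n))) / R[j][j]
    return x

# Tridiagonal test system: diagonal 2, off-diagonals -1; solution [1, 1, 1].
A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
x = gram_schmidt_solve(A, [1.0, 0.0, 1.0])
print(x)
```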
An emerging datacenter network (DCN) with high scalability, called HSDC, is a server-centric DCN that can help cloud computing support many inherent cloud services; for example, a server-centric DCN can initiate routing for data transmission. This paper investigates the construction of independent spanning trees (ISTs for short), a set of rooted spanning trees with the disjoint-path property, in HSDC. Regarding multiple spanning trees as a routing protocol, ISTs have applications in data transmission, e.g., fault-tolerant broadcasting and secure message distribution. We first establish the vertex-symmetry of HSDC. Then, using the structure of the n-dimensional HSDC as a compound graph of an n-dimensional hypercube Q(n) and the n-clique K(n), we adapt the algorithm constructing ISTs for Q(n) to obtain the algorithm required by HSDC. Unlike most algorithms that recursively construct tree structures, our algorithm can find every node's parent in each spanning tree directly, via an easy computation relying only on the node address and the tree index. Consequently, we can implement the algorithm for constructing n ISTs in O(nN) time, where N = n*2^n is the number of vertices of the n-dimensional HSDC, or parallelize the algorithm to run in O(n) time using N processors. Remarkably, the diameter of the constructed ISTs is about twice the diameter of Q(n). (C) 2021 Elsevier Inc. All rights reserved.
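The flavour of computing a node's parent directly from its address, with no recursion or traversal, can be illustrated on a single spanning tree of the hypercube Q(n); the paper's actual IST construction is more elaborate and also uses a tree index:

```python
def hypercube_parent(v):
    """Parent of node v in a binomial spanning tree of the hypercube
    Q(n), rooted at node 0: clear the lowest set bit of the address.
    The parent is obtained in O(1) directly from the node address."""
    assert v != 0, "the root has no parent"
    return v & (v - 1)

n = 3
edges = [(v, hypercube_parent(v)) for v in range(1, 2 ** n)]
print(edges)
# Every edge flips exactly one bit, so it is a genuine Q(n) edge, and
# each parent step drops one set bit, so every node reaches root 0.
```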
We propose a parallel version of the cross interpolation algorithm and apply it to high-dimensional integrals motivated by the Ising model in quantum physics. In contrast to mainstream approaches such as Monte Carlo and quasi-Monte Carlo, the samples computed by our algorithm are neither random nor arranged on a regular lattice. Instead, we evaluate the given function along individual dimensions (modes) and use these values to reconstruct its behaviour over the whole domain. The positions of the computed univariate fibres are chosen adaptively for the given function. The required evaluations can be executed in parallel both along each mode (variable) and across all modes. To demonstrate the efficiency of the proposed method, we apply it to high-dimensional Ising susceptibility integrals, which arise from asymptotic expansions for the spontaneous magnetisation in the two-dimensional Ising model of ferromagnetism. We observe strong superlinear convergence of the proposed method, whereas the MC and qMC algorithms converge sublinearly. Using multiple-precision arithmetic, we also observe exponential convergence of the proposed algorithm. Combining high-order convergence, almost perfect scalability up to hundreds of processes, and the same flexibility as MC and qMC, the proposed algorithm can become a method of choice for problems involving high-dimensional integration, e.g. in statistics, probability, and quantum physics. (C) 2019 The Authors. Published by Elsevier B.V.
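The idea of reconstructing a high-dimensional integral from univariate fibres is easiest to see in the special case of an exactly rank-1 integrand, where d independent 1-D quadratures (one per mode, trivially parallel) already give the full answer; cross interpolation generalises this to functions that are only approximately low-rank:

```python
import math

def integrate_1d(g, a, b, m=1000):
    """Composite trapezoidal rule on [a, b] with m panels."""
    h = (b - a) / m
    s = 0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, m))
    return s * h

def integrate_rank1(g, a, b, d, m=1000):
    """Integrate f(x_1..x_d) = g(x_1) * ... * g(x_d) over [a, b]^d.
    Only d univariate quadratures are needed, one per mode, so the
    cost is O(d*m) function values instead of O(m**d)."""
    return integrate_1d(g, a, b, m) ** d

# Integral of exp(-(x1+...+x10)) over [0, 1]^10 equals (1 - 1/e)**10.
approx = integrate_rank1(lambda t: math.exp(-t), 0.0, 1.0, 10)
exact = (1.0 - math.exp(-1.0)) ** 10
print(approx, exact)
```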
A parallel particle swarm algorithm designed to solve a class of combinatorial optimization problems was presented to overcome the heavy computation time of the general serial algorithm. The parallel algorithm ran asynchronously, dividing the whole particle swarm into several sub-swarms and updating particle velocities with a variety of local optima. A local search strategy that prevents particles from oscillating in the neighborhood of an optimum was proposed. The parallel algorithm's validity was demonstrated by a simulation comparison with other algorithms. It was also applied to hot-rolling planning, where a satisfactory production result was achieved.
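A generic sub-swarm PSO skeleton looks roughly like the following. This is a sketch on a continuous sphere function (the paper targets a combinatorial problem), the sub-swarms run one after another rather than asynchronously in parallel, and all parameter values are made up:

```python
import random

def subswarm_pso(f, dim, n_swarms=2, swarm_size=5, iters=100, seed=1):
    """Particle swarm optimisation with several sub-swarms.  Each
    sub-swarm keeps its own local best and evolves independently
    (asynchronously, in the parallel setting); the global best is
    taken over all sub-swarms at the end."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5          # inertia and pull coefficients
    best_x, best_v = None, float("inf")
    for _ in range(n_swarms):
        xs = [[rng.uniform(-5, 5) for _ in range(dim)]
              for _ in range(swarm_size)]
        vs = [[0.0] * dim for _ in range(swarm_size)]
        pb = [list(x) for x in xs]      # personal bests
        pb_v = [f(x) for x in xs]
        lb = list(pb[pb_v.index(min(pb_v))])   # sub-swarm local best
        for _ in range(iters):
            for i in range(swarm_size):
                for d in range(dim):
                    vs[i][d] = (w * vs[i][d]
                                + c1 * rng.random() * (pb[i][d] - xs[i][d])
                                + c2 * rng.random() * (lb[d] - xs[i][d]))
                    xs[i][d] += vs[i][d]
                val = f(xs[i])
                if val < pb_v[i]:
                    pb_v[i], pb[i] = val, list(xs[i])
                    if val < f(lb):
                        lb = list(xs[i])
        if f(lb) < best_v:
            best_v, best_x = f(lb), lb
    return best_x, best_v

x, v = subswarm_pso(lambda p: sum(t * t for t in p), dim=2)
print(v)  # close to the true minimum 0 of the sphere function
```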
PRAM is the most popular model of parallel computation. Of its three most commonly used variants, CREW is more powerful than EREW, and CRCW is the most powerful. BSR is another PRAM model, which is more powerful than CRCW. BSRk and BSR+ are models extending BSR, and in this paper a theorem is shown on the relation between BSRk and BSR+. (C) 1999 Elsevier Science B.V. All rights reserved.
Massively parallel, distributed-memory algorithms for the Lagrangian particle hydrodynamic method (Samulyak et al., 2018) have been developed, verified, and implemented. The key component of the parallel algorithms is a particle management module that includes parallel construction of octree databases, dynamic adaptation and refinement of octrees, and particle migration between parallel subdomains. The particle management module is based on the p4est (parallel forest of k-trees) library. The massively parallel Lagrangian particle code has been applied to a variety of fundamental-science and applied problems. A summary of its applications to the injection of impurities into thermonuclear fusion devices and to the simulation of supersonic hydrogen jets in support of laser-plasma wakefield acceleration research is also presented.
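One standard ingredient of octree-based particle management is linearising leaves with Morton (z-order) keys so that particles can be bucketed and the key range partitioned among ranks. A minimal sketch of that idea only, not the code's actual implementation:

```python
def morton3d(ix, iy, iz, depth):
    """Interleave the bits of three cell indices into a Morton
    (z-order) key, the ordering commonly used to linearise octree
    leaves for partitioning among parallel ranks."""
    key = 0
    for b in range(depth):
        key |= (((ix >> b) & 1) << (3 * b)
                | ((iy >> b) & 1) << (3 * b + 1)
                | ((iz >> b) & 1) << (3 * b + 2))
    return key

def bucket_particles(particles, depth):
    """Assign unit-cube particles to octree leaves at a fixed depth."""
    cells = 1 << depth
    buckets = {}
    for p in particles:
        idx = [min(int(c * cells), cells - 1) for c in p]
        buckets.setdefault(morton3d(*idx, depth), []).append(p)
    return buckets

pts = [(0.1, 0.1, 0.1), (0.9, 0.9, 0.9), (0.12, 0.13, 0.11)]
b = bucket_particles(pts, depth=2)
print(len(b))  # 2: the two nearby particles share one leaf
```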