ISBN: (print) 9781509044603
The problem of enumerating minimal unsatisfiable subsets (MUSes) of a constraint system is a natural candidate for parallelization: as an enumeration problem, it allows for concurrent solving of independent subproblems, and as a problem that is typically intractable w.r.t. completion (which parallelization cannot transcend), the speed or rate of output (which parallelization can improve) is often the most important performance characteristic. In this work, we explore the parallelization of partial MUS enumeration (aiming to enumerate some MUSes within given resource constraints) via two extensions to a recently developed sequential algorithm, one employing an existing parallel single-MUS extraction algorithm and the other parallelizing the entire enumeration algorithm, and we discuss variants and implementation details as well. Results of experiments run with up to 16 cores show that the full parallelization of the entire enumeration algorithm scales well, reaching an average of 92% of perfect scaling with 4 cores and 70% at 16 cores. Evaluating variants and implementation details illuminates how those choices impact performance, including a potentially counterintuitive result that sharing results between threads to avoid duplicate work is not beneficial in the general case.
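To put the scaling figures in concrete terms, the short sketch below converts "percent of perfect scaling" into an implied speedup. It assumes that the percentage is the usual parallel efficiency (speedup divided by core count); the abstract does not define the metric, and the function name is hypothetical.

```python
def speedup_from_efficiency(efficiency: float, cores: int) -> float:
    """Speedup implied by a fraction of perfect (linear) scaling.

    Assumption: efficiency = speedup / cores, the common definition.
    """
    return efficiency * cores


# Figures quoted in the abstract: 92% at 4 cores, 70% at 16 cores.
for cores, efficiency in [(4, 0.92), (16, 0.70)]:
    implied = speedup_from_efficiency(efficiency, cores)
    print(f"{cores} cores at {efficiency:.0%} efficiency: about {implied:.2f}x")
```

Under that assumption the reported averages correspond to roughly 3.68x on 4 cores and 11.2x on 16 cores.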
ISBN: (print) 9788494424403
This paper presents a novel recursive divide and conquer formulation for the simulation of complex constrained multibody system dynamics based on Hamilton's canonical equations (HDCA). The systems under consideration are subjected to holonomic constraints and may include serial chains, tree chains or closed-loop topologies. Although Hamilton's canonical equations exhibit many advantageous features compared to their acceleration-based counterparts, there appears to be a lack of dedicated parallel algorithms for multi-rigid-body system dynamics based on the Hamiltonian formulation. The developed HDCA formulation leads to a two-stage procedure. In the first stage, the approach uses the divide and conquer scheme, i.e. a hierarchic assembly-disassembly process that traverses the multibody system topology in a binary tree manner to evaluate the joint velocities and constraint impulsive loads. The process exhibits linear O(n) numerical cost (where n is the number of bodies) in a serial implementation and logarithmic O(log₂ n) cost in a parallel implementation. The time derivatives of the total momenta are evaluated directly in the second, parallelizable stage of the algorithm. Sample closed-loop test cases indicate very good constraint satisfaction at the position and velocity levels as well as marginal energy drift, without any additional constraint stabilization techniques in the solution process. The results are compared against the more standard acceleration-based Featherstone DCA approach to indicate the performance of the HDCA algorithm.
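The serial and parallel cost figures follow from the shape of a binary assembly tree, which the sketch below counts. It is only an illustration of the tree bookkeeping, not the HDCA equations: the function name is hypothetical, and each "merge" stands in for whatever work is done when two subassemblies are combined.

```python
from typing import Tuple


def assembly_cost(n_bodies: int) -> Tuple[int, int]:
    """Return (total pairwise merges, number of tree levels).

    Total merges model the serial cost; tree levels model the parallel
    cost when every merge on one level runs concurrently.
    """
    merges, levels = 0, 0
    remaining = n_bodies
    while remaining > 1:
        pairs = remaining // 2
        merges += pairs
        # An unpaired subassembly passes through to the next level.
        remaining = pairs + remaining % 2
        levels += 1
    return merges, levels


for n in (2, 8, 100, 1024):
    total, depth = assembly_cost(n)
    print(f"n={n}: {total} merges in total, {depth} levels")
```

For n bodies the count is always n - 1 merges spread over ceil(log2 n) levels, which matches the linear serial and logarithmic parallel cost stated in the abstract.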
In this Letter, by generalizing the notion of Zhang functions (ZFs) from previous work, a novel general-form Zhang function (NGFZF) is proposed, developed and investigated. Specifically, based on the NGFZF, infinitely many ZFs (as error functions) can be readily generated by selecting different values of its parameters. In addition, by employing the NGFZF, a novel general-form Zhang neural net (NGFZNN) is proposed and studied for the real-time solution of a time-varying matrix inverse (also termed the Zhang matrix inverse, ZMI). Moreover, a link between the ZMI and the Drazin inverse is discovered and further generalized to solve for the time-varying Drazin inverse (TVDI). © 2015 Elsevier B.V. All rights reserved.
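For orientation, the simplest and most widely used Zhang function for time-varying matrix inversion is sketched below. The abstract does not give the form of the NGFZF, so this is only the standard special case, with a linear activation and a design gain written here as \gamma; both are assumptions rather than details from the Letter.

```latex
% Error (Zhang) function and design formula
E(t) = A(t)X(t) - I, \qquad \dot{E}(t) = -\gamma E(t), \quad \gamma > 0
% Expanding \dot{E}(t) = \dot{A}(t)X(t) + A(t)\dot{X}(t) gives the implicit dynamics
A(t)\dot{X}(t) = -\dot{A}(t)X(t) - \gamma\bigl(A(t)X(t) - I\bigr)
```

With this choice the error decays as E(t) = E(0) e^{-\gamma t}, so X(t) converges to the inverse of A(t) exponentially whenever A(t) remains nonsingular.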
Microblog systems contain large-scale data that must be processed in real time. Processing large-scale microblog data needs high-performance computing architectures and parallel algorithms. Graphic processin...
ISBN: (print) 9789881404701
A three-dimensional thermophysical model of abrasive treatment processes was formed by integrating the heat source effectiveness function with the Green function for a half-space. The resulting model was implemented numerically as a software package for stochastic simulation of the thermophysics of abrasive treatment. The software package is divided into three units; each of them is a set of nested loops (nesting level from 2 to 4). Loops at nesting level 2 can be transformed equivalently into a form with independent iterations. The top-level loop can be parallelized provided the input data are synchronized at the beginning of each iteration. The parallel implementation of the software package combines MPI and OpenMP. OpenMP is used to decompose the lower-level loops; parallelization efficiency at this level is determined by the core cache memory and by RAM access speed. MPI is used to parallelize the upper-level loops and is employed for high-accuracy calculations. The results demonstrate a substantial decrease in computation time. For example, 500 reductions under average process input parameters of round grinding with radial feed take approximately 0.5 to 1.2 minutes, whereas a high-performance PC processed 50 reductions in 520 minutes.
ISBN: (print) 9781509012893
Dictionary learning for sparse representations is traditionally approached with sequential atom updates, in which an optimized atom is used immediately for the optimization of the next atoms. We propose instead a Jacobi version, in which groups of atoms are updated independently, in parallel. Extensive numerical evidence for sparse image representation shows that the parallel algorithms, especially when all atoms are updated simultaneously, give better dictionaries than their sequential counterparts.
ISBN: (print) 9781509036837
The maximum common subgraph of two graphs, G1 and G2, is the largest subgraph in G1 that is isomorphic to a subgraph in G2. Finding the maximum common subgraph of two given graphs is known to be an NP-complete problem. An exact solution for the maximum common subgraph problem can be found by an algorithm that transforms the maximum common subgraph problem into a maximal clique enumeration problem. However, as the size of the graph increases, the solution space of the maximal clique enumeration problem increases combinatorially. A serial solution to the computationally intensive problem of complete maximal clique enumeration is tedious. This paper presents a parallel approach that uses a graphics processing unit (GPU) to compute the maximum common subgraph of the given graphs. The parallel procedure achieves a more than tenfold improvement in computational performance. As an application of the proposed parallel maximum common subgraph algorithm, two new tools, LIGANDMATCHER and GRAPHSCREEN, are developed. These tools can be used to narrow down the large ligand search space to a small number of candidates in the screening phase of the drug discovery process.
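The reduction from maximum common subgraph to maximal clique enumeration can be sketched serially as follows. This assumes the textbook construction (the modular product graph, whose cliques correspond to common induced subgraphs) and plain Bron-Kerbosch enumeration; the abstract does not say which construction the paper uses, the function names are hypothetical, and nothing here reflects the GPU implementation.

```python
from itertools import combinations


def modular_product(adj1, adj2):
    """Build the modular product of two graphs given as adjacency dicts.

    Vertices are pairs (u, v). Two pairs are adjacent when they use
    distinct vertices on both sides and agree on adjacency: u1-u2 is an
    edge of G1 exactly when v1-v2 is an edge of G2.
    """
    nodes = [(u, v) for u in adj1 for v in adj2]
    prod = {p: set() for p in nodes}
    for (u1, v1), (u2, v2) in combinations(nodes, 2):
        if u1 != u2 and v1 != v2 and ((u2 in adj1[u1]) == (v2 in adj2[v1])):
            prod[(u1, v1)].add((u2, v2))
            prod[(u2, v2)].add((u1, v1))
    return prod


def maximal_cliques(adj):
    """Yield every maximal clique (Bron-Kerbosch, no pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield r
            return
        for v in list(p):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}
    yield from expand(set(), set(adj), set())


def maximum_common_induced_subgraph(adj1, adj2):
    """Return the largest vertex matching as a set of (u, v) pairs."""
    return max(maximal_cliques(modular_product(adj1, adj2)), key=len)


triangle = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b"}}
path = {1: {2}, 2: {1, 3}, 3: {2}}
print(len(maximum_common_induced_subgraph(triangle, path)))
```

The product graph has |V1| x |V2| vertices, which is why the clique search space grows so quickly with graph size and why parallel enumeration is attractive.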
Probabilistic load flow computed by serial Monte-Carlo simulation is time-consuming because of the large number of simulation runs. It's difficult to meet the demand for rapid analysis and calculation of large-scale pow...
ISBN: (print) 9781450332057
Verifying the correctness of the executions of a concurrent program is difficult because of its nondeterministic behavior. One of the verification methods is predicate detection, which predicts whether a user-specified condition (predicate) could become true in any global state of the program. The method is predictive because it generates inferred execution paths from the observed execution path and then checks the predicate on the global states of the inferred paths. One important part of predicate detection is global state enumeration, which generates the global states on inferred paths. Cooper and Marzullo gave the first enumeration algorithm, based on a breadth-first strategy (BFS). Since then, many algorithms have been proposed to improve space and time complexity. None of them, however, takes parallelism into consideration. In this paper, we present the first parallel and online algorithm, named ParaMount, for global state enumeration. Our experimental results show that ParaMount speeds up the existing sequential algorithms by a factor of 6 with 8 threads. We have implemented an online predicate detector using ParaMount. For predicate detection, our detector based on ParaMount is 10 to 50 times faster than RV runtime (a verification tool that uses Cooper and Marzullo's BFS enumeration algorithm). Copyright 2015 ACM.
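A serial, offline sketch of breadth-first global state enumeration may help make the baseline concrete. It is not ParaMount and is not taken from the paper: it assumes each event carries a vector clock, represents a global state as a tuple counting how many events of each process are included, and uses hypothetical function names.

```python
from collections import deque


def consistent(cut, clocks):
    """A cut is consistent if the causal past of every included event is included.

    clocks[i][k] is the vector clock of the (k+1)-th event on process i.
    Checking the last included event per process suffices because vector
    clocks only grow along a process.
    """
    for i, count in enumerate(cut):
        if count > 0:
            vc = clocks[i][count - 1]
            if any(vc[j] > cut[j] for j in range(len(cut))):
                return False
    return True


def enumerate_global_states(clocks):
    """Breadth-first enumeration of all consistent global states."""
    start = tuple(0 for _ in clocks)
    seen, queue, order = {start}, deque([start]), []
    while queue:
        cut = queue.popleft()
        order.append(cut)
        for i in range(len(cut)):
            if cut[i] < len(clocks[i]):
                # Advance process i by one event.
                nxt = cut[:i] + (cut[i] + 1,) + cut[i + 1:]
                if nxt not in seen and consistent(nxt, clocks):
                    seen.add(nxt)
                    queue.append(nxt)
    return order


# Two processes; the first event of P1 receives a message sent by the
# first event of P0, so P1 cannot advance before P0 has.
clocks = [
    [(1, 0), (2, 0)],  # P0
    [(1, 1), (1, 2)],  # P1
]
for state in enumerate_global_states(clocks):
    print(state)
```

A predicate detector would evaluate the user-specified predicate on each enumerated state; the `seen` set is what lets breadth-first search avoid revisiting a state reached along different interleavings.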
Highly accurate numerical methods that can efficiently handle problems with interfaces and/or problems in domains with complex geometry are crucial for the resolution of different temporal and spatial scales in many problems from physics and biology. In this paper we continue the work started in [8], and we use modest one-dimensional parabolic problems as the initial step towards the development of high-order accurate methods based on the Difference Potentials approach. The designed methods are well suited to variable-coefficient parabolic models in heterogeneous media and/or models with non-matching interfaces and with non-matching grids. Numerical experiments are provided to illustrate the high-order accuracy and efficiency of the developed schemes. While the method and analysis are simpler in the one-dimensional setting, they illustrate and test several important ideas and capabilities of the developed approach. © 2014 Published by Elsevier B.V. on behalf of IMACS.