This paper proposes a class of graph association rules, denoted by GARs, to specify regularities between entities in graphs. A GAR is a combination of a graph pattern and a dependency; it may take as predicates ML (machine learning) classifiers for link prediction. We show that GARs help us catch incomplete information in schemaless graphs, predict links in social graphs, identify potential customers in digital marketing, and extend graph functional dependencies (GFDs) to capture both missing links and inconsistencies. We formalize association deduction with GARs in terms of the chase, and prove its Church-Rosser property. We show that the satisfiability, implication, and association deduction problems for GARs are coNP-complete, NP-complete, and NP-complete, respectively, retaining the same complexity bounds as their GFD counterparts despite the increased expressive power of GARs. The incremental deduction problem is DP-complete for GARs versus coNP-complete for GFDs. In addition, we provide parallel algorithms for association deduction and incremental deduction. Using real-life and synthetic graphs, we experimentally verify the effectiveness, scalability, and efficiency of the parallel algorithms.
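To make chase-based deduction concrete, here is a minimal, hypothetical sketch, not the paper's GAR machinery (no ML predicates; rules are plain edge-pattern implications): rules are applied until a fixpoint, and the Church-Rosser property means the fixpoint does not depend on the order of application.

```python
def chase(edges, rules):
    """edges: set of (src, label, dst) facts.
    rules: list of (premises, conclusion), where premises is a tuple of
    edge patterns whose variables start with '?' and conclusion is one
    such pattern.  Applies rules until no new fact can be derived."""
    facts = set(edges)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # materialize the matches before mutating `facts`
            for binding in list(match_all(premises, facts, {})):
                new = substitute(conclusion, binding)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

def match_all(premises, facts, binding):
    """Yield all variable bindings that satisfy every premise."""
    if not premises:
        yield dict(binding)
        return
    first, rest = premises[0], premises[1:]
    for fact in facts:
        b = unify(first, fact, binding)
        if b is not None:
            yield from match_all(rest, facts, b)

def unify(pattern, fact, binding):
    """Extend `binding` so that `pattern` matches `fact`, or return None."""
    b = dict(binding)
    for p, f in zip(pattern, fact):
        if p.startswith('?'):
            if b.get(p, f) != f:
                return None
            b[p] = f
        elif p != f:
            return None
    return b

def substitute(pattern, binding):
    return tuple(binding.get(p, p) for p in pattern)
```

For example, a rule stating that two people working at the same company are colleagues derives the missing `colleague` edges from `works_at` facts.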
Partitioning graphs into blocks of roughly equal size such that few edges run between blocks is a frequently needed operation in graph processing. Recently, the size, variety, and structural complexity of these networks have grown dramatically. Unfortunately, previous approaches to parallel graph partitioning struggle in this context, since they often exhibit a negative trade-off between speed and quality. We present an approach to multi-level shared-memory parallel graph partitioning that produces balanced solutions, shows high speedups for a variety of large graphs, and yields very good quality independently of the number of cores used. For example, in an extensive experimental study at 79 cores, one of our closest competitors is faster but fails to meet the balance criterion in the majority of cases, while another is mostly slower and incurs cut sizes about 13 percent larger. Important ingredients include parallel label propagation for both coarsening and refinement, parallel initial partitioning, a simple yet effective approach to parallel localized local search, and fast locality-preserving hash tables.
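As an illustration of one ingredient the abstract names, below is a hedged sketch of size-constrained label propagation on a toy graph; the real partitioner runs this in parallel inside a multi-level scheme, which the sketch omits.

```python
from collections import Counter

def label_propagation(adj, max_block_size, rounds=10):
    """adj: dict vertex -> set of neighbours.  Each vertex repeatedly
    adopts the most frequent label among its neighbours, skipping moves
    that would push a block past `max_block_size`."""
    labels = {v: v for v in adj}           # every vertex starts alone
    size = Counter(labels.values())
    for _ in range(rounds):
        moved = False
        for v in adj:
            freq = Counter(labels[u] for u in adj[v])
            if not freq:
                continue                   # isolated vertex
            # break frequency ties deterministically by smallest label
            best = max(freq, key=lambda l: (freq[l], -l))
            if best != labels[v] and size[best] + 1 <= max_block_size:
                size[labels[v]] -= 1
                size[best] += 1
                labels[v] = best
                moved = True
        if not moved:
            break
    return labels
```

On two triangles joined by a single edge, the size constraint of 3 steers the process toward the natural two-block partition with a cut of one edge.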
We consider the asynchronous prediction problem for an automaton: given an initial configuration, determine whether there is a non-zero probability that a selected site changes its state when the network is updated by picking one site at a time uniformly at random. We show that for the majority automaton, the asynchronous prediction problem is in NC on the two-dimensional lattice with the von Neumann neighborhood. We then show that in three or more dimensions the problem is NP-complete. (C) 2020 Elsevier Inc. All rights reserved.
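For intuition, asynchronous prediction on a tiny grid can be decided by brute force: every finite update sequence occurs with non-zero probability, so the question is exactly whether a configuration in which the selected site has flipped is reachable by single-site majority updates. This enumeration is exponential in the grid size and is only an illustration, not the paper's NC algorithm; the tie rule used at boundary sites is an assumption here.

```python
from collections import deque

def can_flip(initial, site):
    """initial: tuple of tuples of 0/1; site: (row, col).
    True iff some sequence of asynchronous strict-majority updates
    (von Neumann neighbourhood; ties keep the current state) changes
    the state of `site`."""
    n, m = len(initial), len(initial[0])

    def updated(cfg, i, j):
        cells = [(i, j), (i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
        vals = [cfg[a][b] for a, b in cells if 0 <= a < n and 0 <= b < m]
        ones = sum(vals)
        zeros = len(vals) - ones
        new = 1 if ones > zeros else 0 if zeros > ones else cfg[i][j]
        if new == cfg[i][j]:
            return cfg
        row = list(cfg[i]); row[j] = new
        return cfg[:i] + (tuple(row),) + cfg[i + 1:]

    start = tuple(map(tuple, initial))
    seen, queue = {start}, deque([start])
    while queue:                       # BFS over reachable configurations
        cfg = queue.popleft()
        if cfg[site[0]][site[1]] != start[site[0]][site[1]]:
            return True
        for i in range(n):
            for j in range(m):
                nxt = updated(cfg, i, j)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False
```

A lone 1 surrounded by 0s can be erased by one update of its own site, whereas an all-0 grid is frozen.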
This paper proposes a parallel algorithm for electromagnetic-electromechanical group decoupling based on electromechanical-electromagnetic dual master control, including the hybrid simulation architecture of th...
Distributed and parallel algorithms have been investigated frequently in recent years, in particular in applications such as machine learning. Nonetheless, only a small subclass of the optimization algorithms in the literature can be easily distributed, owing to the presence of, e.g., coupling constraints that make all the variables dependent on each other through the feasible set. Augmented Lagrangian methods are among the most widely used techniques to remove coupling constraints, namely by moving such constraints into the objective function in a structured, well-studied manner. Unfortunately, standard augmented Lagrangian methods require solving a nested problem, i.e., (at least inexactly) solving a subproblem at each iteration, which can make the algorithm inefficient. To fill this gap, we propose an augmented Lagrangian method for convex problems with linear coupling constraints that can be distributed and requires only a single gradient projection step at every iteration. We give a formal convergence proof to at least epsilon-approximate solutions of the problem and a detailed analysis of how the parameters of the algorithm influence the value of the approximation parameter epsilon. Furthermore, we introduce a distributed version of the algorithm that partitions the data and performs the computation in parallel.
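The flavor of such a single-loop scheme can be sketched as follows. This is a simplified primal-dual iteration chosen for illustration, not the authors' algorithm: one projected gradient step on the augmented Lagrangian per iteration, followed by a dual update, for min 0.5*||x - c||^2 subject to the coupling constraint sum(x) = b and box constraints 0 <= x_i <= 10.

```python
def solve(c, b, rho=1.0, alpha=0.1, iters=10000):
    """Single projected gradient step per iteration on the augmented
    Lagrangian L(x, lam) = f(x) + lam*(sum(x)-b) + (rho/2)*(sum(x)-b)^2,
    then a dual ascent step.  No inner subproblem is ever solved."""
    x = [0.0] * len(c)
    lam = 0.0
    for _ in range(iters):
        g = sum(x) - b                      # coupling-constraint residual
        # gradient of L wrt x_i is (x_i - c_i) + lam + rho*g;
        # take one step and project onto the box [0, 10]
        x = [min(10.0, max(0.0, xi - alpha * (xi - ci + lam + rho * g)))
             for xi, ci in zip(x, c)]
        lam += alpha * rho * (sum(x) - b)   # dual ascent step
    return x, lam
```

For c = [3, 1] and b = 2 the optimum is x = [2, 0] with multiplier 1, and the iteration converges there; note that each component update uses only x_i and the shared scalar g, which is what makes the scheme easy to distribute.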
Aircraft are complex systems with, in some cases, high-dimensional nonlinear interactions between control surfaces. When a failure occurs, adaptive flight control methods can be utilised to stabilise the aircraft and keep it controllable. Adaptive flight control methods, however, require accurate aerodynamic models in which first-order continuity is necessary for estimating the control derivatives and for mitigating chattering that can reduce the longevity of components. Additionally, high-dimensional offline model identification with current approaches can take several hours even for a few dimensions, so model iteration and hyper-parameter tuning are often not feasible. Current approaches to smooth high-dimensional function approximation are not scalable, require global communication between iteration steps, and are ill-conditioned in higher dimensions. This research develops the Distributed Asynchronous B-spline (DAB) algorithm, which is more robust to the ill-conditioning caused by low data coverage because it uses first-order methods with acceleration and weighted constraint application. The algorithm is also suitable for continuous state spaces. Smooth aerodynamic models can be determined in exactly n·r iterations, where r is the number of continuity equations in a single dimension and n is the number of dimensions. Moreover, memory reorganisation is proposed to avoid false sharing, together with conflict-free use of shared memory on the GPU, to ensure that the algorithm runs efficiently in parallel.
In this paper, we articulate a novel plastic phase-field (PPF) method that tightly couples the phase-field with plastic treatment to efficiently simulate ductile fracture with GPU optimization. At the theoretical level of physically-based modeling and simulation, our PPF approach assumes that the fracture sensitivity of the material increases with plastic strain accumulation. As a result, we first develop a hardening-related fracture toughness function for phase-field evolution. Second, we follow the associative flow rule and adopt a novel degraded von Mises yield criterion. In this way, we establish the tight coupling of the phase-field and plastic treatment, with which our PPF method can capture distinct elastoplasticity, necking, and fracture characteristics during ductile fracture simulation. At the numerical level, towards GPU optimization, we further devise an advanced parallel framework that takes full advantage of the hierarchical GPU architecture. Our strategy dramatically enhances the computational efficiency of preprocessing and phase-field evolution for our PPF method with the material point method (MPM). In our extensive experiments on a variety of benchmarks, our novel method achieves up to a 1.56x speedup over the baseline GPU MPM. Finally, our comprehensive simulation results confirm that the new PPF method can efficiently and realistically simulate complex ductile fracture phenomena in 3D interactive graphics and animation.
We describe a new Monte Carlo method, based on a multilevel method, for computing the action of the resolvent matrix on a vector. The method rests on the numerical evaluation of the Laplace transform of the matrix exponential, which is computed efficiently using a multilevel Monte Carlo method. Essentially, it requires generating suitable random paths that evolve through the indices of the matrix according to the probability law of a continuous-time Markov chain governed by the associated Laplacian matrix. We discuss the convergence of the proposed multilevel method and run several numerical examples to test the performance of the algorithm. These examples concern the computation of some metrics of interest in the analysis of complex networks, and the numerical solution of a boundary-value problem for an elliptic partial differential equation. In addition, the algorithm was conveniently parallelized, and its scalability was analyzed and compared with the results of other existing Monte Carlo methods for solving linear algebra systems.
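The core probabilistic idea can be sketched with a plain, single-level Monte Carlo estimator of the matrix-exponential action, using the identity (e^{tQ} v)_i = E[v(X_t) | X_0 = i] for a continuous-time Markov chain X with generator Q; the paper's multilevel construction and the Laplace-transform step that recovers the resolvent are omitted here.

```python
import random, math

def expm_action(Q, v, t, i, samples=20000, rng=random.Random(0)):
    """Estimate (e^{tQ} v)_i by simulating CTMC paths with generator Q:
    hold in each state for an Exp(-Q[s][s]) time, jump with probability
    proportional to the off-diagonal rates, and average v at time t."""
    total = 0.0
    for _ in range(samples):
        state, clock = i, 0.0
        while True:
            rate = -Q[state][state]
            if rate == 0.0:
                break                      # absorbing state
            clock += rng.expovariate(rate)
            if clock >= t:
                break                      # path has reached time t
            r = rng.random() * rate        # pick the jump target
            for j, q in enumerate(Q[state]):
                if j != state:
                    r -= q
                    if r <= 0:
                        state = j
                        break
        total += v[state]
    return total / samples
```

For the two-state chain with generator [[-1, 1], [1, -1]] (the negated Laplacian of a single edge), the exact value of (e^{tQ} v)_0 with v = [1, 0] is 0.5 + 0.5*e^{-2t}, which the estimator reproduces up to Monte Carlo error.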
An important concept in finite state machine based testing is synchronization, which is used to initialize an implementation to a particular state. Usually, synchronizing sequences are used for this purpose, and the length of the sequence matters since it determines the cost of the initialization process. Unfortunately, the shortest synchronizing sequence problem is NP-hard, so heuristics are used instead to generate short sequences. However, the cubic complexity of even the fastest heuristic algorithms can be a problem in practice. In order to scale the performance of the heuristics for generating short synchronizing sequences, we propose algorithmic improvements together with a parallel implementation of the cheapest heuristics in the literature. To identify the bottlenecks of these heuristics, we experimented on random and slowly synchronizing automata. The identified bottlenecks are then removed by algorithmic modifications. We also implement the techniques on multicore CPUs and Graphics Processing Units (GPUs) to take advantage of modern parallel computation architectures. The sequential implementations of the heuristic algorithms are compared to our parallel implementations using a test suite of 1200 automata. The speedup values obtained depend on the size and the nature of the automaton. In our experiments, we observe speedup values as high as 340x using 16-core CPU parallelization, and 496x using a GPU. Furthermore, the proposed methods scale well: the speedup values increase as the size of the automata increases. (C) 2020 Elsevier Inc. All rights reserved.
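A minimal sketch of a greedy synchronizing-sequence heuristic, in the spirit of Eppstein's classical algorithm and not necessarily the exact heuristics parallelized in the paper: repeatedly merge two of the remaining states via a shortest merging word found by BFS on the pair automaton.

```python
from collections import deque

def greedy_synchronizing_word(delta, n_states, alphabet):
    """delta[s][a] -> next state.  Returns a synchronizing word as a
    list of letters, or None if the automaton is not synchronizing."""
    def merge_word(p, q):
        # BFS on unordered state pairs for a shortest word mapping
        # p and q to the same state
        start = (min(p, q), max(p, q))
        seen = {start: None}
        queue = deque([start])
        while queue:
            pair = queue.popleft()
            if pair[0] == pair[1]:
                word = []                # reconstruct the word backwards
                while seen[pair] is not None:
                    pair, a = seen[pair]
                    word.append(a)
                return word[::-1]
            for a in alphabet:
                nxt = tuple(sorted((delta[pair[0]][a], delta[pair[1]][a])))
                if nxt not in seen:
                    seen[nxt] = (pair, a)
                    queue.append(nxt)
        return None                      # this pair can never be merged

    current = set(range(n_states))
    word = []
    while len(current) > 1:
        p, q = sorted(current)[:2]
        w = merge_word(p, q)
        if w is None:
            return None
        word += w
        current = {apply_word(delta, s, w) for s in current}
    return word

def apply_word(delta, state, word):
    for a in word:
        state = delta[state][a]
    return state
```

On the 4-state Cerny-style automaton below (letter 'a' rotates the states; 'b' maps state 3 to 0 and fixes the rest), the heuristic returns a word that collapses all states to one, though not necessarily the shortest such word.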
We present a Weak Constraint Gaussian Process (WCGP) model to integrate noisy inputs into the classical Gaussian Process (GP) predictive distribution. The model follows a data assimilation approach, i.e., it considers the information provided by observed values of a noisy input within a time window. Due to the large number of states processed in real applications and the time complexity of GP algorithms, the problem demands a solution in a high-performance computing environment. In this paper, parallelism is explored by defining a parallel WCGP model based on domain decomposition. Both a mathematical formulation of the model and a parallel algorithm are provided. We use the algorithm for an optimal sensor placement problem. Experimental results are provided for pollutant dispersion within a real urban environment. (C) 2020 Elsevier B.V. All rights reserved.
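For reference, the classical GP predictive mean that the WCGP model extends can be sketched in a few lines; the RBF kernel and its hyperparameters are arbitrary choices here, and this sketch does not include the weak-constraint treatment of noisy inputs or the domain decomposition.

```python
import math

def rbf(x, y, ell=1.0):
    """Squared-exponential kernel with length scale ell."""
    return math.exp(-0.5 * ((x - y) / ell) ** 2)

def gp_mean(train_x, train_y, test_x, noise=1e-6):
    """Classical GP predictive mean: k_*^T (K + noise*I)^{-1} y."""
    n = len(train_x)
    K = [[rbf(train_x[i], train_x[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    alpha = solve(K, list(train_y))
    return [sum(rbf(x, train_x[i]) * alpha[i] for i in range(n))
            for x in test_x]

def solve(A, b):
    """Naive Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c]
                              for c in range(r + 1, n))) / M[r][r]
    return x
```

With a tiny noise term, the predictive mean interpolates the training targets, which gives a quick sanity check of the linear algebra.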