Although the resultant elimination method can find all possible solutions of the selective harmonic elimination (SHE) problem without requiring initial values, it has serious shortcomings: a high computational burden and large memory consumption, both caused by intermediate expression swell when computing the symbolic determinant of the Sylvester matrix. Based on the principle of polynomial interpolation, an algorithmic framework is proposed to compute the resultant polynomials in two major steps: evaluation at numerical interpolation points and solution of linear equations. This approach avoids symbolic computation, whose complexity is usually very high; furthermore, both steps are well suited to parallel implementation, which can speed up the computation considerably. Using the extended n-dimensional Björck-Pereyra algorithm, the framework is implemented on a parallel computing system and has been used to solve the SHE equations for two-level, three-level, and multilevel inverters. Because the algorithm finds all possible solutions, the optimal solutions, those with the lowest total harmonic distortion, can be identified. Experimental results verify the correctness and effectiveness of the proposed method.
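The two steps named in this abstract (numerical evaluation at interpolation points, then a linear solve) can be illustrated with a minimal one-variable sketch. It is not the paper's implementation: the example polynomials are invented for illustration rather than taken from SHE equations, the function names are hypothetical, and the interpolation step uses the classical one-dimensional divided-difference scheme instead of the extended n-dimensional algorithm the authors describe. Exact rational arithmetic keeps the result reproducible.

```python
from fractions import Fraction


def sylvester(f, g):
    """Sylvester matrix of f and g (coefficient lists, highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    f = [Fraction(c) for c in f]
    g = [Fraction(c) for c in g]
    rows = []
    for i in range(n):  # n shifted copies of f
        rows.append([Fraction(0)] * i + f + [Fraction(0)] * (size - m - 1 - i))
    for i in range(m):  # m shifted copies of g
        rows.append([Fraction(0)] * i + g + [Fraction(0)] * (size - n - 1 - i))
    return rows


def det(mat):
    """Exact determinant by Gaussian elimination with row swaps."""
    a = [row[:] for row in mat]
    n = len(a)
    d = Fraction(1)
    for c in range(n):
        p = next((r for r in range(c, n) if a[r][c] != 0), None)
        if p is None:
            return Fraction(0)
        if p != c:
            a[c], a[p] = a[p], a[c]
            d = -d
        d *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= factor * a[c][k]
    return d


def interpolate(xs, ys):
    """Monomial coefficients (ascending) of the polynomial through (xs, ys).

    Stage 1 builds Newton divided differences; stage 2 converts the Newton
    form to monomial coefficients, which solves the Vandermonde system.
    """
    a = [Fraction(y) for y in ys]
    n = len(xs)
    for k in range(n - 1):
        for i in range(n - 1, k, -1):
            a[i] = (a[i] - a[i - 1]) / (xs[i] - xs[i - k - 1])
    for k in range(n - 2, -1, -1):
        for i in range(k, n - 1):
            a[i] -= a[i + 1] * xs[k]
    return a


def resultant_by_interpolation(f_at, g_at, degree_bound):
    """Resultant in y of f(x, y) and g(x, y), eliminating x numerically.

    f_at(y) and g_at(y) return the x-coefficients at a numeric y. Each
    evaluation is independent, so this loop is the part that could be
    distributed across workers.
    """
    xs = [Fraction(i) for i in range(degree_bound + 1)]
    ys = [det(sylvester(f_at(y), g_at(y))) for y in xs]
    return interpolate(xs, ys)


# Illustrative system (an assumption, not from the paper):
#   f = x^2 + y^2 - 5,  g = x*y - 2
def f_at(y):
    return [1, 0, y * y - 5]


def g_at(y):
    return [y, -2]
```

Calling `resultant_by_interpolation(f_at, g_at, 4)` returns the coefficients of y^4 - 5y^2 + 4 in ascending order, whose roots y = ±1, ±2 are the y-components of the four solutions of the example system. No symbolic determinant is ever formed.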
In [18], a membrane parallel theoretical framework for computing (co)homology information of the foreground or background of binary digital images was developed. Starting from this work, we progress here in two directions: (a) providing advanced topological information, such as (co)homology torsion, and efficiently answering any decision or classification problem about whether a sum of k-xels is a (co)cycle or a (co)boundary; (b) optimizing the previous framework for implementation using GPGPU computing. Discrete Morse theory, effective homology theory and parallel computing techniques are suitably combined to obtain a homological encoding, called an algebraic minimal model, of a region of interest (seen as a cubical complex) of a presegmented k-D digital image. (C) 2016 Elsevier B.V. All rights reserved.
Frequent sequence mining is a well-known and well-studied problem in data mining. Its output is used in many other areas, such as bioinformatics, chemistry, and market basket analysis. Unfortunately, frequent sequence mining is computationally expensive. In this paper, we present a novel parallel algorithm for mining frequent sequences based on static load balancing. The static load balancing is done by measuring the computational time using a probabilistic algorithm. For instances of reasonable size, the algorithm achieves speedups of up to approximately 3/4 · P, where P is the number of processors. In the experimental evaluation, we show that our method performs significantly better than the current state-of-the-art methods. The presented approach is quite general: it can be used for static load balancing of other pattern mining algorithms, such as itemset, tree, and graph mining algorithms.
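To make the idea of static load balancing concrete, the sketch below distributes work units across P processors once, before any mining starts, using estimated costs. The abstract does not describe how the estimates or the assignment are computed, so this uses the well-known longest-processing-time-first (LPT) greedy heuristic as a stand-in. The cost values and function names are hypothetical.

```python
import heapq


def lpt_assign(costs, num_procs):
    """Assign tasks to processors by the LPT greedy rule.

    costs[i] is the estimated running time of task i (assumed to come from
    some prior estimation step). Returns (assignment, loads), where
    assignment[p] lists the task indices given to processor p.
    """
    heap = [(0.0, proc) for proc in range(num_procs)]  # (current load, id)
    heapq.heapify(heap)
    assignment = [[] for _ in range(num_procs)]
    # Place the most expensive tasks first, always on the least-loaded processor.
    for task in sorted(range(len(costs)), key=lambda t: costs[t], reverse=True):
        load, proc = heapq.heappop(heap)
        assignment[proc].append(task)
        heapq.heappush(heap, (load + costs[task], proc))
    loads = [sum(costs[t] for t in tasks) for tasks in assignment]
    return assignment, loads


def estimated_speedup(costs, loads):
    """Sequential time divided by the busiest processor's time."""
    return sum(costs) / max(loads)
```

For example, `lpt_assign([7, 5, 4, 3, 3, 2], 3)` yields loads of 9, 8 and 7, an estimated speedup of 24/9 ≈ 2.67 on three processors. The quality of a static scheme like this depends entirely on how accurate the cost estimates are, which is why the estimation step matters.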
In this work, we study the efficiency of an OpenFOAM-based parallel solver developed for the simulation of heat transfer in and around electrical power cables. The first benchmark problem considers three cables buried directly in the soil. We study and compare the efficiency of the conjugate gradient solver with the diagonal incomplete Cholesky (DIC) preconditioner, the generalized geometric-algebraic multigrid (GAMG) solver from OpenFOAM, and the conjugate gradient solver with GAMG used as a preconditioner. The convergence and parallel scalability of the solvers are presented and analyzed on quadrilateral and acute triangle meshes. The second benchmark problem considers a more complicated case in which the cables are placed in plastic pipes that are buried in the soil. A coupled multi-physics problem describing the heat transfer in the cables, air and soil is then solved. A non-standard parallelization approach is presented for the multi-physics solver. We show the robustness of the selected parallel preconditioners. Parallel numerical tests are performed on a cluster of multicore computers.
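As a rough illustration of what a preconditioned conjugate gradient iteration looks like, here is a minimal serial sketch on a one-dimensional steady heat-conduction model with a jump in conductivity. It is not OpenFOAM code. It uses a plain Jacobi (diagonal) preconditioner as a simple stand-in for the DIC and GAMG preconditioners compared in the abstract, and the grid, unit spacing, and function names are assumptions made for the example.

```python
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def tridiag_matvec(diag, off, v):
    """Multiply a symmetric tridiagonal matrix (diag, off) by vector v."""
    out = [d * x for d, x in zip(diag, v)]
    for i, o in enumerate(off):
        out[i] += o * v[i + 1]
        out[i + 1] += o * v[i]
    return out


def heat_system(face_k):
    """Finite-volume matrix for -d/dx(k du/dx) with unit spacing and
    zero Dirichlet ends; face_k holds the n+1 face conductivities."""
    n = len(face_k) - 1
    diag = [face_k[i] + face_k[i + 1] for i in range(n)]
    off = [-face_k[i + 1] for i in range(n - 1)]
    return diag, off


def pcg_jacobi(diag, off, b, tol=1e-10, max_iter=1000):
    """Conjugate gradients preconditioned with the matrix diagonal."""
    x = [0.0] * len(b)
    r = list(b)
    z = [ri / di for ri, di in zip(r, diag)]  # apply M^{-1} = diag^{-1}
    rz = dot(r, z)
    if rz == 0:  # zero right-hand side: x = 0 is already the solution
        return x, 0
    p = list(z)
    b_norm = sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        ap = tridiag_matvec(diag, off, p)
        alpha = rz / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * ai for ri, ai in zip(r, ap)]
        if sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = [ri / di for ri, di in zip(r, diag)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter
```

A typical call builds `diag, off = heat_system(face_k)` with a high conductivity in the "cable" cells and a low one in the "soil" cells, then runs `pcg_jacobi(diag, off, source)`. In a parallel solver the matrix-vector product and the dot products are the operations that get distributed. A stronger preconditioner reduces the iteration count but costs more per application.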
Software for implementing parallel genetic algorithms is presented in this article. The underlying genetic algorithm aims to locate the global minimum of a multidimensional function inside a rectangular hyperbox. The proposed software, named PDoublePop, implements a client-server model for parallel genetic algorithms with advanced features for the local genetic algorithms, such as an enhanced stopping rule, an advanced mutation scheme and periodic application of a local search procedure. The user may code the objective function in either C++ or Fortran 77. The method is tested on a series of well-known test functions and the results are reported.
Program summary
Program title: PDoublePop
Catalogue identifier: AFBJ_v1_0
Program summary URL: http://***/summaries/AFBLv1_***
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: GNU GPL v3
No. of lines in distributed program, including test data, etc.: 26048
No. of bytes in distributed program, including test data, etc.: 161286
Distribution format: ***
Programming language: GNU-C++, GNU-C, GNU Fortran 77, MPI.
Computer: The tool has been tested on Linux and FreeBSD. It is designed to be portable to all systems running the GNU C++ compiler with Open MPI or LAM MPI.
Operating system: Any running the GNU C++ compiler.
Has the code been vectorised or parallelized?: Yes
RAM: 200KB
Classification: 4.9, 4.12, 6.5.
Nature of problem: Many problems in science and engineering can be formulated as the minimization of a function of many variables. So-called local optimization techniques are frequently trapped in local minima, that is, suboptimal solutions. For that reason, researchers should use more advanced methods that aim to estimate the global minimum of the function.
Solution method: A stopping rule and a periodical application of local search are utilized in conjunction with parallel genetic algorithms in o
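The combination of a genetic algorithm with periodic local search inside a rectangular box can be sketched as follows. This is a serial toy and not PDoublePop's code: it runs for a fixed number of generations instead of the enhanced stopping rule mentioned above, the mutation and local-search schemes are simple placeholders, and all names and parameter values are assumptions.

```python
import random


def genetic_minimize(f, bounds, pop_size=30, generations=40,
                     local_every=10, seed=0):
    """Minimize f over the box `bounds` = [(lo, hi), ...].

    Returns (best_point, history), where history records the best
    objective value seen at each generation.
    """
    rng = random.Random(seed)

    def clip(x):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(x, bounds)]

    def local_search(x, step=0.1, rounds=20):
        # Simple coordinate pattern search; never returns a worse point.
        best, f_best = x, f(x)
        for _ in range(rounds):
            improved = False
            for d in range(len(x)):
                for s in (step, -step):
                    cand = best[:]
                    cand[d] += s
                    cand = clip(cand)
                    f_cand = f(cand)
                    if f_cand < f_best:
                        best, f_best, improved = cand, f_cand, True
            if not improved:
                step /= 2
        return best

    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for gen in range(1, generations + 1):
        pop.sort(key=f)
        history.append(f(pop[0]))
        parents = pop[: pop_size // 2]
        children = [pop[0][:]]  # elitism: the best individual always survives
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            child = [ai if rng.random() < 0.5 else bi for ai, bi in zip(a, b)]
            child = [v + rng.gauss(0, 0.1 * (hi - lo)) if rng.random() < 0.1 else v
                     for v, (lo, hi) in zip(child, bounds)]
            children.append(clip(child))
        if gen % local_every == 0:  # periodic local refinement of the elite
            children[0] = local_search(children[0])
        pop = children
    best = min(pop, key=f)
    history.append(f(best))
    return best, history
```

Because the elite individual is copied unchanged (or improved by local search) into the next generation, the recorded best value never increases. In a client-server parallel variant, each client would plausibly run a loop like this on its own subpopulation while a server collects the best points, but that coordination is not shown here.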
A parallel 4D fMRI filtering algorithm is proposed to overcome three bottlenecks: the large 4D volumetric fMRI data and its overlapping segments, handled by input decimation; the intensive multidimensional computation, handled by parallel processing; and the boundary conditions, handled by output interpolation. Three spatial convolution architectures implement this parallel multidimensional filtering algorithm on a Virtex-6 FPGA board as automated 4D fMRI filtering systems. These three automated filtering systems are devised as "plug and develop" processors to filter any 4D volumetric data. Two sets of generic edge and noise-smoothing filtering operators are then plugged in as prototypes and developed further for filtering a dementia case study of color 256 x 256 x 4 x 3 volumetric fMRI. The performance of the three architectures is evaluated as a complete package of area, speed, dynamic power, and throughput. Significant improvements have been achieved in keeping a stable speed, decreasing power consumption and increasing throughput in color fMRI filtering applications. All three architectures have a maximum operating frequency of 225 MHz. Architecture 2 improves power consumption more than two-fold compared with architecture 3. Architectures 2 and 3 achieve the highest throughput, almost 2.5 times that of architecture 1. All three architectures are thus performance-aware processors, and architecture 2 is optimal.
Many fluid flows of engineering interest, though very complex in appearance, can be approximated by low-order models governed by a few modes that capture the dominant behavior (dynamics) of the system. This feature has fueled the development of various methodologies aimed at extracting dominant coherent structures from the flow. Some of the more general techniques are based on data-driven decompositions, most of which rely on performing a singular value decomposition (SVD) on a formulated snapshot (data) matrix. The amount of experimentally or numerically generated data expands as more detailed experimental measurements and increased computational resources become readily available. Consequently, the data matrix to be processed will consist of far more rows than columns, resulting in a so-called tall-and-skinny (TS) matrix. Ultimately, the SVD of such a TS data matrix can no longer be performed on a single processor, and parallel algorithms are necessary. The present study employs the parallel TSQR algorithm of (Demmel et al. in SIAM J Sci Comput 34(1):206-239, 2012), which is further used as a basis of the underlying parallel SVD. This algorithm is shown to scale well on machines with a large number of processors and, therefore, allows the decomposition of very large datasets. In addition, the simplicity of its implementation and the minimal communication it requires make it suitable for integration in existing numerical solvers and data decomposition techniques. Examples that demonstrate the capabilities of highly parallel data decomposition algorithms include transitional processes in compressible boundary layers with and without induced flow separation.
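The core idea behind TSQR can be shown in a few lines: split the tall matrix into row blocks, factor each block independently, stack the small R factors, and factor the stack once more. The sketch below is a minimal single-level (flat) version in pure Python. It is not the cited algorithm's reference implementation, which uses a reduction tree and Householder transformations. Here modified Gram-Schmidt is used for brevity, every block is assumed to have full column rank, and the function names are hypothetical.

```python
from math import sqrt


def qr_r(a):
    """R factor (n x n, positive diagonal) of an m x n matrix, m >= n,
    via modified Gram-Schmidt. Assumes full column rank."""
    m, n = len(a), len(a[0])
    cols = [[float(a[i][j]) for i in range(m)] for j in range(n)]
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        r[j][j] = sqrt(sum(v * v for v in cols[j]))
        q = [v / r[j][j] for v in cols[j]]
        for k in range(j + 1, n):
            r[j][k] = sum(qi * ci for qi, ci in zip(q, cols[k]))
            cols[k] = [ci - r[j][k] * qi for ci, qi in zip(cols[k], q)]
    return r


def tsqr_r(a, num_blocks):
    """R factor of a tall-and-skinny matrix by one-level TSQR.

    Each row block is factored on its own (the step that would run on
    separate processors); only the small n x n factors are combined.
    """
    m = len(a)
    size = -(-m // num_blocks)  # ceiling division
    local = [qr_r(a[s:s + size]) for s in range(0, m, size)]
    stacked = [row for r in local for row in r]
    return qr_r(stacked)
```

Since A = QR with orthonormal Q, the singular values of the tall matrix A equal those of the small n x n factor R, so the expensive SVD is only ever applied to R. Only the R factors cross processor boundaries, which is why the communication volume stays small.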
Motion planning is the problem of finding a valid path for a robot from a start position to a goal position. It has many uses, such as protein folding and animation. However, motion planning can take a long time in difficult environments. Parallelization can be used to speed up this process. This research focused on the development of a framework for implementing and testing parallel motion planning algorithms. Additionally, two methods were implemented to test this framework. The results showed a reasonable amount of speed-up, with coverage and connectivity similar to those of sequential methods.
Our group's recent quest has been to use P systems to model parallel and distributed algorithms. Several framework extensions are recalled or detailed, in particular, modular composition with information hiding, c...
An essential ingredient for the discretization and numerical solution of coupled multiphysics or multiscale problems is a stable and efficient technique for transferring discrete fields between nonmatching volume or surface meshes. Here, we present and investigate a new and completely parallel approach. It allows for the transfer of discrete fields between unstructured volume and surface meshes, which can be arbitrarily distributed among different processors. No a priori information on the relation between the different meshes is required. Our inherently parallel approach is general in the sense that it can deal with both classical interpolation and variational transfer operators, e.g., the L2-projection and the pseudo-L2-projection. It includes a parallel search strategy, output-dependent load balancing, and the computation of element intersections, as well as the parallel assembly of the algebraic representation of the respective transfer operator. We describe our algorithmic framework and its implementation in the library MOONOLITH. Furthermore, we investigate the efficiency and parallel scalability of our new approach using different examples in three dimensions. These include the computation of a volume transfer operator between two meshes with 2 billion elements in total and the computation of a surface transfer operator between 14 different meshes with 5.9 billion elements in total. The experiments have been performed with up to 12,288 cores.