A 4-subiteration parallel thinning algorithm, based on 3×3 operations, is proposed. It is shown that by taking into account bidirectional compression in each subiteration, pixels belonging to a pair of successive contours, a 4-contour and an 8-contour, are removed from the pattern in every iteration. Therefore, contour pixel removal proceeds towards the inner part of the pattern according to the octagonal metric. This provides a resulting medial line which is centered in the pattern in a quasi-Euclidean sense and is less sensitive to pattern rotation. The performance of the algorithm is discussed and compared with that of some well-known parallel algorithms.
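The removal order described above can be sketched in a few lines. This is only an illustration of the layer ordering, under stated assumptions: it alternately peels object pixels that touch the background through a 4-neighbor and then through an 8-neighbor, which labels each pixel with its octagonal-distance layer. It does not implement the paper's 3×3 deletion conditions or its four subiterations, and it does not preserve connectivity, so it produces no medial line. The function name `peel_layers` and the list-of-lists image layout are hypothetical.

```python
def peel_layers(image):
    """Label each object pixel with the pass in which it is peeled.

    Odd passes peel pixels with a background 4-neighbor, even passes peel
    pixels with a background 8-neighbor. Pixels outside the image count as
    background. This shows the removal order only; it is not a thinning
    algorithm because nothing protects connectivity.
    """
    rows, cols = len(image), len(image[0])
    n4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    n8 = n4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

    remaining = {(r, c) for r in range(rows) for c in range(cols) if image[r][c]}
    labels = [[0] * cols for _ in range(rows)]

    step = 0
    while remaining:
        step += 1
        offsets = n4 if step % 2 == 1 else n8
        # All contour pixels of this pass are found first, then removed together,
        # which mimics a parallel (synchronous) update.
        contour = {
            (r, c)
            for (r, c) in remaining
            if any((r + dr, c + dc) not in remaining for dr, dc in offsets)
        }
        for r, c in contour:
            labels[r][c] = step
        remaining -= contour
    return labels


if __name__ == "__main__":
    square = [[1] * 7 for _ in range(7)]
    for row in peel_layers(square):
        print(" ".join(str(v) for v in row))
```

One pair of passes (a 4-neighbor pass followed by an 8-neighbor pass) corresponds to one iteration in the abstract's description.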
In this paper we describe the application of a parallel implementation of the implicit filtering algorithm to a control problem from hydrology. We seek to control the temperature at a group of drinking water wells by placing barrier wells between the drinking water wells and a well that injects heated water from an industrial site.
The GCA (Global Cellular Automata) model consists of a collection of cells that change their states synchronously depending on the states of their neighbors, as in the classical CA model. In contrast to the CA model, the neighbors are not fixed and local; they are variable and global. The GCA model is applicable to a wide range of parallel algorithms, and it can be implemented on reconfigurable hardware. We discuss the GCA implementation of PRAM algorithms on reconfigurable hardware (Field Programmable Gate Array, FPGA), exemplified by the algorithm of Hirschberg et al., which determines the connected components of a given undirected graph. We provide two implementations with different numbers of cells: one with maximum parallelism and a compact one. We compare the implementation complexities, i.e. the number of FPGA cells, of both implementations, and thus present experimental evidence for our claims. The GCA(N) algorithm uses 3n cells with time complexity O(n log n), whereas the GCA(N^2) algorithm uses n(n + 1) cells with time complexity O(log^2 n). The GCA(N) algorithm is more economical with respect to resources (logic × execution time), whereas the GCA(N^2) algorithm can produce the result faster with a speedup of O((n + m)/log n). Further insights are that efficient mappings of PRAM algorithms onto GCA exist, and that PRAM and GCA optimality criteria differ because the latter take memory consumption into account. This makes the GCA a parallel computational model and an implementation platform, thus narrowing the gap between theory and practice.
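The resource trade-off stated above can be checked with a small calculation. The sketch below plugs the cell counts and time complexities from the abstract into a "cells × time" product. It assumes base-2 logarithms and drops all constant factors, so the numbers show only the trend, not measured FPGA costs. The function names are hypothetical.

```python
import math


def gca_n_cost(n):
    """Compact variant: 3n cells, O(n log n) time (constants dropped)."""
    cells = 3 * n
    time = n * math.log2(n)
    return cells, time, cells * time


def gca_n2_cost(n):
    """Fully parallel variant: n(n + 1) cells, O(log^2 n) time (constants dropped)."""
    cells = n * (n + 1)
    time = math.log2(n) ** 2
    return cells, time, cells * time


if __name__ == "__main__":
    for n in (16, 256, 1024):
        compact = gca_n_cost(n)
        parallel = gca_n2_cost(n)
        print(f"n={n}: GCA(N) cells*time={compact[2]:.0f}, "
              f"GCA(N^2) cells*time={parallel[2]:.0f}")
```

Under these assumptions the compact variant has the smaller cells × time product for each n tried, while the fully parallel variant has the shorter running time, which matches the abstract's qualitative claim.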
The numerical solution of 3D linear elasticity equations is considered. The problem is described by a coupled system of second-order elliptic partial differential equations. This system is discretized by trilinear parallelepipedal finite elements. The preconditioned conjugate gradient iterative method is used for solving the large-scale linear algebraic systems arising from the finite element method (FEM) discretization of the problem. The displacement decomposition technique is applied at the first step to construct a preconditioner using the decoupled block-diagonal part of the original matrix. Then circulant block-factorization is used for preconditioning of the obtained block-diagonal matrix. Both techniques, displacement decomposition and circulant block-factorization, are highly parallelizable. A parallel algorithm is developed for the proposed preconditioner. The theoretical analysis of the execution time shows that the algorithm is highly efficient for coarse-grain parallel computer systems. A portable MPI parallel FEM code is developed. Numerical tests for real-life engineering problems in geomechanics on a number of modern parallel computers are presented. The reported speed-up and parallel efficiency illustrate well the parallel features of the proposed method and its implementation. (C) 2002 IMACS. Published by Elsevier Science B.V. All rights reserved.
Many of the operations to eliminate complaints concerning respiration impairments fail. In order to improve the success rate, it is important to recognize the responsiveness of the flow field within the nasal cavities. Therefore, we are developing a computer assisted surgery (CAS) system that combines computational fluid dynamics (CFD) and virtual reality (VR) technology. However, the primary prerequisite for VR-based applications is real-time interaction. A single graphics workstation cannot satisfy this condition while simultaneously calculating flow features from the huge CFD data set. In this paper, we present our approach to a distributed system that relieves the load on the graphics workstation and makes use of an "off-the-shelf" parallel Linux cluster for calculating streamlines. Moreover, we present first results and discuss remaining difficulties.
Developing parallel codes for computing the nonlinear flow in multiaquifer porous systems is an important task both for improving model efficiency and for performing large real-life simulations. Multiaquifer systems consist of alternating sandy and clayey layers. In this paper, highly compressible multiaquifer systems are considered, where some hydraulic parameters depend on the potential head, so the flow inside some layers is governed by nonlinear equations. An effective procedure for solving these equations is developed, relying upon the partition of the solution procedure into layer-wise steps. By assigning to each processor the computation of the flow inside a suitable set of layers, the iterative solution procedure can be efficiently implemented on a parallel supercomputer. Using such a domain decomposition strategy, a satisfactory degree of parallelization is achieved when computing the flow in a realistic nonlinear multiaquifer system, employing a CRAY T3D massively parallel computer. In test simulations on real-life multiaquifer systems, the recorded speed-ups are as large as 1.89, 3.34, and 5.37 with 2, 4, and 8 processors, respectively. The importance of load balance and information exchange in determining the parallel performance of the code is also analyzed.
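The reported speed-ups translate directly into parallel efficiency, using the standard definition of efficiency as speed-up divided by processor count. The short sketch below uses only the figures quoted in the abstract; the variable and function names are hypothetical.

```python
# Speed-ups quoted in the abstract for 2, 4, and 8 processors.
reported_speedups = {2: 1.89, 4: 3.34, 8: 5.37}


def parallel_efficiency(speedup, processors):
    """Efficiency = speed-up / processor count; 1.0 would be ideal scaling."""
    return speedup / processors


efficiencies = {
    p: parallel_efficiency(s, p) for p, s in reported_speedups.items()
}

if __name__ == "__main__":
    for p, e in efficiencies.items():
        print(f"{p} processors: efficiency {e:.3f}")
```

The efficiencies come out at about 0.945, 0.835, and 0.671, so efficiency falls as processors are added, which is consistent with the abstract's attention to load balance and information exchange.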
We solve an optimal control problem for controlled parabolic Ito equations by a stochastic quasigradient method. Because the numerical solution of such problems requires a large amount of computation time, we investigate the parallelization of the algorithm. We distribute the computations of space stages over several processor nodes of a parallel computer. We obtain an efficient algorithm with low communication cost by using a ring topology.
The analogies observed between parallel computing and system integration modeling are presented and discussed. Two models, the Computation Structure Model and the parallel Integration Evaluation Model, are used to represent the analogies. The comparison shows that techniques used in the performance analysis of parallel computing algorithms can be taken as a basis for developing models for the integration process of distributed production tasks.
In this paper we show that it is impossible to solve a number of "natural" two-dimensional geometric problems in polylog time with a polynomial number of processors (unless P = NC). Thus, we disprove a popular belief that there are no natural P-complete geometric problems in the plane. The problems we address include instances of polygon triangulation, planar partitioning, and geometric layering. Our results are based on non-trivial reductions from the monotone circuit value and planar circuit value problems.
Advances in engine control increase the amount of computation required. The production ECU (Electronic Control Unit), which is built on a single-core architecture, cannot run at a higher clock speed. Using a multi- / many-core architecture is the only way to decrease execution time. However, various problems arise when implementing engine control software on a multi- / many-core ECU. One of the biggest problems is the sequential structure of the control software, because such software can execute on only one core of the multi- / many-core ECU. The purpose of this paper is to describe a parallelized control design method that decomposes the sequential structure and decreases execution time on the embedded multi- / many-core production ECU. (C) 2016, IFAC (International Federation of Automatic Control) Hosting by Elsevier Ltd. All rights reserved.