The paper proposes a parallel algorithm for solving large overdetermined systems of linear algebraic equations with a dense matrix. This algorithm is based on the use of a modification of the conjugate gradient method, which is able to take into account rounding errors accumulated during calculations when making a decision to terminate the iterative process. The parallel algorithm is constructed in such a way that it takes into account the capabilities of the message passing interface (MPI) parallel programming technology, which is used for the software implementation of the proposed algorithm. The programming examples are shown using the Python programming language and the mpi4py package, but all programs are built in such a way that they can be easily rewritten using the C/C++/Fortran programming languages. The advantage of using the modern MPI-4.0 standard is demonstrated.
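As a rough illustration of the kind of method this abstract describes, the sketch below runs plain conjugate gradients on the normal equations A^T A x = A^T b for a small dense overdetermined system. It is serial, pure Python, and makes several assumptions that are not from the paper: the names `cgnr` and `rmatvec_blocks`, the fixed tolerance `tol`, and the split of A into row blocks. The paper's rounding-error-aware stopping rule and its MPI communication pattern are not reproduced; in an mpi4py version, the sum over row blocks would presumably become a reduction across processes.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, x):
    """Dense product A x, with A stored as a list of rows."""
    return [dot(row, x) for row in A]

def rmatvec_blocks(A, y, n_blocks):
    """A^T y computed as a sum of partial products over row blocks.

    Each block stands in for the part of A owned by one process
    (an assumption made for illustration only).
    """
    m, n = len(A), len(A[0])
    step = -(-m // n_blocks)  # ceiling division
    total = [0.0] * n
    for start in range(0, m, step):
        partial = [0.0] * n
        for row, yi in zip(A[start:start + step], y[start:start + step]):
            for j in range(n):
                partial[j] += row[j] * yi
        # In a distributed version this accumulation would be a reduction.
        total = [t + p for t, p in zip(total, partial)]
    return total

def cgnr(A, b, n_blocks=2, tol=1e-12, max_iter=None):
    """Least-squares solution of A x = b by CG on the normal equations."""
    n = len(A[0])
    max_iter = max_iter or 10 * n
    x = [0.0] * n
    r = list(b)                          # residual b - A x for x = 0
    z = rmatvec_blocks(A, r, n_blocks)   # gradient direction A^T r
    p = list(z)
    zz = dot(z, z)
    for _ in range(max_iter):
        if zz <= tol:                    # simple stopping rule (assumed)
            break
        w = matvec(A, p)
        ww = dot(w, w)
        if ww == 0.0:
            break
        alpha = zz / ww
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * wi for ri, wi in zip(r, w)]
        z = rmatvec_blocks(A, r, n_blocks)
        zz_new = dot(z, z)
        p = [zi + (zz_new / zz) * pi for zi, pi in zip(z, p)]
        zz = zz_new
    return x
```

For example, `cgnr([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [1.0, 2.0, 3.0])` approximates the solution x = (1, 2) of a consistent three-equation, two-unknown system.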
Author: Hungenahally, S, Griffith Univ, Fac Sci & Technol, Signal Proc & Intelligent Syst Res Lab, Brisbane, Qld 4111, Australia
ISBN: (print) 0780320182
The development of efficient algorithms for parallel computer architectures is an ongoing research area, and a great volume of recent theoretical work has searched for suitable algorithms in concurrent processing environments. This paper reports the results obtained from implementing an optimal parallel algorithm developed by Deng and Iyengar (1992) in the esoteric area of arithmetic expression parsing. The C code developed and tested on an IBM-compatible personal computer in this investigative study is a simple recursive descent parser and may be used for parallel parsing of arithmetic expressions. The algorithm was developed to suit the SIMD parallel architecture to avoid any communication bottlenecks posed by the PVM system; however, the design and structure of the code readily permit porting to a parallel computer system.
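The C code from this study is not shown in the abstract. To make "a simple recursive descent parser" for arithmetic expressions concrete, here is a minimal serial sketch in Python. The grammar (`expr`, `term`, `factor`), the token pattern, and all names are assumptions for illustration; the sketch does not reproduce the parallel algorithm of Deng and Iyengar.

```python
import re

# Numbers, the four operators, and parentheses (assumed token set).
TOKEN = re.compile(r"\s*(\d+\.?\d*|[-+*/()])")

def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

class Parser:
    """expr := term (('+'|'-') term)*
    term := factor (('*'|'/') factor)*
    factor := NUMBER | '-' factor | '(' expr ')'
    """

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        value = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            rhs = self.factor()
            value = value * rhs if op == "*" else value / rhs
        return value

    def factor(self):
        tok = self.take()
        if tok == "(":
            value = self.expr()
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return value
        if tok == "-":
            return -self.factor()
        if tok is None or tok in ("+", "*", "/", ")"):
            raise SyntaxError(f"unexpected token {tok!r}")
        return float(tok)

def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value
```

Each grammar rule maps to one method, so operator precedence falls out of the call structure: `expr` calls `term`, which calls `factor`.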
ISBN: (print) 0780342534
First, this paper proposes a fault diagnosis approach for large systems based on iteration in function space in dynamic programming, and puts forward a new concept: the minimum coefficient product over routes from a source node to a target node. An algorithm that computes this minimum and tracks down the corresponding route is given. Second, the algorithm is recast as a tabular iteration method, which is simpler, more regular, and easier to program. Third, the algorithm's operation time is discussed. To reduce the operation time, the concept of a bipartite state space is introduced. According to this concept, a large net can be divided into two or more subnets that are independent of each other. Fourth, a kind of parallel algorithm based on tabular iteration for fault diagnosis is offered, and the steps for applying it are detailed. Finally, an example of fault diagnosis shows that the approach is more effective and simpler.
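One way to picture the tabular iteration for the minimum coefficient product is the Bellman-style sketch below. It is an assumption-laden illustration, not the paper's algorithm: it assumes a directed acyclic net with strictly positive edge coefficients, and the function name `min_product_route` and the `(from, to, coefficient)` edge format are hypothetical.

```python
import math

def min_product_route(edges, source, target):
    """Minimum product of edge coefficients over routes source -> target.

    edges: list of (u, v, coefficient) tuples with coefficient > 0.
    Returns (minimum product, route as a list of nodes).
    """
    nodes = {source, target}
    for u, v, _ in edges:
        nodes.update((u, v))

    # Table: best[v] is the smallest known product from v to the target.
    best = {v: math.inf for v in nodes}
    best[target] = 1.0
    succ = {}  # next node on the best known route

    # A route in an acyclic net has at most len(nodes) - 1 edges,
    # so that many sweeps over the table are enough.
    for _ in range(len(nodes) - 1):
        changed = False
        for u, v, coeff in edges:
            candidate = coeff * best[v]
            if candidate < best[u]:
                best[u] = candidate
                succ[u] = v
                changed = True
        if not changed:
            break

    if math.isinf(best[source]):
        return math.inf, []

    # Track down the route by following the recorded successors.
    route = [source]
    while route[-1] != target:
        route.append(succ[route[-1]])
    return best[source], route
```

With edges `s->a (0.5)`, `s->b (0.9)`, `a->t (0.8)`, `b->t (0.3)` and `a->b (0.5)`, the route `s, a, b, t` has the smallest product, 0.5 × 0.5 × 0.3 = 0.075.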
GRAPES (Global/Regional Assimilation and PrEdiction System) is a newly developed numerical weather prediction system that will be implemented operationally in the next few years at the China Meteorological Administration (CMA). For a global semi-implicit semi-Lagrangian numerical prediction model formulated in spherical coordinates, the convergence of meridians makes the longitudinal grid size decrease toward zero as the poles are approached. Parallelism near the poles is therefore a difficult issue. To meet efficiency, portability, maintainability and extensibility requirements, a cap-longitude-latitude decomposition parallel algorithm is proposed and implemented in accordance with the architectures of the high-performance computers at CMA. The results indicate that the proposed algorithm performs well on the IBM-cluster 1600 at CMA. It also effectively handles cases in which the calculated zonal wind exceeds the maximum of the halo regions when locating semi-Lagrangian departure points. The algorithm is efficient and stable and can meet the requirements of operational implementation.
A parallel algorithm for Petri nets on multi-core clusters is put forward so that a Petri net system with concurrent synchronous functions can be controlled and run in parallel. First, different Petri net structures are selected and transformed, and a method for partitioning a Petri net system into subnets based on place invariants is given. Then, a parallel algorithm for Petri nets on multi-core clusters is proposed according to the three-level MPI+OpenMP+STM (Software Transactional Memory) parallel programming model, combined with a parallelization analysis of the changes within and among subnets. The experimental results show that the algorithm better reflects the actual running process of a Petri net system, and that it is a feasible and effective method for realizing the parallel control and running of such a system.
ISBN: (print) 0780312333
Let G = (V, E) be a simple planar graph that contains no odd loops, with |V| = n and |E| = *** this paper, we propose an efficient parallel algorithm for edge-coloring using Δ colors, based on the SIMD-CRCW PRAM, a kind of shared-memory model in which many processors can read and write a unit ***, Δ is the maximum degree of the vertices of *** Δ is an even number, the algorithm requires O(log Δ · log n) time and O(nΔ) processors; otherwise it requires O(log Δ · log n + Δn) time and O(nΔ) *** Δ = O(log(1)n), the algorithm is an efficient algorithm.
Audio feature extraction is a very important technique in the field of sound processing. It strongly affects the effectiveness and correctness of sound recognition, sound verification, and related tasks. It is a computation-intensive stage of the whole sound recognition process and is challenging to accelerate. In this paper, a coarse-grained parallel feature extraction algorithm for high-throughput processing of audio slices is proposed to improve the efficiency of audio feature extraction. Three typical audio feature extraction algorithms, Mel Frequency Cepstrum Coefficients (MFCC), Spectrogram Image Features (SIF), and Octave-Based Spectral Contrast, are chosen for parallelization. Experimental results on different platforms show that the speedup of the accelerated audio feature extraction reaches 17.23 on a platform with 16 cores and 32 threads.
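The coarse-grained idea, in which each audio slice is an independent unit of work, can be sketched as follows. This is only an illustration under stated assumptions: the per-slice features here (mean energy and zero-crossing rate) are simple stand-ins rather than MFCC, SIF, or spectral contrast, and the names `slice_features`, `split_slices`, and `extract_parallel` are hypothetical. A thread pool keeps the sketch short; for CPU-bound feature code in CPython, `ProcessPoolExecutor`, which offers the same `map` interface, would be the more likely choice.

```python
from concurrent.futures import ThreadPoolExecutor

def slice_features(samples):
    """Stand-in features for one slice: mean energy and zero-crossing rate."""
    n = len(samples)
    if n < 2:
        raise ValueError("a slice needs at least two samples")
    energy = sum(s * s for s in samples) / n
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0)
    )
    return {"energy": energy, "zcr": crossings / (n - 1)}

def split_slices(signal, slice_len):
    """Cut the signal into full-length slices; a trailing partial slice is dropped."""
    last_start = len(signal) - slice_len
    return [signal[i:i + slice_len] for i in range(0, last_start + 1, slice_len)]

def extract_parallel(signal, slice_len, workers=4):
    """Run the per-slice feature function on all slices concurrently.

    pool.map preserves slice order, so results line up with the input.
    """
    slices = split_slices(signal, slice_len)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(slice_features, slices))
```

Because slices share no state, no synchronization is needed beyond collecting the results, which is what makes the decomposition coarse-grained.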
ISBN: (print) 9781467345279
This work suggests parallel algorithms for solving a sparse system of N linear equations in N unknowns by the Jacobi method on the Extended Fibonacci Cube EFC_1(n) [3], where n is the degree of EFC_1(n) and N is the number of processors of EFC_1(n). Two parallel versions of the algorithm are discussed. A single pass of the first algorithm involves 2(N-1) data communications in N steps, whereas the second algorithm achieves the same number of data communications in N/2 + log N steps. Further, each pass of both algorithms has 3N/2 + 1 additions, N/2 - 1 subtractions, N - 1 multiplications and N divisions.
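For reference, a serial sketch of the Jacobi iteration on a sparse system is given below. The Extended Fibonacci Cube topology and the communication steps counted in the abstract are not modelled; the row format (one dict of column index to value per equation), the name `jacobi`, and the stopping rule are assumptions. In each pass every unknown is updated only from the previous pass's values, which is why the per-equation updates can be assigned to separate processors.

```python
def jacobi(rows, b, tol=1e-10, max_iter=500):
    """Jacobi iteration for a sparse system.

    rows[i] maps column index -> coefficient for equation i and must
    contain a nonzero diagonal entry rows[i][i]. Convergence is assumed,
    for example because the matrix is strictly diagonally dominant.
    """
    n = len(b)
    x = [0.0] * n
    for _ in range(max_iter):
        # Every component uses only the old vector x, never x_new.
        x_new = [
            (b[i] - sum(v * x[j] for j, v in rows[i].items() if j != i))
            / rows[i][i]
            for i in range(n)
        ]
        if max(abs(new - old) for new, old in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    return x
```

For the system 4x + y = 9, x + 3y = 7, calling `jacobi([{0: 4.0, 1: 1.0}, {0: 1.0, 1: 3.0}], [9.0, 7.0])` approaches x = 20/11, y = 19/11.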
We present a parallel algorithm, which can be run on an SIMD machine, to compute the supremum of the max-min powers of any map from the Cartesian product of a finite set to a bounded subset of the real numbers. The algorithm is based on graph-theoretical methods. Some variations of the parallel algorithm are also considered.
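A serial sketch of the quantity being computed may help. It assumes the map is given as an n × n matrix R over a finite set of n elements, that the k-th max-min power is the k-fold max-min composition of R with itself, and that the supremum is taken entrywise over k ≥ 1; these readings and the function names are assumptions, not taken from the paper. Under them, the supremum is reached within the first n powers, because any walk with more than n edges repeats a vertex and removing the cycle cannot lower the minimum along the walk.

```python
def maxmin_compose(R, S):
    """Max-min composition: (R o S)[i][j] = max over k of min(R[i][k], S[k][j])."""
    n = len(R)
    return [
        [max(min(R[i][k], S[k][j]) for k in range(n)) for j in range(n)]
        for i in range(n)
    ]

def sup_of_powers(R):
    """Entrywise supremum of the max-min powers R, R^2, ..., R^n."""
    n = len(R)
    power = [row[:] for row in R]
    sup = [row[:] for row in R]
    for _ in range(n - 1):
        power = maxmin_compose(power, R)
        sup = [
            [max(a, b) for a, b in zip(sup_row, pow_row)]
            for sup_row, pow_row in zip(sup, power)
        ]
    return sup
```

Each entry of a composition depends only on one row and one column of its operands, so all n² entries can be computed independently, which suits a SIMD-style evaluation.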