This paper reports a class of new hybrid compact finite-difference schemes with high-order accuracy for solving the Helmholtz equation in two and three dimensions. The innovation of the scheme is to hybridize explicit...
详细信息
This paper reports a class of new hybrid compact finite-difference schemes with high-order accuracy for solving the Helmholtz equation in two and three dimensions. The innovation of the scheme is to hybridize explicit and implicit compact schemes to deal with the solution and its first- and second-order derivatives, and to solve it by a step-by-step coupled iterative method. In response to the inefficiency of serial algorithms in solving the Helmholtz equation with large wavenumber and the issue of memory overflow on a single processor making computation infeasible because of excessively large computational scale, a parallel algorithm is proposed based on the Message Passing Interface (MPI) environment. Truncation error analysis demonstrates sixth-order accuracy for the proposed scheme, and numerical experiments confirm the theoretical sixth-order accuracy for problems with variable and large wavenumber. Also, the MPI-based parallel algorithm exhibits great parallel speedup and enables the tackling of large-scale problems that are beyond the reach of serial algorithms.
This article solves the convergence problem in the parallel BK algorithm for large-scale flow networks. We introduce a merging method and a pseudo-Boolean representation-based invariance analysis that optimize the alg...
详细信息
This article solves the convergence problem in the parallel BK algorithm for large-scale flow networks. We introduce a merging method and a pseudo-Boolean representation-based invariance analysis that optimize the algorithm's performance compared to classical approaches such as Ford-Fulkerson, Edmonds-Karp, Push-Relabel, and Kargers algorithm. Our approach improves energy efficiency, reduces memory usage, and lowers time complexity by leveraging parallelization techniques. We evaluate the performance of these algorithms using Python simulations on various benchmark graphs, ranging from small to large-scale networks. The results show that our method reduces memory consumption by up to 40% and speeds up execution time by 30%, while maintaining high accuracy in finding minimum cuts. This paper demonstrates the algorithm's potential for applications in image segmentation, wireless sensor networks, and network reliability analysis
The parallel algorithm is widely recognized for its effectiveness in handling large-scale datasets stored in a distributed manner, making it a popular choice for solving statistical learning models. However, there is ...
详细信息
The parallel algorithm is widely recognized for its effectiveness in handling large-scale datasets stored in a distributed manner, making it a popular choice for solving statistical learning models. However, there is currently limited research on parallel algorithms specifically designed for high-dimensional regression with combined regularization terms. These terms, such as elastic-net, sparse group lasso, sparse fused lasso, and their nonconvex variants, have gained significant attention in various fields due to their ability to incorporate prior information and promote sparsity within specific groups or fused variables. The scarcity of parallel algorithms for combined regularizations can be attributed to the inherent nonsmoothness and complexity of these terms, as well as the absence of closed-form solutions for certain proximal operators associated with them. This paper proposes a unified constrained optimization formulation based on the consensus problem for these types of convex and nonconvex regression problems, and derives the corresponding parallel alternating direction method of multipliers (ADMM) algorithms. Furthermore, it is proven that the proposed algorithm not only has global convergence but also exhibits a linear convergence rate. It is worth noting that the computational complexity of the proposed algorithm remains the same for different regularization terms and losses, which implicitly demonstrates the universality of this algorithm. Extensive simulation experiments, along with a financial example, serve to demonstrate the reliability, stability, and scalability of our algorithm. The R package for implementing the proposed algorithm can be obtained at https:// *** /xfwu1016 /CPADMM.
This paper studies the minimum weight set cover (MinWSC) problem with a small neighborhood cover property (tau-SNC). A parallel algorithm is presented, obtaining approximation ratio tau(1+3 epsilon) in O(Llog(1+epsilo...
详细信息
This paper studies the minimum weight set cover (MinWSC) problem with a small neighborhood cover property (tau-SNC). A parallel algorithm is presented, obtaining approximation ratio tau(1+3 epsilon) in O(Llog(1+epsilon)n(3)/epsilon(2)+4 tau(3)2(tau)L(2)logn) rounds, where 0
A new parallel implementation of a long-range interaction problem on the ring topology for a MIMD computer system is presented. The algorithm was applied for the implementation of a forces integrator in the molecular ...
详细信息
A new parallel implementation of a long-range interaction problem on the ring topology for a MIMD computer system is presented. The algorithm was applied for the implementation of a forces integrator in the molecular dynamics. The complexity estimation is made and also measured time results are given. It is shown that for each number of particles N there exists an optimal number of processors p. Time Complexity O(N2) of a sequential algorithm is reduced to O(N2/p) With the proposed parallel implementation. The time requirement for the optimal sequential algorithm is proportional to N2/2 and the time requirement for the proposed parallel algorithm is proportional to N2/(2p).
Projections are widely used in machine vision, volume rendering, and computer graphics. For applications with 3D volume data, we design a parallel projection algorithm on SIMD mesh-connected computers and implement th...
详细信息
Projections are widely used in machine vision, volume rendering, and computer graphics. For applications with 3D volume data, we design a parallel projection algorithm on SIMD mesh-connected computers and implement the algorithm on the parallel Algebraic Logic (PAL) computer. The algorithm is a parallel ray casting algorithm for both orthographic and perspective projections. It decomposes a volume projection into two transformations that can be implemented in the SIMD fashion to solve the data distribution and redistribution problem caused by non-regular data access patterns in volume projections.
Temperature distributions involved in some metal cutting or surface milling processes may be obtained by solving a nonlinear inverse problem. A simplified metal cutting process is considered in this paper. A problem p...
详细信息
Temperature distributions involved in some metal cutting or surface milling processes may be obtained by solving a nonlinear inverse problem. A simplified metal cutting process is considered in this paper. A problem partitioning concept is used to partition the original problem. We propose a distributed algorithm in relation to the choice of numerical schemes in order to simulate the temperature distribution of such problems. Numerical results are provided for three different materials. Efficiency analysis is provided to examine the algorithm when it is being run in a distributed computing environment. Possibilities are discussed of encapsulating the algorithm in a computer virtual design environment for metal cutting and extending the algorithm in a real time simulation system. (C) 1997 Elsevier Science Ltd.
In this paper we describe a proximal Support Vector Machine algorithm for multiclassification problem by one-vs-all scheme. The computational requirement for the new algorithm is almost the same as training one of its...
详细信息
In this paper we describe a proximal Support Vector Machine algorithm for multiclassification problem by one-vs-all scheme. The computational requirement for the new algorithm is almost the same as training one of its element binary proximal Support Vector Machines. Low rank approximation is taken to reduce computational costs when the kernel matrix is too large. An error bound estimation for the approximated solution is given, which is used as a stopping criteria for low rank approximation. A post-processing strategy is developed to overcome the difficulty arising from unbalanced data and to improve the classification accuracy. A parallel implementation of the algorithm using standard MPI communication routines is provided to handle large-scale problems and to accelerate the training process. Experiment results on several public datasets validate the effectiveness of our proposed algorithm. (C) 2010 Elsevier Inc. All rights reserved.
The purpose of this paper is to propose a new parallel algorithm for the knapsack problem. We develop a generation and searching technique to derive the desired two ordered lists in the preliminary process of the gene...
详细信息
The purpose of this paper is to propose a new parallel algorithm for the knapsack problem. We develop a generation and searching technique to derive the desired two ordered lists in the preliminary process of the general knapsack problem. The proposed parallel algorithm is developed on the basis of an SIMD machine with shared memory. The algorithm needs O(2n/4) memory and O(2n/8) processors to find a solution for the knapsack problem of n components in time O(2n/2). The proposed parallel algorithm has a cost of O(2(5n/8)) which is an improved result over the past researches.
A parallel algorithm to generate the dominance graph on a collection of nonoverlapping iso-oriented rectangles is presented. This graph arises from the constraint graph commonly used in compaction algorithms for VLSI ...
详细信息
A parallel algorithm to generate the dominance graph on a collection of nonoverlapping iso-oriented rectangles is presented. This graph arises from the constraint graph commonly used in compaction algorithms for VLSI circuits. The dominance graph expresses the notion of ''aboveness'' on a collection of nonoverlapping rectangles: it is the directed graph which contains an edge from a rectangle b to rectangle c iff c is immediately above b. The algorithm is based on the divide and conquer paradigm;in the EREW PRAM model, it has time complexity O(log2 n), using n/log n processors. Its processor-time product is O(n log n), which is optimal.
暂无评论