This paper provides an enhanced method focused on reducing the computational complexity of function approximation problems by dividing the input data vectors into small groups, which avoids the curse of dimensionality...
详细信息
This paper provides an enhanced method focused on reducing the computational complexity of function approximation problems by dividing the input data vectors into small groups, which avoids the curse of dimensionality. The computational complexity and memory requirements of the approximated problem are higher when the input data dimensionality increases. A divide-and-conquer algorithm is used to distribute the input data of the complex problem to a divided radial basis function neural network (Div-RBFNN). Under this algorithm, the input data variables are typically distributed to different RBFNNs based on whether the number of the input data dimensions is odd or even. In this paper, each Div-RBFNN will be executed independently and the resulting outputs of all Div-RBFNNs will be grouped using a system of linear combination function. The parameters of each Div-RBFNN (centers, radii, and weights) will be optimized using an efficient learning algorithm, which depends on an enhanced clustering algorithm for function approximation, which clusters the centers of the RBFs. Compared to traditional RBFNNs, the proposed methodology reduces the number of executing parameters. It further outperforms traditional RBFNNs not only with respect to the execution time but also in terms of the number of executing parameters of the system, which produces better approximation error.
This work is concerned with approximating matrix functions for banded matrices, hierarchically semiseparable matrices, and related structures. We develop a new divide-and-conquer method based on (rational) Krylov subs...
详细信息
This work is concerned with approximating matrix functions for banded matrices, hierarchically semiseparable matrices, and related structures. We develop a new divide-and-conquer method based on (rational) Krylov subspace methods for performing low-rank updates of matrix functions. Our convergence analysis of the newly proposed method proceeds by establishing relations to best polynomial and rational approximation. When only the trace or the diagonal of the matrix function is of interest, we demonstrate---in practice and in theory---that convergence can be faster. For the special case of a banded matrix, we show that the divide-and-conquer method reduces to a much simpler algorithm, which proceeds by computing matrix functions of small submatrices. Numerical experiments confirm the effectiveness of the newly developed algorithms for computing large-scale matrix functions from a wide variety of applications.
For the given signature operator 7-c = Ir ED -In-r, a pseudo Jacobi matrix is a self-adjoint matrix relatively to a symmetric bilinear form (., .)H, and it is the counterpart of a classical Jacobi matrix to the indefi...
详细信息
For the given signature operator 7-c = Ir ED -In-r, a pseudo Jacobi matrix is a self-adjoint matrix relatively to a symmetric bilinear form (., .)H, and it is the counterpart of a classical Jacobi matrix to the indefinite scalar product space setting. In this article, we consider an inverse eigenvalue problem for this class of matrices. The main concern is to construct an n x n pseudo-Jacobi matrix from a prescribed n-tuple of distinct real numbers and a Jacobi matrix of order not less than L n2 ], such that its spectrum is this tuple and the given Jacobi matrix is its trailing principal submatrix. A divide-and conquer scheme is used to solve this problem, and a necessary and sufficient condition under which the problem is solvable is presented. A numerical algorithm is designed to solve this pseudo-Jacobi matrix inverse eigenvalue problem according to the obtained results. Some illustrative numerical examples are also given to test the reconstructive algorithm. & COPY;2023 Elsevier Inc. All rights reserved.
Mobile-edge computing (MEC), as an emerging computing paradigm, allows app vendors to deploy their mobile and/or IoT applications on edge servers to deliver low-latency services to their app users. However, when an ed...
详细信息
Mobile-edge computing (MEC), as an emerging computing paradigm, allows app vendors to deploy their mobile and/or IoT applications on edge servers to deliver low-latency services to their app users. However, when an edge server needs to serve excessive app users concurrently, severe interference is incurred, which immediately reduces app users' achievable data rates and, consequently, impacts their perceived service quality. This is a major challenge to the app vendor's attempt to minimize the edge resources required for serving its app users with a satisfactory service quality. To tackle this challenge, in this article, we present and formulate this multiple edge application deployment (MEAD) problem in the MEC environment, aiming to maximize app users' overall service quality at minimum deployment cost, considering application shareability and communication interference. We prove that the MEAD problem is NP-hard. Then, we propose a heuristic approach, namely, the deployment-priority greedy via the divide-and-conquer strategy (DPG-D&C), to solve the MEAD problem effectively and efficiently. We evaluate our approach extensively by using a widely used real-world data set. The experimental results show that DPG-D&C significantly outperforms state-of-the-art approaches.
Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inverse...
详细信息
Kinetic Monte Carlo (KMC) simulations are used to study long-time dynamics of a wide variety of systems. Unfortunately, the conventional KMC algorithm is not scalable to larger systems, since its time scale is inversely proportional to the simulated system size. A promising approach to resolving this issue is the synchronous parallel KMC (SPKMC) algorithm, which makes the time scale size-independent. This paper introduces a formal derivation of the SPKMC algorithm based on local transition-state and time dependent Hartree approximations, as well as its scalable parallel implementation based on a dual linked list cell method. The resulting algorithm has achieved a weak-scaling parallel efficiency of 0.935 on 1024 Intel Xeon processors for simulating biological electron transfer dynamics in a 4.2 billion-heme system, as well as decent strong-scaling parallel efficiency. The parallel code has been used to simulate a lattice of cytochrome complexes on a bacterial-membrane nanowire, and it is broadly applicable to other problems such as computational synthesis of new materials. (C) 2017 Elsevier B.V. All rights reserved.
Three parallel implementations of a divide-and-conquer search algorithm (called SUDA2) for finding minimal unique itemsets (MUIs) are compared in this paper. The identification of MUIs is used by national statistics a...
详细信息
Three parallel implementations of a divide-and-conquer search algorithm (called SUDA2) for finding minimal unique itemsets (MUIs) are compared in this paper. The identification of MUIs is used by national statistics agencies for statistical disclosure assessment. The first parallel implementation adapts SUDA2 to a symmetric multi-processor cluster using the message passing interface (MPI), which we call an MPI cluster;the second optimizes the code for the Cray MTA2 (a shared-memory, multi-threaded architecture) and the third uses a heterogeneous 'group' of workstations connected by LAN. Each implementation considers the parallel structure of SUDA2, and how the subsearch computation times and sequence of subsearches affect load balancing. All three approaches scale with the number of processors, enabling SUDA2 to handle larger problems than before. For example, the MPT implementation is able to achieve nearly two orders of magnitude improvement with 132 processors. Performance results are given for a number of data sets. Copyright (C) 2009 John Wiley & Sons, Ltd.
A tournament T-n is an orientation of a complete graph on n vertices. A king in a tournament is a vertex from which every other vertex is reachable by a path of length at most 2. A sorted sequence of kings in a tourna...
详细信息
A tournament T-n is an orientation of a complete graph on n vertices. A king in a tournament is a vertex from which every other vertex is reachable by a path of length at most 2. A sorted sequence of kings in a tournament T-n is an ordered list of its vertices u(1), u(2),..., u(n) such that u(i) dominates u(i+1) (u(i) --> u(i+1)) and u(i) is a king in the subtournament induced by {u(j) : i less than or equal to j less than or equal to n} for each i = 1, 2,..., n-1. In particular, if T-n is transitive, searching for a sorted sequence of kings in T-n is equivalent to sorting a set of n numbers. In this paper, we try to find a sorted sequence of kings in a general tournament by asking the following type of binary question: "What is the orientation of the edge between two specified vertices u, v?" The cost for finding a sorted sequence of kings is the minimum number of binary questions asked in order to guarantee the finding of a sorted sequence of kings. Using an adversary argument proposed in this paper, we show that the cost for finding a sorted sequence of kings in T-n is Theta(n(3/2)) in the worst case, thus settling the order of magnitude of this question. We also show that the cost for finding a king in T-n is Omega(n(4/3)) and O(n(3/2)) in the worst case. Finally, we show a connection between a sorted sequence of kings and a median order in a tournament.
Adaptive coarse graining of highly complex biomolecular systems is conducted by either imposing constraints on the system model to generate a coarser model, or releasing certain constraints from it to create a higher ...
详细信息
Adaptive coarse graining of highly complex biomolecular systems is conducted by either imposing constraints on the system model to generate a coarser model, or releasing certain constraints from it to create a higher fidelity one. This paper addresses some of the challenges of transitions to finer-scale models. It is shown that the kinetic energy associated with ignored modes of motion during the coarse graining process must be estimated and put back into the simulation. As such, unlike real mechanical systems, transitioning back to a finer biomolecular model may result in an infinite number of solutions. Herein, an optimization-based approach subject to the satisfaction of impulse-momentum balance and desired kinetic energy is presented to arrive at a physically meaningful state of the system immediately after the transition. In order to reduce the cost of formulating and solving this optimization problem, generalized velocities of the system are partitioned into two categories: independent joint velocities associated with the released joints, and dependent joint velocities associated with the rest of the joints. This paper further develops a divide-and-conquer algorithm to efficiently perform this partitioning process and express dependent velocities in terms of independent ones. The presented method is highly parallelizable and avoids the construction of the mass and Jacobian matrices of the entire system, resulting in a significant improvement in computational efficiency.
A number of modeling and simulation algorithms using internal coordinates rely on hierarchical representations of molecular systems. Given the potentially complex topologies of molecular systems, though, automatically...
详细信息
A number of modeling and simulation algorithms using internal coordinates rely on hierarchical representations of molecular systems. Given the potentially complex topologies of molecular systems, though, automatically generating such hierarchical decompositions may be difficult. In this article, we present a fast general algorithm for the complete construction of a hierarchical representation of a molecular system. This two-step algorithm treats the input molecular system as a graph in which vertices represent atoms or pseudo-atoms, and edges represent covalent bonds. The first step contracts all cycles in the input graph. The second step builds an assembly tree from the reduced graph. We analyze the complexity of this algorithm and show that the first step is linear in the number of edges in the input graph, whereas the second one is linear in the number of edges in the graph without cycles, but dependent on the branching factor of the molecular graph. We demonstrate the performance of our algorithm on a set of specifically tailored difficult cases as well as on a large subset of molecular graphs extracted from the protein data bank. In particular, we experimentally show that both steps behave linearly in the number of edges in the input graph (the branching factor is fixed for the second step). Finally, we demonstrate an application of our hierarchy construction algorithm to adaptive torsion-angle molecular mechanics. (C) 2011 Wiley Periodicals, Inc. J Comput Chem 32: 1589-1598, 2011
This paper presents an efficient algorithm for the simulation of multi-flexible-body systems undergoing discontinuous changes in model definition. The equations governing the dynamics of the transitions from a higher ...
详细信息
This paper presents an efficient algorithm for the simulation of multi-flexible-body systems undergoing discontinuous changes in model definition. The equations governing the dynamics of the transitions from a higher to a lower fidelity model and vice versa are formulated through imposing/removing certain constraints on/from the system model. The issue of the non-uniqueness of the results associated with the transition from a lower to a higher fidelity model may be handled by solving an optimization problem. This optimization problem is subjected to the satisfaction of the constraint imposed by the generalized impulse-momentum equations. The divide-and-conquer algorithm (DCA) is applied to formulate the jumps in the system states resulting from the model transition. The DCA formulation in its basic form is both time and processor optimal and results in linear and logarithmic complexity when implemented in serial and parallel with O(n) processors, respectively. As such, its application can reduce the effective computational cost of formulating and solving the optimization problem in the transitions to the finer models. The principal aspects of the mathematics for the algorithm implementation is developed and numerical examples are provided to validate the method. (C) 2013 Elsevier B.V. All rights reserved.
暂无评论