We have developed a mathematical model for video on demand server design based on principal component analysis (PCA). Singular value decomposition of the video correlation matrix is used to perform the PCA. The challenge is to counter the computational complexity, which grows proportionally to n^3, where n is the number of video streams. We present a solution from high performance computing, which splits the problem up and computes it in parallel on a distributed memory system.
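As a rough illustration of the PCA step described above, the following serial sketch applies SVD to a correlation matrix with NumPy. It is not the authors' distributed implementation; the function name `pca_via_svd`, the synthetic demand traces, and the choice of two components are assumptions made for the example.

```python
import numpy as np

def pca_via_svd(samples, num_components):
    """PCA of `samples` (rows = observations, columns = video streams)
    using an SVD of the stream correlation matrix."""
    # Correlation matrix between columns (streams); shape (n, n).
    corr = np.corrcoef(samples, rowvar=False)
    # A dense SVD of an n x n matrix costs O(n^3), the bottleneck the
    # abstract refers to. For this symmetric positive semi-definite
    # matrix the singular values equal the eigenvalues.
    u, s, _ = np.linalg.svd(corr)
    explained = s / s.sum()  # fraction of variance per component
    return u[:, :num_components], explained[:num_components]

# Hypothetical data: streams 0-2 share a common demand pattern,
# stream 3 is independent noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(500, 1))
correlated = [base + 0.1 * rng.normal(size=(500, 1)) for _ in range(3)]
samples = np.hstack(correlated + [rng.normal(size=(500, 1))])

components, explained = pca_via_svd(samples, 2)
print(components.shape, explained)
```

With three strongly correlated streams out of four, the first component should carry most of the variance.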
This paper proposes a parallel algorithm for computing an N-point (N = k^n) Lagrange interpolation on k-ary n-cube networks. The algorithm consists of three phases: initialisation, main and final. There is no computation in the initialisation phase. The main phase is composed of N/2 steps, each consisting of four multiplications and four subtractions, and an additional step including one division and one multiplication. Communication in the main phase is based on an all-to-all broadcast algorithm on a Hamiltonian ring embedded in a k-ary n-cube. The final phase is carried out in n × ⌊k/2⌋ steps, each requiring one addition. A performance evaluation of the proposed algorithm reveals a near-optimum speedup for a typical range of system parameters used in current state-of-the-art implementations. Our study also reveals that, when implementation cost is taken into account, low-dimensional k-ary n-cubes achieve better speedup than their higher-dimensional counterparts.
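For reference, the quantity being computed can be sketched serially as follows. This is plain Lagrange interpolation, not the k-ary n-cube algorithm from the abstract; the function name and the sample points are assumptions for the example. Each term `yi * basis` is independent of the others, which is why the work can be spread over N nodes and combined by additions at the end.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange polynomial through (xs[i], ys[i]) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x) = prod_{j != i} (x - x_j) / (x_i - x_j)
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        # Independent partial term; a parallel version would sum these.
        total += yi * basis
    return total

# Hypothetical sample points taken from f(x) = x^3 - 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [v**3 - 2 * v + 1 for v in xs]
print(lagrange_interpolate(xs, ys, 1.5))
```

Four points determine a cubic exactly, so the interpolant reproduces f at any x up to rounding error.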
The adaptive BDDC method is extended to the selection of face constraints in three dimensions. A new implementation of the BDDC method is presented based on a global formulation without an explicit coarse problem, with massive parallelism provided by a multifrontal solver. Constraints are implemented by a projection and sparsity of the projected operator is preserved by a generalized change of variables. The effectiveness of the method is illustrated on several engineering problems. (c) 2011 IMACS. Published by Elsevier B.V. All rights reserved.
In this paper, we first present an O(log n) time sorting algorithm on 3-D mesh-connected computers with multiple broadcasting (abbreviated to MCCMB) using n^(1/2) × n^(1/2) × n^(1/2) processors. Our algorithm is derived from rotate sort. We then show that the result extends to an n^(1/(k-1)) × n^(1/(k-1)) × ... × n^(1/(k-1)) k-dimensional MCCMB of size O(n^(1+1/(k-1))), which sorts n data items in O(7^(k-3) log n) time for k ≥ 3. The proposed algorithm achieves optimal speed-up when k is a constant. The contribution of this paper is to show that the algorithm can run on a higher-dimensional MCCMB using fewer processors while keeping the same O(log n) time complexity.
Aiming at the complex structure of the existing deniable authentication image encryption methods based on public key cryptography and the high computational cost caused by many bilinear and modular power operations, a...
Equations of equilibrium arise in numerous areas of engineering. Applications to electrical networks, structures, and fluid flow are elegantly described in Strang's Introduction to Applied Mathematics (Wellesley-Cambridge Press, Wellesley, MA, 1986). The context in which equilibrium equations arise may be stated in two forms:
The successful application of model predictive control (MPC) in fast embedded systems relies on faster and more energy-efficient ways of solving complex optimization problems. A custom quadratic programming (QP) solver implementation on a field-programmable gate array (FPGA) can provide substantial acceleration by exploiting the parallelism inherent in some optimization algorithms, and also offers novel computational opportunities arising from deep pipelining. This paper presents a new MPC algorithm based on multiplexed MPC that can take advantage of the full potential of an existing FPGA design by utilizing the ‘free’ parallel computational channels that such pipelining provides. The result is greater acceleration over a conventional MPC implementation and reduced silicon usage. The FPGA implementation is shown to be approximately 200x more energy efficient than a high performance general purpose processor (GPP) for large control problems.
We describe an alternative implementation of Atallah and Vishkin’s parallel algorithm for finding an Euler Tour of a graph. Instead of finding a spanning tree as an intermediate step, this algorithm is based on identifying a strut, which is easier to compute. Using the strut, vertices that have more than one circuit passing through them are identified directly. Stitching at such vertices reduces the number of circuits in the Euler Partition.
Heuristic search is a fundamental component of Artificial Intelligence applications. Because search routines are frequently also a computational bottleneck, numerous methods have been explored to increase the efficiency of search. Recently, researchers have begun investigating methods of using parallel MIMD and SIMD hardware to speed up the search process. In this paper, we present a massively parallel SIMD approach to search named MIDA* search. The components of MIDA* include a very fast distribution algorithm, which biases the search to one side of the tree, and an incrementally deepening depth-first search run on all the processors in parallel. We show the results of applying MIDA* to instances of the Fifteen Puzzle problem and to the robot arm motion planning problem. Results reveal an efficiency of 74% and a speedup of 8553 and 492 over serial and 16-processor MIMD algorithms, respectively, when finding a solution to the Fifteen Puzzle problem that is close to optimal.
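The incrementally deepening depth-first search at the core of the approach can be illustrated with a serial IDA* sketch. This is not MIDA* itself: it runs on one processor, omits the distribution phase, and uses the smaller Eight Puzzle with a Manhattan-distance heuristic, all of which are assumptions chosen to keep the example short. The start state is assumed to be solvable.

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)  # 0 marks the blank
SIZE = 3

def manhattan(state):
    """Sum of tile distances to their goal cells (admissible heuristic)."""
    dist = 0
    for idx, tile in enumerate(state):
        if tile:
            goal_idx = tile - 1
            dist += abs(idx // SIZE - goal_idx // SIZE)
            dist += abs(idx % SIZE - goal_idx % SIZE)
    return dist

def neighbours(state):
    """States reachable by sliding one tile into the blank."""
    blank = state.index(0)
    row, col = divmod(blank, SIZE)
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < SIZE and 0 <= c < SIZE:
            target = r * SIZE + c
            nxt = list(state)
            nxt[blank], nxt[target] = nxt[target], nxt[blank]
            yield tuple(nxt)

def ida_star(start):
    """Return the optimal number of moves from a solvable `start` to GOAL."""
    path = [start]

    def search(g, bound):
        state = path[-1]
        f = g + manhattan(state)
        if f > bound:
            return f           # smallest f that exceeded the bound
        if state == GOAL:
            return True
        minimum = float("inf")
        for nxt in neighbours(state):
            if nxt in path:    # avoid cycles on the current path
                continue
            path.append(nxt)
            result = search(g + 1, bound)
            if result is True:
                return True
            minimum = min(minimum, result)
            path.pop()
        return minimum

    bound = manhattan(start)
    while True:
        result = search(0, bound)
        if result is True:
            return len(path) - 1
        bound = result         # deepen to the next candidate cost

print(ida_star((0, 1, 3, 4, 2, 5, 7, 8, 6)))
```

Each iteration is a depth-first search cut off at a cost bound, and the bound is raised to the smallest cost that exceeded it, so memory use stays linear in the solution depth.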
The channel-assignment problem is central to the integrated circuit fabrication process. Given a two-sided printed circuit board, we wish to make n pairs of components electrically equivalent. The connections are made using two vertical runs along with a horizontal one. Each horizontal run lies in a channel. The channel-assignment problem involves minimizing the total number of channels used. Recent advances in VLSI have made it possible to build massively parallel machines. To overcome the inefficiency of long distance communications among processors, some parallel architectures have been augmented by bus systems. If such a bus system can be dynamically changed to suit communication needs among processors, it is referred to as reconfigurable. The reconfigurable mesh is one of the practical models featuring a reconfigurable bus system. In this paper, we propose a constant-time algorithm to solve the channel-assignment problem of size n on a reconfigurable mesh of size n x n.
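To make the problem statement concrete, here is a serial greedy sketch that treats each horizontal run as a closed column interval and places non-overlapping runs in the same channel. It is the classic left-edge style interval assignment, not the constant-time reconfigurable mesh algorithm of the paper; the function name, the interval encoding, and the rule that runs sharing an endpoint conflict are assumptions made for the example.

```python
import heapq

def assign_channels(runs):
    """Assign each horizontal run (left, right) a channel index so that
    runs in the same channel never overlap, using as few channels as
    possible. Returns a list of channel indices, one per run."""
    order = sorted(range(len(runs)), key=lambda i: runs[i][0])
    assignment = [None] * len(runs)
    busy = []          # min-heap of (right_end, channel) for occupied channels
    free = []          # min-heap of channel ids that can be reused
    next_channel = 0
    for i in order:
        left, right = runs[i]
        # Release every channel whose run ends strictly before this one starts.
        while busy and busy[0][0] < left:
            _, channel = heapq.heappop(busy)
            heapq.heappush(free, channel)
        if free:
            channel = heapq.heappop(free)
        else:
            channel = next_channel
            next_channel += 1
        assignment[i] = channel
        heapq.heappush(busy, (right, channel))
    return assignment

# Hypothetical runs given as (left column, right column).
runs = [(1, 4), (2, 6), (5, 8), (7, 9)]
channels = assign_channels(runs)
print(channels, "channels used:", max(channels) + 1)
```

Processing runs by left endpoint opens a new channel only when every existing channel is occupied at that column, so the number of channels used equals the maximum number of runs crossing any single column.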