A parallel nonlinear solver (Paloschi, 1994) is implemented in the flowsheeting package SPEEDUP (Aspen Technology Inc.). The solver is a Quasi-Newton (QN) method in which parallelization is achieved by partitioning the domain and range of the system of nonlinear equations and assigning one partition to each available processor. All significant tasks within the QN algorithm are performed in parallel. Linear systems are solved using a parallel preconditioned iterative linear solver (GMRES), with a preconditioner based on a block-diagonal approximation. The solver is tested on two steady-state simulation problems. The first is a plug-flow reactor flowsheet involving 1633 variables in the largest nonlinear block. The second is a flowsheet with two columns, resulting in a problem with 5668 equations in the largest nonlinear block. Two MIMD parallel machines are considered. The first is a cluster of workstations communicating through the message-passing system PVM. The other is a dedicated parallel machine, a Fujitsu AP1000. The larger problem is tested on both machines to compare the parallel performance. For the first example, the execution time is reduced by a factor of 2.4 using 8 processors and 1.5 using 4 on the Fujitsu AP1000. For the second example, the reduction reaches a factor of 2.7 using 8 processors on the cluster of workstations and a factor of 7.1 using 16 processors on the Fujitsu AP1000. (C) 1998 Elsevier Science Ltd. All rights reserved.
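The reported speedups can also be read as parallel efficiency (speedup divided by processor count), a standard measure that the abstract itself does not quote. The following is a minimal sketch using only the figures stated above; the function name and the labels are illustrative, not taken from the paper.

```python
def parallel_efficiency(speedup, processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / processors

# (label, processor count, reported reduction factor in execution time)
reported = [
    ("example 1, AP1000", 4, 1.5),
    ("example 1, AP1000", 8, 2.4),
    ("example 2, workstation cluster", 8, 2.7),
    ("example 2, AP1000", 16, 7.1),
]

efficiencies = {
    (label, p): parallel_efficiency(s, p) for label, p, s in reported
}

for (label, p), eff in efficiencies.items():
    print(f"{label}, {p} processors: efficiency {eff:.3f}")
```

On these figures every configuration stays below half of ideal linear speedup, with the 16-processor AP1000 run on the larger problem coming closest.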
In this paper, the parallel algorithms of the search, segmentation, and correction of the segmentation of zones of interest by images of locally homogeneous scenes are proposed. The results of the execution of paralle...
This article introduces a programming and transformation system for describing, optimizing and mapping parallel algorithms onto a highly parallel multiprocessor architecture. The main subject of investigation is the application of the system to real-time digital signal processing, especially digital filtering. This focus affects the class of computation schemes that can be handled as well as the computer organization (MIMD) and the interconnection structure (crossbar). The approach is illustrated by means of a fully implemented programming and transformation system. A special MIMD architecture is presented, and results comparing concurrent optimized computing time with purely sequential computing time (speed-up) are given.
Genetic sequence data typically exhibit variability in substitution rates across sites. In practice, there is often too little variation to fit a different rate for each site in the alignment, but the distribution of rates across sites may not be well modeled using simple parametric families. Mixtures of different distributions can capture more complex patterns of rate variation, but are often parameter-rich and difficult to fit. We present a simple hierarchical model in which a baseline rate distribution, such as a gamma distribution, is discretized into several categories, the quantiles of which are estimated using a discretized beta distribution. Although this approach involves adding only two extra parameters to a standard distribution, a wide range of rate distributions can be captured. Using simulated data, we demonstrate that a "beta-" model can reproduce the moments of the rate distribution more accurately than the distribution used to simulate the data, even when the baseline rate distribution is misspecified. Using hepatitis C virus and mammalian mitochondrial sequences, we show that a beta-model can fit as well as or better than a model with multiple discrete rate categories, and compares favorably with a model which fits a separate rate category to each site. We also demonstrate this discretization scheme in the context of codon models specifically aimed at identifying individual sites undergoing adaptive or purifying evolution.
The results of implementing a Navier-Stokes flow solver on the Symult Systems Series 2010 parallel processor are presented. The code solves the three-dimensional unsteady compressible Navier-Stokes equations of fluid dynamics for flow past arbitrary bodies, using an explicit finite-volume multi-stage Runge-Kutta time-stepping scheme. The implementation strategy on the Series 2010 parallel processor is discussed, and code-conversion issues such as domain decomposition and boundary condition implementation are highlighted. The performance of the code is evaluated by calculating the transonic laminar viscous flow over a wing.
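As an illustration of what an explicit multi-stage Runge-Kutta time step looks like, here is a minimal sketch on a scalar model problem. The abstract does not give the number of stages or the stage coefficients, so the four coefficients used here (1/4, 1/3, 1/2, 1), the function names, and the decay test problem are all assumptions chosen for illustration. In a finite-volume code, `residual` would be the discretized flux balance of a cell rather than a scalar function.

```python
import math

def multistage_rk_step(u, dt, residual, alphas=(0.25, 1.0 / 3.0, 0.5, 1.0)):
    """Advance du/dt = residual(u) by one step.

    Every stage restarts from the old value u and uses only the residual
    of the previous stage, so no intermediate residuals need to be stored.
    The default coefficients are an assumed, illustrative choice.
    """
    stage = u
    for alpha in alphas:
        stage = u + alpha * dt * residual(stage)
    return stage

def decay(u):
    """Model right-hand side: du/dt = -u, exact solution exp(-t)."""
    return -u

u, dt = 1.0, 0.1
for _ in range(10):  # integrate from t = 0 to t = 1
    u = multistage_rk_step(u, dt, decay)

error = abs(u - math.exp(-1.0))
print(f"u(1) = {u:.8f}, error = {error:.2e}")
```

Because each stage needs only the previous stage's residual, a scheme of this form keeps per-cell storage low, which matters when the grid is split across processors by domain decomposition.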
A distributed algorithm is described for finding a common fixed point of a family of $m > 1$ nonlinear maps $M_i : \mathbb{R}^n \to \mathbb{R}^n$, assuming that each map is a paracontraction and that such a common fixed point exists. The comm...
The design and implementation of a parallel algorithm for computing Gröbner bases on distributed memory multiprocessors is presented as a series of refinements on a transition rule program, in which computation proceeds by nondeterministic invocations of guarded commands. The data structures are designed for high throughput and latency tolerance, appropriate for distributed memory machines. The programming style represents a compromise between shared-memory and message-passing models. In the data structure design, there is a classic trade-off between locality and load balance. It is argued that this is best resolved by designing scheduling structures in tandem with the state data structures, since the decision to replicate or partition state affects the overhead of dynamically moving tasks.
We present a practical modification of the recent divide-and-conquer algorithms of [3] for approximating the eigenvalues of a real symmetric tridiagonal matrix. In this modified version, we avoid the numerical stability problems of the algorithms of [3] but preserve their insensitivity to clustering of the eigenvalues and the possibility of giving a priori upper bounds on their computational cost for any input matrix. We confirm the theoretical effectiveness of our algorithms by numerical experiments.
Software for computer simulations of adaptive optics systems for atmospheric laser applications designed on the basis of advanced parallel programming techniques is developed. The adaptive optics system model includes the emitting aperture geometry and beam propagation path scenario, vertical profiles of atmospheric parameters, fast parallel split-step Fourier algorithm for solving wave diffraction and propagation equations, time-dependent models of "frozen" atmospheric turbulence with a wide range of scales, and models of the wavefront sensor and controlled deformable mirror. The hardware system for computer simulations is an off-the-shelf desktop with a 6-core 12-thread Intel (R) Core (TM) i7-970 CPU at the maximum frequency of 3.5 GHz and an NVIDIA (R) GeForce GTX 580 graphic accelerator with 512 universal processors operating at 1.5 GHz. Results of simulations of adaptive imaging and laser beam shaping, aimed at estimating the efficiency of adaptive optics systems on atmospheric paths are presented.
Pseudo-random properties of a class of two-dimensional (2-D) 5-neighborhood cellular automata (CA), built around nonlinear (OR, AND) and linear (XOR) Boolean functions, are studied. The site values at each step of the 2-D CA evolution are taken in parallel and form pseudo-random sequences, which satisfy the criteria established for a pseudo-random number generator (PRNG): long period, excellent random qualities, single-bit error propagation (avalanche criterion), and easy and fast generation of the random bits. A block scheme for a secure stream cipher based on 2-D CA is proposed. The 2-D CA based PRNG algorithm has a simple structure, uses space-invariant and local interconnections, and can be easily realized in very large scale integration or parallel optoelectronic architectures.
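To make the idea of a 2-D 5-neighborhood CA read out in parallel concrete, here is a minimal sketch. The abstract does not specify the update rules, grid size, boundary conditions, or seeding, so everything here is an assumption: a purely linear XOR rule over the cell and its four nearest neighbours, periodic boundaries, and a hypothetical 5x5 seed. It is not the authors' generator, and a purely linear rule should not be expected to have the randomness or security properties claimed for the mixed OR/AND/XOR class.

```python
def ca_step(grid):
    """One synchronous update: each cell becomes the XOR of itself and
    its four nearest neighbours (periodic boundaries, assumed rule)."""
    n_rows, n_cols = len(grid), len(grid[0])
    return [
        [
            grid[r][c]
            ^ grid[(r - 1) % n_rows][c]
            ^ grid[(r + 1) % n_rows][c]
            ^ grid[r][(c - 1) % n_cols]
            ^ grid[r][(c + 1) % n_cols]
            for c in range(n_cols)
        ]
        for r in range(n_rows)
    ]

def ca_bits(seed_grid, steps):
    """Read all site values after each step and concatenate them
    row by row into one list of bits."""
    grid, bits = seed_grid, []
    for _ in range(steps):
        grid = ca_step(grid)
        bits.extend(bit for row in grid for bit in row)
    return bits

# Hypothetical seed: a 5x5 grid with a single set bit in the centre.
seed = [[0] * 5 for _ in range(5)]
seed[2][2] = 1

stream = ca_bits(seed, steps=4)
print("".join(str(b) for b in stream))
```

Each step depends only on a cell and its immediate neighbours, which is the locality that makes such automata natural candidates for VLSI or other parallel hardware.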