This contribution gives an overview of the use of cluster computing for the parallel and distributed solution of multidisciplinary optimization (MDO) problems based on the OpTiX-Workbench. First, a brief summary of nonsequential solution concepts for nonlinear optimization on multiprocessor systems is given. The focus is on coarse-grained parallelization and its implementation on clusters of workstations. The conceptual design objectives for the OpTiX-Workbench are presented, as well as its implementation on workstation clusters. The OpTiX system supports the steps from the formulation of MDO problems to their solution on networks of computers. To demonstrate the usefulness of cluster computing, the solutions of MDO problems from the fields of structural design and water management are discussed and numerical test results are supplied.
We describe distributed and parallel algorithms for processing remotely sensed data such as geostationary satellite imagery. We have built a distributed data repository based around the client-server computing model a...
ISBN: 0819424285 (print)
In a distributed computing environment, communication bandwidth is an important factor affecting the performance of parallel algorithms. This paper presents a new parallel volume rendering algorithm based on the shear-warp factorization of the viewing transformation, using a pipeline framework of PCs. By fully overlapping communication and computation, we overcome the communication bottleneck. In existing algorithms based on object partitioning, local rendering and image compositing are divided into two serial phases. Hardly any communication happens during local rendering. During image compositing, however, communication is very heavy and can even become congested. Furthermore, there is a large synchronization overhead in this phase. This paper addresses this drawback by performing local rendering and image compositing concurrently through a pipeline of PCs. We have experimented on a pipeline composed of 16 Pentiums. The results show that performance is not affected much by communication and that system overhead is small compared to rendering time. This paper provides a new approach to studying low-cost, high-efficiency, real-time volume rendering systems.
VISPAR is a research project partially funded by the Commission of European Countries (CEC) at the National Centre for Software Technology in India for investigating the use of parallel computing in global illumination and visualization of 3D environments. A task delegation framework has been developed based on the notion of distributed agents, which are responsible both for suitably delegating tasks to other agents and for executing tasks delegated to them, based on their core competencies. The paper describes this framework and the distributed multi-agent architecture designed specifically for obtaining high performance in interactive manipulation and walkthrough computations of very large virtual environments.
ISBN: 9780897919029 (print)
This paper presents a detailed evaluation of parallel message-driven programs on both message-passing and shared-memory parallel architectures. Four large parallel applications from the domain of VLSI computer-aided design are evaluated, namely: parallel test pattern generation, parallel cell placement, logic synthesis, and event-driven VHDL simulation. The parallelism structure, communication characteristics, locality characteristics, and grain sizes of computations are characterized for these applications, together with detailed measurements of system time, idle time, and user time. Results are presented for an Intel Paragon distributed-memory message-passing multicomputer and compared to a Sun SPARCcenter 1000E symmetric multiprocessor.
ISBN: 0819425885 (print)
This article presents a new generation of parallel processing architecture for real-time image processing. The approach is implemented in a real-time image processor chip, called the Xium(TM)-2, which combines a fully associative array, providing the parallel engine, with a serial RISC core on the same die. The architecture is fully programmable and can be programmed to implement a wide range of color image processing, computer vision and media processing functions in real time. The associative part of the chip is based on the patent-pending methodology of Associative Computing Ltd. (ACL), which condenses 2048 associative processors, each of 128 "intelligent" bits. Each bit can be a processing bit or a memory bit. At only 33 MHz and with a 0.6 micron manufacturing process, the chip has a computational power of 3 billion ALU operations per second and 66 billion string search operations per second. The fully programmable nature of the Xium(TM)-2 chip enables developers to use ACL tools to write their own proprietary algorithms combined with existing image processing and analysis functions from ACL's extended set of libraries.
ISBN: 0780343654; 0780343662 (print)
Some recent works have presented novel techniques that exploit cyclostationarity for channel identification in data communication systems using only second-order statistics. In particular, the feasibility of blind identification based on the forward shift structure of the correlation matrices of the source has been shown. In this paper we propose an alternative high-performance algorithm based on the above property but with an improved choice of the autocorrelation of the equalization matrices to be considered. The new representation of the equalization problem provides a cost function formulated as a large generalized eigenvalue problem, which can be efficiently solved by the Jacobi-Davidson method. We mainly focus on the parallel aspects of the Jacobi-Davidson method on massively parallel distributed-memory computers. The performance of this method on this kind of architecture is always limited by the global communication required for the inner products in the Modified Gram-Schmidt (MGS) process. We propose using Givens rotations, which require only local communication and thus avoid the global communication of inner products that is the bottleneck of parallel performance on distributed-memory computers. The corresponding data distribution and communication scheme are presented, together with several simulation experiments over different data transmission constellations carried out on a Parsytec GC/PowerPlus.
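As a minimal illustration of the Givens rotations mentioned in the abstract above, the following Python sketch orthogonalizes a small dense matrix by zeroing one subdiagonal entry at a time. It is a sequential textbook version, not the authors' parallel implementation; the function names `givens` and `qr_givens` are hypothetical. The point it shows is that each rotation combines only two adjacent rows, which is the property that makes purely local communication possible when rows are distributed across processors.

```python
import math

def givens(a, b):
    """Return (c, s) such that [[c, s], [-s, c]] applied to (a, b) gives (r, 0)."""
    if b == 0.0:
        return 1.0, 0.0
    r = math.hypot(a, b)
    return a / r, b / r

def qr_givens(A):
    """Factor A (a list of rows) as Q @ R using Givens rotations.

    Q is orthogonal and R is upper triangular (up to rounding).
    """
    m, n = len(A), len(A[0])
    R = [list(map(float, row)) for row in A]
    # Qt accumulates the product of all rotations, i.e. Q transposed.
    Qt = [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]
    for j in range(n):
        # Zero column j from the bottom row up to the diagonal.
        for i in range(m - 1, j, -1):
            c, s = givens(R[i - 1][j], R[i][j])
            # Each rotation touches only rows i-1 and i.
            for M in (R, Qt):
                upper, lower = M[i - 1], M[i]
                M[i - 1] = [c * u + s * l for u, l in zip(upper, lower)]
                M[i] = [-s * u + c * l for u, l in zip(upper, lower)]
    Q = [[Qt[j][i] for j in range(m)] for i in range(m)]
    return Q, R
```

In contrast, a Gram-Schmidt step needs an inner product over a whole vector, which in a distributed setting implies a global reduction.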
ISBN: 0819425885 (print)
Communication overhead is a critical factor for performance in many multiprocessor computing platforms. In this paper we present the communication performance of a large processing array built with TI 320C40 DSPs. Inter-processor communication is provided by message passing, which is a common method used in multiprocessor systems. The system was developed for image processing, so transmission of large data blocks and various forms of communication are required frequently. The processor used in this system has six built-in communication links. They are 8-bit, bi-directional links with a speed of 20 Mbytes/sec. A processing array built with these processors employs the MIMD paradigm and static interconnection. The communication performance of such a DSP network is investigated and performance results are presented. The communication functions include broadcasting, scattering, gathering and point-to-point transmission of messages.
A significant amount of research on developing software environments for parallel computers is being performed. The research efforts are classified into three categories: compilers, languages and support tools. Among the most important support tools for parallel and distributed computing are the resource management tools. These tools enable the system to reach its maximum utilization through proper management of processors, communication channels and I/O devices. The goal of partitioning and scheduling schemes is to efficiently partition the application into several tasks and assign the individual tasks to the various processors of the system.
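To make the task-assignment goal described above concrete, here is a minimal Python sketch of one well-known heuristic, longest-processing-time-first list scheduling. The abstract does not say which schemes are used, so this is an assumption chosen purely for illustration; the function name `lpt_schedule` and the task names and costs are hypothetical. It ignores communication costs and task dependencies.

```python
import heapq

def lpt_schedule(task_costs, num_processors):
    """Assign independent tasks to processors, largest task first.

    task_costs: dict mapping task name -> estimated cost.
    Returns a dict mapping processor index -> list of task names.
    """
    # Min-heap of (current load, processor index).
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}
    # Place the most expensive tasks first, each on the least-loaded processor.
    for task, cost in sorted(task_costs.items(), key=lambda kv: -kv[1]):
        load, p = heapq.heappop(heap)
        assignment[p].append(task)
        heapq.heappush(heap, (load + cost, p))
    return assignment
```

For example, `lpt_schedule({"a": 5, "b": 4, "c": 3, "d": 3, "e": 1}, 2)` balances the two processors at a load of 8 each.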
Temporal databases maintain past, present and future data. TSQL2 is a query language designed for temporal databases. In TSQL2, the GROUP BY clause has the temporal grouping property. In temporal grouping, the time line of each attribute value is partitioned into several sections, and aggregate functions are computed for each time partition. This paper describes two approaches to parallelizing an algorithm for computing temporal aggregates. The two approaches have been implemented on an SGI PowerChallenge SMP parallel system. The experimental results show that the performance of the two approaches depends on data skew ratio and the number of processors used in the computation.
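The temporal grouping described above can be sketched for a single attribute value with a COUNT aggregate. This is a sequential Python illustration only, not either of the paper's parallel approaches; the function name `temporal_count` and the half-open `(start, end)` interval representation are assumptions. The time line is cut at every interval endpoint, and the count is reported for each resulting section in which at least one tuple is valid.

```python
def temporal_count(intervals):
    """Compute COUNT over time for tuples with half-open validity periods.

    intervals: list of (start, end) pairs with start < end.
    Returns a list of (section_start, section_end, count), one entry per
    section between consecutive endpoints where count > 0.
    """
    # Net change in the number of valid tuples at each endpoint.
    events = {}
    for start, end in intervals:
        events[start] = events.get(start, 0) + 1
        events[end] = events.get(end, 0) - 1
    result = []
    active = 0
    prev = None
    for t in sorted(events):
        if prev is not None and active > 0:
            result.append((prev, t, active))
        active += events[t]
        prev = t
    return result
```

One natural way to parallelize such a computation is to split either the tuples or the time line among processors, which is where data skew would be expected to matter.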