Summary form only given. This article gives a brief overview of theoretical advances, computing trends, applications and future perspectives in parallel genetic algorithms. It explains the basic terms and behavior of (parallel) genetic algorithms. Genetic algorithms are easily parallelized; therefore two possible kinds of parallelism, data parallelism and control parallelism, are described in relation to them. Parallelizing genetic algorithms brings many advantages and gains. Classifications of these algorithms are often based on the type of computing model, the walk strategy and the computing machinery used. Afterwards, significant milestones in the theory, together with the latest advances, are briefly mentioned. Then current trends in parallel computing are reviewed shortly, with stress on computer architectures of parallel systems, interconnection topologies, operating systems, parallel (genetic) libraries and programming paradigms. Sufficient space is devoted to the latest applications of parallel genetic algorithms. After the discussion section, perspectives of the algorithms are predicted up to the year 2005. In all chapters, the information in the article is divided into two periods, before and after the year 2000. The second period is more interesting and of higher importance, because it highlights recent research efforts and gives some hints about possible future trends; that is why we devote much space to it. As there is no such overview of the recent period of parallel genetic algorithms, our investigation could be appealing and useful in many respects.
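The data-parallel (island) style this abstract refers to can be sketched in a few lines. The following is a minimal illustration on the OneMax problem; the parameter choices (island count, migration interval, tournament selection, ring topology) are assumptions for demonstration, not taken from the article, and the islands are evolved in a sequential loop that a real implementation would distribute across processors.

```python
import random

random.seed(1)

GENOME_LEN = 32          # bit-string length; OneMax fitness = number of 1s
ISLANDS = 4              # each island could evolve on its own processor
POP_PER_ISLAND = 20
GENERATIONS = 60
MIGRATE_EVERY = 10       # generations between migrations

def fitness(ind):
    return sum(ind)

def random_ind():
    return [random.randint(0, 1) for _ in range(GENOME_LEN)]

def evolve(pop):
    """One generation: binary tournament selection, one-point crossover, mutation."""
    new = []
    for _ in range(len(pop)):
        a, b = random.sample(pop, 2)
        parent1 = max(a, b, key=fitness)
        a, b = random.sample(pop, 2)
        parent2 = max(a, b, key=fitness)
        cut = random.randrange(1, GENOME_LEN)
        child = parent1[:cut] + parent2[cut:]
        if random.random() < 0.1:            # occasional single-bit mutation
            i = random.randrange(GENOME_LEN)
            child[i] ^= 1
        new.append(child)
    return new

islands = [[random_ind() for _ in range(POP_PER_ISLAND)] for _ in range(ISLANDS)]
for gen in range(1, GENERATIONS + 1):
    islands = [evolve(pop) for pop in islands]   # independent, parallelizable step
    if gen % MIGRATE_EVERY == 0:                 # ring migration of each island's best
        bests = [max(pop, key=fitness) for pop in islands]
        for i, pop in enumerate(islands):
            pop[pop.index(min(pop, key=fitness))] = bests[(i - 1) % ISLANDS]

best = max((ind for pop in islands for ind in pop), key=fitness)
print(fitness(best))
```

The loop over islands is the point of the sketch: each island's generation step touches only its own subpopulation, so only the infrequent migration step requires communication.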
The new computational technologies are having a very strong influence on numerical optimization in several different ways. Many researchers have been stimulated by the need either to adapt existing numerical techniques to the new parallel architectures or to devise completely new parallel solution approaches. A mini-symposium on Parallel Computing in Nonlinear Optimization was held in Naples, Italy, in September 2001, during the international conference ParCo2001, in order to bring together researchers active in this field and to discuss and share their findings. Some of the papers presented during the mini-symposium, as well as additional contributions from other researchers, are collected in this special issue. Two different trends, well representative of most current research activities, can clearly be identified. Firstly, there is an attempt to encapsulate parallel linear algebra software and algorithms into optimization codes, particularly codes implementing interior point strategies, for which the linear algebra issues are very critical; secondly, there is an effort to devise new parallel solution strategies in global optimization, for either specific or general purpose problems, motivated by their large size and combinatorial nature. In the present paper we review the literature on these trends and classify the contributed papers within this framework. (C) 2003 Elsevier Science B.V. All rights reserved.
The following topics are dealt with: Grid and distributed computing; scheduling task systems; shared-memory multiprocessors; imaging and visualization; testing and debugging; performance analysis and real-time systems; scheduling for heterogeneous resources; networking; peer-to-peer and mobile computing; compiler technology and run-time systems; load balancing; network routing; parallel programming models; parallel algorithms; scheduling and storage; parallel and distributed performance; software for high performance clusters; decentralized algorithms; multithreading and VLIW; parallel and distributed real-time systems; high-level parallel programming models and supportive environments; Java for parallel and distributed computing; nature inspired distributed computing; high performance computational biology; advances in parallel and distributed computational models; reconfigurable architectures; communication architecture for clusters; next generation systems; fault-tolerant parallel and distributed systems; wireless, mobile and ad hoc networks; parallel and distributed image processing, video processing, and multimedia; formal methods for parallel programming; Internet computing and e-commerce; parallel and distributed scientific and engineering computing with applications; massively parallel processing; performance modeling, evaluation, and optimization of parallel and distributed systems; and parallel and distributed systems: testing and debugging.
The MiPPS library supports a hybrid model of parallel programming. The library is targeted at commodity multiprocessors, with support for clusters. The implementation of the concurrency routines reveals discrepancies between popular operating systems. Tests on suitable applications also reveal similar discrepancies in performance across different multiprocessors. The MiPPS library has also been the basis of a parallelization of the Active Chart Parsing algorithm for speech understanding.
Cluster systems are becoming increasingly popular since they provide large computational power at a reasonable price. The cluster nodes are often SMPs with a small number of processors that can access a shared address space. The nodes are connected by a network such as Myrinet or SCI, so the global address space is distributed. In this paper, we present a new programming model for such clusters of SMPs. The model allows the programmer to adapt his program to the two-level structure of the address space by providing a micro-level and a macro-level. The micro-level allows a thread formulation of multiprocessor tasks that are executed within a node of the cluster system. The macro-level allows the hierarchical structuring of multiprocessor tasks according to the structure of the algorithm, using message passing for data exchange. We demonstrate the usefulness of the approach by runtime tests on several cluster systems with different node architectures and different interconnection networks.
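The abstract does not give the model's actual API, but the two-level idea can be simulated in a short sketch: the macro-level message passing is replaced by a queue (standing in for MPI-style communication between nodes), and each "node" runs a micro-level multiprocessor task as a group of threads over shared data. The function name node_task and all sizes are hypothetical.

```python
import threading
import queue

NODE_COUNT = 2           # simulated cluster nodes (macro-level units)
THREADS_PER_NODE = 4     # micro-level threads sharing a node's address space
DATA = list(range(1, 101))   # global work, split across nodes

def node_task(node_id, chunk, outbox):
    """One multiprocessor task: threads cooperate on shared 'partial',
    then the node sends its result as a single macro-level message."""
    partial = [0] * THREADS_PER_NODE

    def worker(tid):
        # micro-level: threads share 'partial' within the node
        partial[tid] = sum(chunk[tid::THREADS_PER_NODE])

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(THREADS_PER_NODE)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    outbox.put((node_id, sum(partial)))   # macro-level: message passing

result_q = queue.Queue()
half = len(DATA) // NODE_COUNT
nodes = [threading.Thread(target=node_task,
                          args=(i, DATA[i * half:(i + 1) * half], result_q))
         for i in range(NODE_COUNT)]
for n in nodes:
    n.start()
for n in nodes:
    n.join()

total = sum(result_q.get()[1] for _ in range(NODE_COUNT))
print(total)   # 5050
```

On a real cluster the outer threads would be processes on separate nodes and the queue an interconnect-level message layer; the structure of the program, however, is exactly the hierarchical nesting the abstract describes.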
In this paper, we analyze the performance of the floating-point digital signal processor (DSP) TMS320C6711 for an implementation of motion estimation for video coding. Two relevant motion estimation techniques were implemented: BMA (block matching algorithm) and BMGT (block matching using geometric transforms). These have been combined with fast block matching algorithms to speed up the process. In order to increase the DSP performance, we have optimized several programming mechanisms, such as the level of code parallelism, hand-designed assembly code and an efficient usage of internal memory as cache. This implementation has shown that real-time motion estimation of the BMA type can be implemented on this DSP. However, BMGT-type motion estimation cannot be done by one DSP alone in real-time applications, due to its high computational complexity.
This work presents two emerging media microprocessors, VIRAM and Imagine, and compares the implementation strategies and performance results of these unique architectures. VIRAM is a complete system on a chip which uses PIM technology to combine vector processing with embedded DRAM. Imagine is a programmable streaming architecture with a specialized memory hierarchy designed for computationally intensive data-parallel codes. First, we present a simple and effective approach for understanding and optimizing vector/stream applications. Performance results are then presented from a number of multimedia benchmarks and a computationally intensive scientific kernel. We explore the complex interactions between the programming paradigms, the architectural support at the ISA level and the underlying microarchitecture of these two systems. Our long-term goal is to evaluate leading media microprocessors as possible building blocks for future high performance systems.
The demand for mobile computing power has exploded in recent years. Variable-length VLIW processors offer the necessary performance at low power. Software optimizations are necessary to further decrease the energy consumption. In this article we present a compiler optimization which reduces the dynamic power dissipation resulting from the switching activities during instruction fetch. Energy consumption can be reduced by minimizing the Hamming distance between successively fetched instruction words. Using a dynamic programming approach, we first compute a set of optimal instruction arrangements of the execution bundles in a basic block. These sets are used in an enumerative optimal algorithm and a genetic algorithm in order to minimize an objective function for the Hamming distance. We evaluated our algorithms on different variable-length VLIW architectures with 3 to 6 parallel functional units. On a large set of DSP benchmark programs the Hamming distance can be reduced by about 10% on average; maximum reductions range up to 30%.
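The Hamming-distance objective can be made concrete with a small sketch. Two caveats: the paper computes optimal arrangements with dynamic programming plus enumerative and genetic search, whereas the greedy per-bundle permutation below is a simplified stand-in; and it assumes operations may be freely permuted among slots, ignoring the functional-unit constraints a real VLIW imposes. All bit patterns are invented for illustration.

```python
from itertools import permutations

def hamming(a, b):
    """Number of differing bits between two operation words (here, small ints)."""
    return bin(a ^ b).count("1")

def bundle_cost(prev, cur):
    """Switching activity proxy: slot-wise Hamming distance between two
    successively fetched execution bundles."""
    return sum(hamming(p, c) for p, c in zip(prev, cur))

def arrange(bundles):
    """Greedy stand-in for the paper's optimization: permute the operations
    of each bundle to minimize the Hamming distance to the bundle fetched
    just before it."""
    out = [list(bundles[0])]
    for b in bundles[1:]:
        best = min(permutations(b), key=lambda p: bundle_cost(out[-1], p))
        out.append(list(best))
    return out

# Three 3-slot bundles of 4-bit "operation words" (illustrative only).
bundles = [[0b1010, 0b0101, 0b1111],
           [0b0101, 0b1010, 0b0000],
           [0b1110, 0b0001, 0b1011]]
plain = sum(bundle_cost(a, b) for a, b in zip(bundles, bundles[1:]))
opt = arrange(bundles)
better = sum(bundle_cost(a, b) for a, b in zip(opt, opt[1:]))
print(plain, better)
```

Even this greedy variant shows the effect the abstract quantifies: reordering slots leaves each bundle's operation set intact but lowers the total bit toggling on the fetch path.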
ISBN (print): 0769515797
The design of memory consistency models for both hardware and software is a difficult task. It is particularly difficult for a programming language because the target audience is much wider than the target audience for a machine language, making usability a more important criterion. Adding to this problem is the fact that the programming language community has little experience designing programming language consistency models, and therefore each new attempt is very much a voyage into uncharted territory. A concrete example of the difficulties of the task is the current Java Memory Model. Although designed to be easy for Java programmers to use, it is poorly understood, and at least one common idiom (the "double check idiom") that exploits the model is unsafe. In this paper we describe the design of an optimizing Java compiler that will accept, either as input or as an interface implementation, a consistency model for the code to be compiled. The compiler will use escape analysis, Shasha and Snir's delay set analysis, and our CSSA program representation to normalize the effects of different consistency models on optimizations and analysis. When completed, the compiler will serve as a testbed to prototype new memory models, and to measure the effects of different memory models on program performance.