ISBN: 3540414290 (print)
The proceedings contain 51 papers. The special focus in this conference is on System Software and Algorithms. The topics include: Charon message-passing toolkit for scientific computations; dynamic slicing of concurrent programs; an efficient run-time scheme for exploiting parallelism on multiprocessor systems; characterization and enhancement of static mapping heuristics for heterogeneous systems; optimal segmented scan and simulation of reconfigurable architectures on fixed connection networks; reducing false causality in causal message ordering; the working-set based adaptive protocol for software distributed shared memory; evaluation of the optimal causal message ordering algorithm; register efficient mergesorting; applying patterns to improve the performance of fault tolerant CORBA; design, implementation and performance evaluation of a high performance CORBA group membership protocol; analyzing the behavior of event dispatching systems through simulation; a domain-specific semi-automatic parallelization tool; practical experiences with Java compilation; performance prediction and analysis of parallel out-of-core matrix factorization; integration of task and data parallelism; parallel and distributed computational fluid dynamics; parallel congruent regions on a mesh-connected computer; can scatter communication take advantage of multidestination message passing?; a first class design constraint for future architectures; embedded computing; instruction level distributed processing; speculative multithreaded processors; a fast tree-based barrier synchronization on switch-based irregular networks; meta-data management system for high-performance large-scale scientific data access and parallel sorting algorithms with sampling techniques on clusters with processors running at different speeds.
ISBN: 0780365429 (print)
This paper adopts a transformational programming approach for deriving massively parallel algorithms from functional specifications. It gives a brief description of a framework for relating key higher-order functions such as map, reduce, and scan to communicating processes with different configurations. The parallelisation of many interesting functional algorithms can then be systematically synthesized by combining "off the shelf" parallel implementations of instances of these higher-order functions. Efficiency in the final message-passing algorithms is achieved by exploiting data parallelism, for generating the intermediate results in parallel, and functional parallelism, for processing intermediate results in stages such that the output of one stage is simultaneously input to the next one. This approach is illustrated through a case study for testing whether all the elements of a given list are distinct. The Bird-Meertens formalism is used to concisely carry out algebraic transformations.
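As a rough illustration of the case study mentioned in this abstract, the distinctness test can be written as a composition of map and reduce. This is only a sequential Python sketch of the functional specification, not the paper's derivation or its message-passing implementation; the helper names `tails`, `head_is_unique`, and `all_distinct` are assumptions introduced here.

```python
from functools import reduce


def tails(xs):
    """Return every non-empty suffix of xs, longest first."""
    return [xs[i:] for i in range(len(xs))]


def head_is_unique(suffix):
    """True when the first element does not recur later in the suffix."""
    head, rest = suffix[0], suffix[1:]
    # Inner reduce: AND together the comparisons against the head.
    return reduce(lambda acc, y: acc and y != head, rest, True)


def all_distinct(xs):
    """Specification-level test that all elements of xs are distinct."""
    # Map stage: each suffix is checked independently of the others,
    # which is where data parallelism could be exploited in principle.
    checks = map(head_is_unique, tails(list(xs)))
    # Reduce stage: combine the partial results with logical AND.
    return reduce(lambda a, b: a and b, checks, True)


print(all_distinct([3, 1, 4, 1, 5]))
print(all_distinct([3, 1, 4]))
```

Because the map stage has no dependencies between its elements and the reduce operator is associative, both stages are the kind of higher-order function instance that could be replaced by a parallel implementation.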
ISBN: 9643600572 (print)
Five different 32-bit integer multipliers are compared across various performance measures. The multipliers compared are the array multiplier, the modified Booth (radix-4) multiplier, the optimized Wallace tree multiplier, the combined modified Booth-Wallace tree multiplier, and the twin pipe serial parallel multiplier. The comparison is based on results obtained by synthesizing all multiplier architectures for an FPGA target.
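For readers unfamiliar with the modified Booth (radix-4) scheme named above, the following is a minimal software sketch of the standard recoding, which turns the multiplier into half as many signed digits in {-2, -1, 0, 1, 2}. It models only the arithmetic, not the hardware architecture evaluated in the paper; the function names and the `bits` parameter are assumptions made for this example.

```python
def booth_radix4_digits(y, bits=32):
    """Recode a signed `bits`-wide integer into radix-4 Booth digits.

    Digits are returned least significant first, each in {-2, ..., 2}.
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix-4 recoding")
    if not -(1 << (bits - 1)) <= y < (1 << (bits - 1)):
        raise ValueError("y does not fit in the given width")
    u = y & ((1 << bits) - 1)  # two's complement bit pattern of y
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        b0 = (u >> i) & 1
        b1 = (u >> (i + 1)) & 1
        # Standard radix-4 Booth digit: b0 + prev - 2*b1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(x, y, bits=32):
    """Multiply x by y by summing one partial product per Booth digit."""
    digits = booth_radix4_digits(y, bits)
    return sum(d * x * (4 ** i) for i, d in enumerate(digits))


print(booth_radix4_digits(7, bits=4))
print(booth_multiply(7, -3))
```

The recoding halves the number of partial products compared with a bit-by-bit array multiplier, which is the usual motivation for combining it with a Wallace tree reduction.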
Higher-order statistics or cumulants, and their associated Fourier transforms, have been established as powerful analytical tools in modern signal processing. To achieve real-time performance in estimating cumulants directly from the incoming time-series data, it is necessary to design a VLSI-implementable parallel architecture that speeds up the estimation process. This paper presents a computationally efficient VLSI architecture for computing third-order cumulants for two-dimensional signals. Specifically, the third-order cumulant estimation algorithm is first reformulated so that any redundancy due to symmetry properties is eliminated, and the inherently available parallelism is revealed and exploited by a suitable architecture. The architecture is based on a systolic array implementation and exploits parallelism, pipelining, and regular cell structures. The system architecture consists of $3q^2 + 9q + 2$ processing elements (PEs), where $q$ is the maximum lag of the third-order cumulant sequence. Performance in terms of speedup and efficiency is evaluated.
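To make the quantity being estimated concrete, here is a small sketch of a direct third-order cumulant estimator. It is deliberately simplified: it handles a one-dimensional signal (the paper treats two-dimensional signals), uses the common biased estimate with non-negative lags only, and exploits just one symmetry, the exchange of the two lags. The function name and the dictionary return type are assumptions of this example, and nothing here reflects the paper's systolic array design.

```python
def third_order_cumulant(x, q):
    """Biased third-order cumulant estimate for lags 0 <= t1, t2 <= q.

    Returns a dict mapping (t1, t2) to the estimate
    (1/N) * sum_k z[k] * z[k + t1] * z[k + t2], where z is x with its
    sample mean removed.
    """
    n = len(x)
    if not 0 <= q < n:
        raise ValueError("q must satisfy 0 <= q < len(x)")
    mean = sum(x) / n
    z = [v - mean for v in x]
    c = {}
    for t1 in range(q + 1):
        # Only compute t2 <= t1; the estimate is symmetric in (t1, t2).
        for t2 in range(t1 + 1):
            s = sum(z[k] * z[k + t1] * z[k + t2] for k in range(n - t1))
            c[(t1, t2)] = c[(t2, t1)] = s / n
    return c


c3 = third_order_cumulant([1.0, -1.0, 0.0], q=1)
print(c3[(1, 0)], c3[(0, 1)], c3[(1, 1)])
```

Each lag pair is an independent sum of triple products, which is the kind of regular, redundancy-free workload that maps naturally onto an array of identical processing elements.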
ISBN: 3540664432 (print)
This paper describes a parallel architecture for a variety of video compression algorithms. It has been designed to meet the requirements of encoding and decoding according to the ITU-T standard H.263. The architecture is an implementation of the instruction systolic array (ISA) model, which combines the simplicity of systolic arrays with the flexibility of a programmable parallel computer. Although the parallel accelerator unit is implemented on no more than 9 mm² of silicon, it suffices to meet the compression rate necessary to send a compressed video stream through a standard ISDN terminal interface.
ISBN: 9783540491644 (digital)
ISBN: 3540656413 (print)
Following in the wake of the Accelerated Strategic Computing Initiative (ASCI) of the US Department of Energy, in the forthcoming years powerful new supercomputers will be brought onto the market by the manufacturers participating in the high-performance computing race. Hence, the large-scale computing facilities in the key research centers and industrial plants worldwide will surpass the teraflops performance barrier, too. Parallel architectures will be further extended to hierarchically clustered parallel computers, mainly based on commodity-chip processors and SMP nodes, tying together possibly tens of thousands of processing elements. In addition, heterogeneous computing and metacomputing will determine future large-scale computing by interconnecting supercomputers of diverse architectures as giant supercomputer complexes. These developments will push not only system reliability, availability and serviceability to new levels, but will also challenge the interactivity of concurrent algorithms and, in particular, the adaptivity, accuracy and stability of parallel numerical methods.
The paper discusses the possibilities of designing TV systems with adaptive parallel pre-processing of signals. The efficiency of the system is calculated in comparison with TV systems which have the coarse structure ...
ISBN: 3540663630 (print)
The proceedings contain 59 papers. The special focus in this conference is on Parallel Computing in Regular Structures. The topics include: analytical modeling of parallel application in heterogeneous computing environments; skeletons and transformations in an integrated parallel programming environment; sequential unification and aggressive lookahead mechanisms for data memory accesses; a coordination model and facilities for efficient parallel computation; parallelizing of sequential programs on the basis of pipeline and speculative features of the operators; kinetic model of parallel data processing; PSA approach to population models for parallel genetic algorithms; highly accurate numerical methods for incompressible 3D fluid flows on parallel architectures; dynamic task scheduling with precedence constraints and communication delays; two-dimensional scheduling of algorithms with uniform dependencies; consistent Lamport clocks for asynchronous groups with process crashes; comparative analysis of learning methods of cellular-neural associative memory; emergence and propagation of round autowave in cellular neural network; routing and embeddings in super Cayley graphs; implementing cellular automata based models on parallel architectures; overview, design innovations, and preliminary results; implementing model checking and equivalence checking for time Petri nets by the RT-MEC tool; learning concurrent programming; the speedup performance of an associative memory based logic simulator; a high-level programming environment for distributed memory architectures; virtual shared files; an object oriented environment to manage the parallelism of the FIIT applications; performance studies of shared-nothing parallel transaction processing systems; synergetic tool environments and logically instantaneous communication on top of distributed memory parallel machines.
ISBN: 0852967217 (print)
This paper describes the application of the Relaxation By Elimination (RBE) method to matching the 3D structure of molecules in chemical databases within the framework of binary correlation matrix memories. The paper illustrates that, when combined with distributed representations, the method maps well onto these networks, allowing high-performance implementation in parallel systems. It outlines the motivation, the neural architecture, and the RBE method, and presents some results of matching small molecules against a database of 100,000 models.
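The binary correlation matrix memory underlying this work can be sketched in a few lines. The example below shows only the generic store-and-recall mechanism (OR-ed outer products, followed by a thresholded matrix-vector product); it does not implement RBE or the paper's molecular encoding, and the function names, vector sizes, and threshold are assumptions chosen for illustration.

```python
def train(pairs, n_in, n_out):
    """Build a binary weight matrix by OR-ing outer products of the pairs."""
    m = [[0] * n_in for _ in range(n_out)]
    for inp, out in pairs:
        for r in range(n_out):
            if out[r]:
                for c in range(n_in):
                    if inp[c]:
                        m[r][c] = 1  # set the bit; repeated stores keep it set
    return m


def recall(m, inp, threshold):
    """Sum matching set bits per row, then threshold to a binary output."""
    sums = [sum(w & b for w, b in zip(row, inp)) for row in m]
    return [1 if s >= threshold else 0 for s in sums]


# Two hypothetical input/output associations.
pairs = [([1, 0, 1, 0], [1, 0]), ([0, 1, 0, 1], [0, 1])]
memory = train(pairs, n_in=4, n_out=2)
print(recall(memory, [1, 0, 1, 0], threshold=2))
```

Each output row is computed independently from the same input vector, which is one reason such memories are considered well suited to parallel implementation.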