ISBN: 0780365429 (print)
This paper adopts a transformational programming approach for deriving massively parallel algorithms from functional specifications. It briefly describes a framework for relating key higher-order functions such as map, reduce, and scan to communicating processes with different configurations. Parallel versions of many interesting functional algorithms can then be systematically synthesized by combining "off the shelf" parallel implementations of instances of these higher-order functions. Efficiency in the final message-passing algorithms is achieved by exploiting data parallelism, to generate the intermediate results in parallel, and functional parallelism, to process intermediate results in stages such that the output of one stage is simultaneously the input to the next. The approach is illustrated through a case study that tests whether all the elements of a given list are distinct. The Bird-Meertens formalism is used to carry out the algebraic transformations concisely.
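As a rough illustration of the case study, the distinctness test can be written as a composition of scan, map, and reduce. This is a sequential sketch only, not the paper's derivation or its message-passing algorithm; the function name `all_distinct` and the prefix-based formulation are assumptions made for this example.

```python
from functools import reduce
from itertools import accumulate


def all_distinct(xs):
    """Return True when no element of xs occurs more than once."""
    # scan: build the prefixes [], [x0], [x0, x1], ... of the input list.
    # (Hypothetical formulation; requires Python 3.8+ for `initial`.)
    prefixes = accumulate(xs, lambda seen, x: seen + [x], initial=[])
    # map: check each element against the prefix that precedes it.
    # The checks are independent of each other, which is where data
    # parallelism could be exploited.
    checks = map(lambda pair: pair[1] not in pair[0], zip(prefixes, xs))
    # reduce: combine the per-element results with logical "and".
    return reduce(lambda a, b: a and b, checks, True)
```

Because the scan stage feeds the map stage, which in turn feeds the reduce stage, the same structure also suggests a staged pipeline in the sense of the functional parallelism described above.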
In this paper, a pipelined architecture for the inverse discrete cosine transform (IDCT) is presented. Pipeline architectures are popular in parallel fast Fourier transform implementations, but they are rare in IDCT implementations because of the irregularities in fast IDCT algorithms. The proposed architecture is derived by applying vertical projection to an in-place IDCT algorithm. The resulting structure is modular and easy to pipeline. The word width requirements of the internal arithmetic are estimated so as to fulfil the requirements set by the IEEE standard for the 8×8 inverse discrete cosine transform.
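For orientation, the transform that such an architecture computes can be written directly from its definition. The sketch below is a floating-point reference model with orthonormal scaling, evaluated row by row and then column by column. It is not the fast in-place algorithm or the pipelined structure of the paper, and the names `idct_1d` and `idct_2d` are chosen here for illustration. A model of this kind is the sort of baseline against which fixed-point word widths are typically checked.

```python
import math

N = 8  # block size of the 8x8 transform


def idct_1d(coeffs):
    """Direct O(N^2) 8-point inverse DCT with orthonormal scaling."""
    out = []
    for n in range(N):
        acc = 0.0
        for k in range(N):
            # Scale factor: sqrt(1/N) for the DC term, sqrt(2/N) otherwise.
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            acc += scale * coeffs[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(acc)
    return out


def idct_2d(block):
    """Separable 8x8 inverse DCT: transform the rows, then the columns."""
    rows = [idct_1d(row) for row in block]
    cols = [idct_1d(list(col)) for col in zip(*rows)]
    # Transpose back so the result is indexed as [row][column].
    return [list(row) for row in zip(*cols)]
```

A block containing only a DC coefficient of 8 transforms to a constant block of ones, which makes a convenient sanity check.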
ISBN: 9643600572 (print)
Five different 32-bit integer multipliers are compared on several performance measures. The multipliers included in the comparison are the array multiplier, the modified Booth (radix-4) multiplier, an optimized Wallace tree multiplier, a combined modified Booth-Wallace tree multiplier, and a twin pipe serial-parallel multiplier. The comparison is based on results obtained by synthesizing all multiplier architectures for an FPGA target.
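To make the radix-4 modified Booth scheme concrete, the following sketch recodes a signed multiplier into digits from {-2, -1, 0, 1, 2} and sums the shifted partial products. It is a behavioural model in software, not any of the synthesized architectures compared in the paper; the function names and the default width of 32 bits are assumptions for this example.

```python
def booth_radix4_digits(y, bits=32):
    """Recode a signed `bits`-bit integer into radix-4 Booth digits.

    Digit k is -2*y[2k+1] + y[2k] + y[2k-1], with y[-1] taken as 0,
    so each digit lies in {-2, -1, 0, 1, 2}.
    """
    assert bits % 2 == 0
    assert -(1 << (bits - 1)) <= y < (1 << (bits - 1))
    pattern = y & ((1 << bits) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the least significant bit
    for i in range(0, bits, 2):
        low = (pattern >> i) & 1
        high = (pattern >> (i + 1)) & 1
        digits.append(low + prev - 2 * high)
        prev = high
    return digits


def booth_multiply(x, y, bits=32):
    """Multiply x by y using bits/2 partial products instead of bits."""
    # Each digit selects 0, +-x, or +-2x; digit k is weighted by 4**k.
    return sum((d * x) << (2 * k)
               for k, d in enumerate(booth_radix4_digits(y, bits)))
```

The recoding halves the number of partial products (16 for a 32-bit multiplier), which is the property that makes the scheme attractive in combination with a Wallace tree.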
This paper examines implementations of a multi-layer perceptron (MLP) on bus-based shared memory (SM) and on distributed memory (DM) multiprocessor systems. The goal has been to optimize the HW and SW architectures in order to obtain the fastest possible response. Parallel MLP algorithms for up to 8 processing nodes, with DM as well as SM, were prototyped using the CSP-based TRANSIM tool. The results of prototyping MLPs of different sizes on various numbers of processing nodes demonstrate the feasible speedups, efficiency and response times for a given CPU speed, link speed or bus bandwidth.
ISBN: 3540656413 (print)
Following in the wake of the Accelerated Strategic Computing Initiative (ASCI) of the US Department of Energy, powerful new supercomputers will be brought to market in the coming years by the manufacturers participating in the high-performance computing race. Hence, the large-scale computing facilities in key research centers and industrial plants worldwide will also surpass the teraflops performance barrier. Parallel architectures will be further extended to hierarchically clustered parallel computers, mainly based on commodity-chip processors and SMP nodes, tying together possibly tens of thousands of processing elements. In addition, heterogeneous computing and metacomputing will shape future large-scale computing by interconnecting supercomputers of diverse architectures into giant supercomputer complexes. These developments will push not only system reliability, availability and serviceability to new levels, but also the interactivity of concurrent algorithms and, in particular, the adaptivity, accuracy and stability of parallel numerical methods.
ISBN: 3540664432 (print)
This paper describes a parallel architecture for a variety of video compression algorithms. It has been designed to meet the requirements of encoding and decoding according to the ITU-T standard H.263. The architecture is an implementation of the instruction systolic array (ISA) model, which combines the simplicity of systolic arrays with the flexibility of a programmable parallel computer. Although the parallel accelerator unit is implemented on no more than 9 mm² of silicon, it suffices to meet the compression rate necessary to send a compressed video stream through a standard ISDN terminal interface.
ISBN: 3540663630 (print)
The proceedings contain 59 papers. The special focus in this conference is on Parallel Computing in Regular Structures. The topics include: analytical modeling of parallel application in heterogeneous computing environments; skeletons and transformations in an integrated parallel programming environment; sequential unification and aggressive lookahead mechanisms for data memory accesses; a coordination model and facilities for efficient parallel computation; parallelizing of sequential programs on the basis of pipeline and speculative features of the operators; kinetic model of parallel data processing; PSA approach to population models for parallel genetic algorithms; highly accurate numerical methods for incompressible 3D fluid flows on parallel architectures; dynamic task scheduling with precedence constraints and communication delays; two-dimensional scheduling of algorithms with uniform dependencies; consistent Lamport clocks for asynchronous groups with process crashes; comparative analysis of learning methods of cellular-neural associative memory; emergence and propagation of round autowave in cellular neural network; routing and embeddings in super Cayley graphs; implementing cellular automata based models on parallel architectures; overview, design innovations, and preliminary results; implementing model checking and equivalence checking for time Petri nets by the RT-MEC tool; learning concurrent programming; the speedup performance of an associative memory based logic simulator; a high-level programming environment for distributed memory architectures; virtual shared files; an object oriented environment to manage the parallelism of the FIIT applications; performance studies of shared-nothing parallel transaction processing systems; synergetic tool environments and logically instantaneous communication on top of distributed memory parallel machines.
With the emergence of new and sophisticated control devices like data gloves and data suits, there is an increasing need to integrate gestural expression into the musical composition and performance environment. In su...
ISBN: 3540656413 (print)
The proceedings contain 69 papers. The special focus in this conference is on Parallel Numerics, Parallel Computing in Image Processing, Video Processing, and Multimedia. The topics include: non-standard parallel solution strategies for distributed sparse linear systems; optimal tridiagonal solvers on mesh interconnection networks; parallel pivots LU algorithm on the Cray T3E; experiments with parallel one-sided and two-sided algorithms for SVD; combined systolic array for matrix portrait computation; a class of explicit two-step Runge-Kutta methods with enlarged stability regions for parallel computers; a parallel strongly implicit algorithm for solution of diffusion equations; a parallel algorithm for Lagrange interpolation on k-ary n-cubes; long range correlations among multiple processors; a Monte Carlo method with inherent parallelism for numerical solving partial differential equations with boundary conditions; blocking techniques in numerical software; HPF and numerical libraries; an object library for parallel sparse array computation; performance analysis and derived parallelization strategy for a SCF program at the Hartree-Fock level; computational issues in optimizing ophthalmic lens; parallel finite element modeling of solidification processes; architectural approaches for multimedia processing; on parallel reconfigurable architectures for image processing; parallel multiresolution image segmentation with watershed transformation and solving irregular inter-processor data dependency in image understanding tasks.