A close coupling of an SIMD engine with an MIMD architecture appears to be a promising new area of development of massively parallel processors. On the one hand, the high computational efficacy of the SIMD machines should be preserved in future MPP developments; on the other, the flexibility of the MIMD approach should foster novel applications of MPP. The authors intend to discuss the hurdles that arise in closely integrating two such architectures, while at the same time presenting, with examples from some application fields, the features and functionalities that make this evolution such an interesting opportunity.
The proceedings contain 32 papers. Topics discussed include algorithms for parallelization, distributed computer systems and networking, software tools and environments, parallel finite and boundary elements, applications in fluid flow, and applications in applied science.
Vector computers have been used extensively for years in matrix algebra to treat large dense matrix problems. However, if matrices are sparse and special storage schemes are used for them, vectorization provides poor performance because of the large number of indirections in the code. An alternative is to use a multiprocessor (or a cluster of workstations); in this case, a data parallel programming model also fails, for the same reason pointed out for vector computers. Therefore, the best choice is to parallelize the corresponding algorithms using message passing routines. In order to discuss these features, we focus on solving sparse linear least squares problems, which appear in several scientific areas such as structural analysis, geodetic survey, molecular structure and many others. Experimental results are obtained for vector and parallel computer architectures.
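The indirection mentioned in this abstract can be illustrated with a minimal sketch, assuming compressed sparse row (CSR) storage; the abstract does not say which storage scheme the authors use, and the function names `dense_to_csr` and `csr_matvec` are hypothetical. The inner loop reads `x` through the column-index array, which is the kind of gather access that tends to hinder vectorization.

```python
def dense_to_csr(dense):
    """Convert a dense row-major matrix (list of lists) to CSR arrays."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)   # nonzero entry
                col_idx.append(j)  # its column index
        row_ptr.append(len(values))  # end of this row within `values`
    return values, col_idx, row_ptr


def csr_matvec(values, col_idx, row_ptr, x):
    """Compute y = A x for a matrix A stored in CSR form."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # x[col_idx[k]] is an indirect (gather) access into x
            y[i] += values[k] * x[col_idx[k]]
    return y


# Small illustrative matrix (assumed data, not from the paper)
A = [[4.0, 0.0, 0.0],
     [0.0, 0.0, 2.0],
     [1.0, 0.0, 3.0]]
values, col_idx, row_ptr = dense_to_csr(A)
y = csr_matvec(values, col_idx, row_ptr, [1.0, 2.0, 3.0])
```

Only the nonzero entries are stored and touched, at the price of one extra index lookup per multiplication.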
ISBN: 0818677589 (print)
Performance analysis of multiprocessor architectures in the early design phases is an important task in the development of complex parallel architectures. An approach for the hierarchical modeling and analysis of a wide class of multiprocessor architectures is introduced. The technique combines and extends several efficient approaches to analyze large CTMCs underlying hierarchical models of multiprocessor systems. First, the generator matrix is represented in a very compact format which can be exploited in efficient numerical solution techniques. Second, symmetries inherent in most multiprocessor systems can be exploited by generating a CTMC with a smaller state space, which results from exact aggregation. Symmetry exploitation is fully automated and keeps the compact representation of the generator matrix.
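Exact aggregation through symmetry can be sketched on a toy model. The example below is an assumption for illustration, not the authors' tool or model: two identical processors that each fail at rate `lam` and are repaired at rate `mu`. Because the processors are interchangeable, states with the same number of failed processors can be merged, and the helper `lump` (a hypothetical name) checks the ordinary lumpability condition before building the smaller generator.

```python
from itertools import product


def build_generator(lam, mu):
    """Generator matrix of two independent processors (0 = up, 1 = down)."""
    states = list(product((0, 1), repeat=2))
    index = {s: i for i, s in enumerate(states)}
    Q = [[0.0] * len(states) for _ in states]
    for s in states:
        for p in range(2):
            t = list(s)
            t[p] = 1 - s[p]  # processor p changes state
            Q[index[s]][index[tuple(t)]] = lam if s[p] == 0 else mu
    for i in range(len(states)):
        Q[i][i] = -sum(Q[i])  # rows of a generator sum to zero
    return states, Q


def lump(states, Q, block_of):
    """Aggregate states by block_of(state); fail if not exactly lumpable."""
    blocks = sorted({block_of(s) for s in states})
    members = {b: [i for i, s in enumerate(states) if block_of(s) == b]
               for b in blocks}
    lumped = []
    for b in blocks:
        row = []
        for c in blocks:
            # total rate from each state of block b into block c
            rates = [sum(Q[i][j] for j in members[c]) for i in members[b]]
            if max(rates) - min(rates) > 1e-12:
                raise ValueError("partition is not exactly lumpable")
            row.append(rates[0])
        lumped.append(row)
    return lumped


states, Q = build_generator(lam=1.0, mu=3.0)
lumped = lump(states, Q, block_of=sum)  # block = number of failed processors
```

With N identical processors the same idea shrinks 2^N states to N + 1 aggregated states.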
The proceedings contain 80 papers from the Fourth International Conference on High Performance Computing. Topics discussed include: database management systems (DBMS); data migration and caching; algorithms; programming and languages; load balancing and scheduling; reconfigurable custom computing; routing; instruction level parallelism (ILP) architectures and compiler issues; parallel input/output and multithreaded systems; virtual channels; and image processing.
ISBN: 0780341376 (print)
This paper presents a complete methodology for the automatic synthesis of VLSI architectures used in digital signal processing. Most signal processing algorithms have the form of an n-dimensional nested loop with unit uniform loop-carried dependencies. We model such algorithms with generalized UET grids. We calculate the optimal makespan for the generalized UET grids and then establish the minimum number of systolic cells required to achieve the optimal makespan. We present a complete methodology for the hardware synthesis of the resulting architecture, based on VHDL. This methodology automatically detects all necessary computation and communication elements and produces optimal layouts. The complexity of our proposed scheduling policy is completely independent of the size of the nested loop and depends only on its dimension, thus being the most efficient (in terms of complexity) known to us. All these methods were implemented and incorporated in an integrated software package which provides the designer with a powerful parallel design environment, from high-level signal processing algorithmic specifications to low-level (i.e., actual layouts) optimal implementation. The evaluation was performed using well-known algorithms from signal processing.
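The idea of a makespan that depends on the loop bounds only through a closed form can be sketched for the simplest case. The sketch assumes a plain grid in which every point depends only on its predecessor along each axis and every task takes one time unit; the generalized UET grids of the paper may allow other dependence vectors, and `wavefront_schedule` is a hypothetical helper, not the authors' scheduling policy. Each point is started at step "sum of its indices plus one", which matches the longest dependence chain.

```python
from itertools import product


def wavefront_schedule(sizes):
    """Schedule each grid point at step sum(indices) + 1.

    Returns (makespan, busiest_step_width) for a grid with unit
    dependence vectors and unit execution times.
    """
    n = len(sizes)
    start = {p: sum(p) + 1 for p in product(*(range(s) for s in sizes))}
    # Check that every point starts after its predecessor along each axis.
    for p, t in start.items():
        for k in range(n):
            if p[k] > 0:
                pred = p[:k] + (p[k] - 1,) + p[k + 1:]
                assert start[pred] < t
    makespan = max(start.values())
    width = {}
    for t in start.values():
        width[t] = width.get(t, 0) + 1  # points executed in step t
    return makespan, max(width.values())


makespan, cells = wavefront_schedule((3, 4))
```

For sizes N_1, ..., N_n this schedule finishes in N_1 + ... + N_n - n + 1 steps, and the second returned value is the number of cells it keeps busy in its widest step.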
The proceedings contain 22 papers from the 1997 International Symposium on Field Programmable Gate Arrays. Topics discussed include: field programmable gate array (FPGA) architectures; FPGA partitioning and synthesis; rapid prototyping and emulation; reconfigurable computing; and FPGA floorplanning and routing.
ISBN: 9780897918015 (print)
This paper describes an application in high-performance signal processing using reconfigurable computing engines. The application is a 250 MHz cross-correlator for radio astronomy and was developed using the fastest available Xilinx FPGAs. We will report experimental results on the operation of CMOS FPGAs at 250 MHz, and describe the architectural innovations required to build a 250 MHz reconfigurable signal processor. Extensions of the technique to a variety of high-performance real-time signal processing algorithms are discussed. The results of this design work provide important clues as to how to improve FPGA architectures to better support real-time signal processing at hundreds of MHz. In particular, direct routing resources between logic elements are critical to preserving high performance. These routing resources need to be symmetric in order to allow for two-way communications between logic elements. Four-way symmetry and regularity would allow for orthogonal transformations of processing elements in a hierarchical fashion. Finally, experimental results indicate that clock buffering is frequently the cause of ultimate failure in speed and performance tests. Wave pipelining techniques may be suitable in clock distribution to improve performance to match that of other elements in the system.
The main features of the radio frequency (RF) hollow cathodes for thin film processing are summarized. The utilization of cylindrical RF hollow cathodes in both the plasma-enhanced chemical vapour deposition (PECVD) and the physical vapour deposition (PVD) of films is reviewed. An example of the high rate PECVD of Si-N films is described in more detail. Gas metastables excited inside the cathode can act as a source of additional heat, thereby enhancing the thermionic electron emission and ionization of the gas. Transition from the glow into a hot cathode arc regime is characterized by changes in the plasma parameters and consequently in the growth of films. Examples for PVD of TiN films are shown. Magnetic focusing in the linear arc discharge source leads to the formation of linear hot zones at the outlet of the parallel-plate cathode and enables the hollow cathode discharges to be scaled up for large area applications. © 1997 Elsevier Science S.A.