Dataflow languages are a natural way to describe the flow of computations in a DSP application. The SILAGE language has been developed for this purpose. It also supports multi-dimensional arrays of signals and, as a natural extension, delayed versions of arrays, e.g. to represent previous frames in video applications. The paper describes new dataflow analysis techniques that support multi-dimensional arrays. The analysis checks single assignment of arrays, checks that each consumption of an indexed signal has a corresponding production, and creates data dependencies between productions and consumptions. These problems are formulated as integer linear programming problems. This formulation is independent of the number of signals in the arrays. Results show very fast running times (<1 s) for problems with more than a hundred nodes.
Real-time cyclic spectral analysis is useful in many applications, but is difficult to achieve because of its computational complexity. This paper studies the distribution of complex multipliers in multiprocessor cyclic spectrum analyzers, with the objective of obtaining computational balance. Computationally balanced implementations efficiently use hardware so that computational bottlenecks are reduced and a smooth flow of data between computational sections of the analyzer is maintained. Tables are presented that give the number of complex multipliers required in each section of the analyzer to obtain computational balance.
Many multiprocessor list scheduling heuristics that account for interprocessor communication delay have been proposed in recent years. However, no uniform comparative study of published heuristics has been performed in almost 20 years. This paper presents the results of a large quantitative study using random, but program-like, input graphs. We found differences in the performance of the various heuristics related to the amount of parallelism in the input graph. We also found that no single heuristic consistently produces the best schedules under all program structures and multiprocessor configurations. Finally, we propose enhancements to existing heuristics as well as our own heuristic, DS, which strikes a good balance between performance and scheduling (i.e. compile) time.
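As an illustration of the general class of heuristics compared in the abstract above, the following is a minimal sketch of a list scheduler that accounts for interprocessor communication delay. It is not the DS heuristic or any specific published one. The priority rule (static level, i.e. the longest computation path to an exit node), the earliest-start processor choice, and the names `list_schedule`, `durations`, `edges` and `num_procs` are assumptions made for this example. The input is assumed to be an acyclic task graph.

```python
def list_schedule(durations, edges, num_procs):
    """Schedule a task DAG on num_procs identical processors.

    durations: {task: execution_time}
    edges:     {(producer, consumer): communication_delay}, where the delay is
               paid only when the two tasks run on different processors.
    Returns {task: (processor, start_time, finish_time)}.
    """
    preds = {t: [] for t in durations}
    succs = {t: [] for t in durations}
    for (u, v) in edges:
        preds[v].append(u)
        succs[u].append(v)

    level = {}

    def static_level(t):
        # Longest computation-only path from t to an exit node.
        if t not in level:
            level[t] = durations[t] + max(
                (static_level(s) for s in succs[t]), default=0
            )
        return level[t]

    proc_free = [0] * num_procs   # time at which each processor becomes idle
    placed = {}
    remaining = set(durations)
    while remaining:
        # A task is ready once all of its predecessors have been scheduled.
        ready = [t for t in remaining if all(p in placed for p in preds[t])]
        # Highest static level first; ties broken by task name.
        task = min(ready, key=lambda t: (-static_level(t), t))
        best = None
        for proc in range(num_procs):
            # Data from a predecessor on another processor arrives late.
            data_ready = max(
                (placed[p][2] + (0 if placed[p][0] == proc else edges[(p, task)])
                 for p in preds[task]),
                default=0,
            )
            start = max(proc_free[proc], data_ready)
            if best is None or start < best[1]:
                best = (proc, start)
        proc, start = best
        finish = start + durations[task]
        placed[task] = (proc, start, finish)
        proc_free[proc] = finish
        remaining.remove(task)
    return placed
```

With large communication delays this sketch tends to keep dependent tasks on one processor, and with negligible delays it spreads independent tasks across processors. This is the kind of sensitivity to graph parallelism that the study reports.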
Studies synchronous multi-rate dataflow graphs to determine the minimal required buffer sizes that still guarantee the construction of a deadlock-free static schedule. We develop a rule to quickly analyze a graph's consistency. A graph is split up into single and parallel paths. Single paths are analysed, as well as the most frequent parallel paths. The results are used in the rapid prototyping environment GRAPE-II in the case where the emulation hardware contains FPGAs, or when memory is critical.
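To make the notion of graph consistency concrete, here is a minimal sketch of the textbook balance-equation check for synchronous dataflow graphs. It is not the rule developed in the paper above. For every edge, the number of tokens produced per schedule period must equal the number consumed. The function name `repetition_vector` and the edge tuple layout are assumptions made for this example, and the graph is assumed to be connected.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+


def repetition_vector(actors, edges):
    """Return the smallest positive integer firing counts that balance every
    edge, or None if the graph is rate-inconsistent.

    actors: list of actor names (assumed to form a connected graph)
    edges:  list of (producer, consumer, tokens_produced, tokens_consumed)
    """
    # Fix the first actor's relative rate and propagate along the edges.
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, dst, prod, cons in edges:
            if src == a:
                other, implied = dst, rate[a] * prod / cons
            elif dst == a:
                other, implied = src, rate[a] * cons / prod
            else:
                continue
            if other not in rate:
                rate[other] = implied
                pending.append(other)
            elif rate[other] != implied:
                return None  # two paths demand different rates
    # Scale the rational rates to the smallest integer solution.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}
```

A consistent graph only guarantees that token counts balance over one period. Deadlock freedom and the minimal buffer sizes the abstract refers to require further analysis that this sketch does not attempt.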
Examples of scientific visualization techniques used for the interactive exploration of very large data sets from supercomputer simulations of fluid flow are presented. Interactive rendering of images from simulations of grids of 2 million or more computational zones is required to drive high-end graphics workstations to their limits with 2-D data. The author presents one such image and discusses interactive steering of 2-D flow simulations, a phenomenon now possible with grids of half a million computational zones. He uses a simulation of compressible turbulence on a grid of 134 million computational zones to set the scale for discussing interactive 3-D visualization techniques. A concept for a gigapixel-per-second video wall, or gigawall, which could be built with present technology to meet the demands of interactive visualization of the data sets that will be produced by the next generation of supercomputers, is discussed.
This paper describes various aspects and results of 2D finite element (FE) modelling of electrostatic fields in 12-electrode capacitive systems for two-phase flow imaging. The capacitive technique relies on changes in capacitances between electrodes (mounted on the outer surface of the flow pipe) due to the change in permittivities of flow components. The measured capacitances between various electrode pairs and the field computation data are used to reconstruct the cross-sectional image of the flow components. FE modelling of the electric field is necessary to optimize design variables and evaluate the system response to the various flow regimes likely to be encountered in practice. Results are presented in terms of normalized capacitances for various flow regimes. The effects of key geometric parameters of the electrode system are also presented and analyzed.
An implementation of the ray-tracing algorithm that is based on the Voxar parallel processing model, which simulates 3D physical phenomena, is discussed. The implementation of a general parallel ray-tracing program, and the implementation of Voxar are reviewed. Results of a performance evaluation of Voxar show that the machine's best points are its efficiency on complex ray-tree images and its parallel animation functionality. Its weak points are the insufficiency of the deadlock prevention strategy, the high cost of the communication system, the sequential generation of the primary rays, and the rigidity of the regular subdivision.
A performance-measurement facility for current leads has been developed as a part of Argonne National Laboratory's program to develop applications for high-temperature superconductors. The facility measures the rate of helium vapor boil-off due to current-lead heat input to liquid helium and the pressure drop across a current lead for a pair of leads operating at currents up to 100 A. The facility's major components are a liquid-helium dewar with low background-heat input; a dewar insert that incorporates the current leads and associated instrumentation or connections for flow, pressure, level, temperature and voltage measurements; and a computer-driven data-acquisition system. Background heat input is low enough that boil-off rates one-tenth that of an optimized conventional lead can be characterized. The facility has been operated with conventional (i.e., vapor-cooled copper) leads, and with leads incorporating high-temperature superconductors at their cold ends. Details of the facility design, construction, and operating experience are presented.
During the past ten years several variants of an analysis technique called program slicing have been developed. Program slicing has applications in maintenance tasks such as debugging, testing, program integration, and program verification, and can be characterized as a type of dependence analysis. A program slice can loosely be defined as the subset of a program needed to compute a certain variable value at a certain program position. A novel method for interprocedural dynamic slicing, which is more precise than interprocedural static slicing methods and is useful for dependence analysis at the procedural abstraction level, was given by M. Kamkar et al. (1992, 1993). It is demonstrated here how interprocedural dynamic slicing can be used to increase the reliability and precision of interprocedural dataflow testing. The work on dataflow testing reported by E. Duesterwald et al. (1992), which is a novel method for dataflow testing through output influences, is generalized.
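The loose definition of a slice given above can be illustrated with a deliberately simplified sketch: a backward dynamic slice over a recorded execution trace that follows data dependences only. It is neither interprocedural nor the method of Kamkar et al., and it ignores control dependences. The trace format and the name `dynamic_slice` are assumptions made for this example.

```python
def dynamic_slice(trace, variable):
    """Return the set of statement ids that influenced `variable` at the end
    of the recorded execution, following data dependences only.

    trace: list of (statement_id, defined_var, used_vars) in execution order.
    """
    relevant = {variable}   # variables whose defining statements we still need
    in_slice = set()
    for stmt, defined, used in reversed(trace):
        if defined in relevant:
            # This execution produced a value we depend on: keep the
            # statement and continue the search through the values it read.
            relevant.discard(defined)
            relevant.update(used)
            in_slice.add(stmt)
    return in_slice
```

Because the slice is computed from one concrete execution, it contains only statements that actually contributed on that run. This is why dynamic slices are generally smaller than static ones.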
Parallel implementation issues of the textured algorithm for solving the optimal routing problem (ORP) in data networks are investigated. The textured model decomposes a large data network into a multi-level structure: each level contains a few subnetworks, and each subnetwork is controlled by a local processor (e.g. an internet gateway). Subnetworks on the same level do not overlap, while subnetworks on different levels overlap partially. Compared with solving the ORP globally, the textured algorithm clearly saves computation time and has better precision, since parallel computation is applied to smaller-scale subproblems at each level. On the other hand, synchronization overhead among local processors needs to be addressed. It is shown that, due to the characteristics of the textured algorithm, its synchronization overhead can be kept constant as the size of the network increases, since data need only be exchanged among neighboring processors.