Dataflow architectures can achieve much better performance and higher efficiency than general-purpose cores, approaching the performance of a specialized design while retaining programmability. However, advanced application scenarios place higher demands on the hardware in terms of cross-domain and multi-batch processing. In this article, we propose a unified scale-vector architecture that can work in multiple modes and adapt efficiently to diverse algorithms and requirements. First, a novel reconfigurable interconnection structure is proposed, which can organize execution units into different cluster topologies to accommodate different degrees of data-level parallelism. Second, we decouple threads within each DFG node into consecutive pipeline stages and provide architectural support. By time-multiplexing these stages, dataflow hardware can achieve much higher utilization and performance. In addition, the task-based programming model can exploit multi-level parallelism and deploy applications efficiently. Evaluated on a wide range of benchmarks, including digital signal processing algorithms, CNNs, and scientific computing algorithms, our design attains up to 11.95x energy efficiency (performance-per-watt) improvement over a GPU (V100), and 2.01x energy efficiency improvement over state-of-the-art dataflow architectures.
The proceedings include 28 papers, 27 of which are indexed separately. Topics covered include advanced techniques of real-time signal processing in various fields, such as space information systems, radar data processing, missile guidance, underwater acoustic imaging, and wideband integrated optics. Hardware architectures and algorithms for real-time signal processing are also considered. Several presentations are devoted to digital approaches and analog implementations of real-time signal processing.
ISBN (print): 0819454974
A streak camera is a recording instrument in which a spatial image is swept in time, creating a spatial-temporal image on a charge-coupled device (CCD). Traditional analysis of captured image data has used uniform bins as sampling points, in which a block of CCD pixel readouts is summed to give one reading. Equivalently, simple area moving averages are applied concurrently while sampling, and high-frequency content is reduced. To solve this problem, we use a peak-value sampling procedure, based on the view from photoelectron statistics. After background correction, maximum values in spatial dimensions are selected to obtain time series data. A DSP filter is then applied and optimized for this time series. A Welch-algorithm fast Fourier transform is applied to obtain power spectra. Segmented cumulative spectra are then calculated for global statistics and related to time-domain fluctuations. Self-similarity at different sweeping time-scales is used to recognize CCD pattern noise. Sinusoidal pattern noise is automatically corrected by peak-value sampling. Computational results show that time-frequency analysis using the peak-value sampling algorithm and similar variants is far more effective in discovering high-frequency oscillatory noise than traditional uniform binned sampling. We have applied this algorithm to analyze data produced by a 4096x4096 CCD streak camera illuminated with a macro pulse laser. High-frequency oscillations in the 6 to 10 GHz region were found in laser spectra. Spatial-temporal oscillations in this range are difficult to diagnose with conventional optoelectronic detectors on a per-shot basis. This work has led to improvement of laser design.
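The contrast between the two sampling schemes described in this abstract can be illustrated with a minimal sketch. It assumes the image is stored as a list of rows, one row per time step and one column per spatial pixel, and that a single constant background level is subtracted; the function names and the toy data are hypothetical and are not taken from the paper.

```python
def peak_value_series(image, background=0.0):
    """For each time row, subtract the background and keep the spatial maximum."""
    return [max(pixel - background for pixel in row) for row in image]


def binned_series(image, bin_rows, background=0.0):
    """Sum background-corrected pixels over blocks of `bin_rows` consecutive rows."""
    series = []
    for start in range(0, len(image) - bin_rows + 1, bin_rows):
        block = image[start:start + bin_rows]
        series.append(sum(pixel - background for row in block for pixel in row))
    return series


# Toy image (assumed data): a trace in the middle column whose intensity
# alternates from one time row to the next, on a background level of 1.
image = [
    [1, 15, 1],
    [1, 5, 1],
    [1, 15, 1],
    [1, 5, 1],
]

peaks = peak_value_series(image, background=1.0)
binned = binned_series(image, bin_rows=2, background=1.0)
```

In this toy case the peak-value series keeps the row-to-row alternation, while summing two rows per bin produces a constant series, which is the loss of high-frequency content the abstract attributes to binned sampling.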
Investigating a number of different integral transforms uncovers distinct patterns in the type of scale-based convolution theorems afforded by each. It is shown that scaling convolutions behave in quite a similar fashion to translational convolution in the transform domain, such that the many diverse transforms have only a few different forms for convolution theorems. The hypothesis is put forth that the space of integral transforms is partitionable based on these forms.
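As a textbook illustration of the parallel this abstract draws (the abstract itself does not say which transforms are studied, so the choice of the Fourier and Mellin transforms here is an assumption), translational convolution under the Fourier transform and scale convolution under the Mellin transform both map to a plain product in the transform domain:

```latex
% Translational convolution and the Fourier transform
(f * g)(t) = \int_{-\infty}^{\infty} f(u)\, g(t-u)\, du,
\qquad
\mathcal{F}\{f\}(\omega) = \int_{-\infty}^{\infty} f(t)\, e^{-i\omega t}\, dt,
\qquad
\mathcal{F}\{f * g\}(\omega) = \mathcal{F}\{f\}(\omega)\, \mathcal{F}\{g\}(\omega).

% Scale (multiplicative) convolution and the Mellin transform
(f \vee g)(x) = \int_{0}^{\infty} f(y)\, g\!\left(\frac{x}{y}\right) \frac{dy}{y},
\qquad
\mathcal{M}\{f\}(s) = \int_{0}^{\infty} f(x)\, x^{s-1}\, dx,
\qquad
\mathcal{M}\{f \vee g\}(s) = \mathcal{M}\{f\}(s)\, \mathcal{M}\{g\}(s).
```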
Fault tolerance is increasingly important as society has come to depend on computers for more and more aspects of daily life. The current concern about the Y2K problem indicates just how much we depend on accurate computers. This paper describes work on time-shared TMR, a technique used to provide arithmetic operations that produce correct results in spite of circuit faults.
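The general idea of redundancy in time followed by a majority vote can be sketched as below. This is only a generic illustration, assuming the same adder is invoked in three passes and a fault disturbs a single pass; the abstract does not describe the paper's actual scheme, including how it handles faults that persist across passes. The names `majority_vote`, `time_shared_add`, and `faulty_adder` are hypothetical.

```python
def majority_vote(a, b, c):
    """Bitwise majority of three integers: each output bit matches at least two inputs."""
    return (a & b) | (a & c) | (b & c)


def time_shared_add(x, y, adder, width=8):
    """Run one adder three times in sequence and vote on the three results."""
    mask = (1 << width) - 1
    results = [adder(x, y, pass_index) & mask for pass_index in range(3)]
    return majority_vote(*results)


def faulty_adder(x, y, pass_index):
    """Assumed fault model: bit 2 of the sum is flipped during pass 1 only."""
    total = x + y
    if pass_index == 1:
        total ^= 0b100
    return total


voted = time_shared_add(23, 42, faulty_adder)
```

Because only one of the three passes is corrupted, the vote restores the correct sum; a disturbance that hit two passes in the same bit position would defeat this simple voter.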
Wireless sensor networks present a number of challenges to system designers, notably the efficient use of limited resources such as bandwidth and energy. One way these challenges can be addressed is through the application of signal processing principles in the design, deployment, and operation of sensor networks. After a discussion of general issues arising in this context, this talk will describe several recent developments in this area. These include the effects of receiver choice on energy efficiency, collaborative beam-forming, sensor scheduling, and distributed learning. Some of the work described in this talk can be found in Refs. 1-8.
The Wigner Distribution Function (WDF) is a time-frequency descriptor capable of tracking the time-varying second order statistics in a signal. In this paper, we characterize linear systems in terms of the WDFs of the...
ISBN (print): 0819416207
Algorithm-based fault tolerance (ABFT) has been proposed as a cost-effective approach to concurrent error detection. So far, the application of ABFT has been limited to computationally intensive applications that lend themselves easily to high-level fault modeling. In this paper we extend the application of ABFT to non-computationally intensive applications. To that end, we first develop a fault model for such systems. Based on the fault model, we develop ABFT schemes for a set of graph-theoretic as well as set-theoretic problems. Application of the new schemes is illustrated with examples.
ISBN (print): 0819416207
In this paper we consider cyclostationary signal processing techniques implemented via acousto-optics (AO). Cyclic processing methods are reviewed and motivated, including the cyclic correlation and the cyclic spectrum. We show how a 1D AO spectrum analyzer can be used to detect the presence, and estimate the value, of cycle frequencies. The cyclic correlation can then be computed at cycle frequencies of interest using a 1D time-integrating correlator. Next we consider the problem of computing the (2D) cyclic correlation for all cycle frequencies and lags simultaneously. This is accomplished via an AO triple-product processor, configured in a manner similar to that used for ambiguity function generation. The cyclic spectrum can be obtained in a post-processing step by Fourier transforming the cyclic correlation in one dimension. We then consider higher-order extensions of the cyclic correlation and show how a 2D slice of the 3D cyclic triple-correlation can be computed using an AO four-product processor.
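For readers unfamiliar with the cyclic correlation, the following is a minimal digital sketch of a sample estimate. It assumes the asymmetric-lag definition R(alpha, tau) = mean over n of x[n + tau] * conj(x[n]) * exp(-j 2 pi alpha n), with alpha in cycles per sample; the paper computes this quantity optically, and the function name and test signal here are hypothetical.

```python
import cmath
import math


def cyclic_correlation(x, alpha, lag=0):
    """Estimate the cyclic correlation of sequence x at cycle frequency alpha.

    Assumed definition (asymmetric lag, lag >= 0):
        R(alpha, lag) = mean_n x[n + lag] * conj(x[n]) * exp(-2j*pi*alpha*n)
    """
    count = len(x) - lag
    total = 0j
    for n in range(count):
        product = x[n + lag] * complex(x[n]).conjugate()
        total += product * cmath.exp(-2j * math.pi * alpha * n)
    return total / count


# Assumed test signal: a real sinusoid at f0 = 1/8 cycles per sample.
# Its lag product contains a component at 2 * f0, so alpha = 1/4 is a cycle frequency.
N = 64
signal = [math.cos(2 * math.pi * n / 8) for n in range(N)]

at_cycle = abs(cyclic_correlation(signal, alpha=0.25))
off_cycle = abs(cyclic_correlation(signal, alpha=0.125))
```

Scanning `alpha` over a grid and looking for peaks in the magnitude is one digital analogue of the detection step the abstract assigns to the 1D spectrum analyzer.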
ISBN (print): 0819445584
Speech is metered if the stresses occur at a nearly regular rate. Metered speech is common in poetry, and it can occur naturally when the speaker is spelling a word or reciting words or numbers from a list. In radio communications, the CQ request, call sign, and other codes are frequently metered. In tactical communications and air traffic control, location, heading, and identification codes may be metered. Moreover, metering may be expected to survive even in HF communications, which are corrupted by noise, interference, and mistuning. For this environment, speech recognition and conventional machine-based methods are not effective. We describe time-frequency methods which have been adapted successfully to the problem of mitigation of HF signal conditions and detection of metered speech. These methods are based on modeled time and frequency correlation properties of nearly harmonic functions. We derive these properties and demonstrate a performance gain over conventional correlation and spectral methods. Finally, for HF single-sideband (SSB) communications, carrier mistuning, interfering signals such as manual Morse, and fast automatic gain control (AGC) must be addressed. We demonstrate simple methods which may be used to blindly mitigate mistuning and narrowband interference, and effectively invert the fast automatic gain function.