Some necessary background in speech recognition and window systems is given, with an analysis of how they might be combined. Xspeak, a navigation application, is described, along with its operation and a field study of its use. With Xspeak, window navigation tasks usually performed with a mouse can be controlled by voice. An improved version, Xspeak II, which incorporates a language for translating spoken commands, is introduced.
An algorithm is presented for identifying state-space models from frequency-domain data. The main advantage of this approach is that it avoids windowing distortions associated with other frequency-domain methods. Other advantages are that an arbitrary frequency weighting can be introduced to shape the estimation error, and the system order can be overspecified. The numerical properties are demonstrated on real physical data taken on a complex flexible structure, leading to the successful identification of a multivariable (four-input/three-output) 100-state model over a bandwidth of 100 Hertz. The results indicate that the algorithm would be useful in applications requiring the accurate identification of high-order systems over wide bandwidths.
Using a workstation cluster for parallel program development requires consideration of various factors to optimise the mapping of the algorithm to the characteristics of the environment. In this paper we present a new analysis and verification of well-known ideas in parallel programming research of specific importance to both the use and design of workstation cluster computing systems. We define a new performance measure related to memory resource utilisation and show how redundant memory usage can lead to poor memory utilisation of the cluster. We also present analytical and experimental evidence that the pool-of-tasks paradigm can lead to significantly improved speedup over series-parallel algorithms, especially when considering equivalent computational and communication requirements. The effect of load balancing on the series-parallel and pool-of-tasks algorithms is examined, and our analysis and experimental results confirm not only that the pool-of-tasks algorithms are more robust to load imbalances but also that the effect of the imbalance is mitigated when more workstations are used. © 1998 John Wiley & Sons, Ltd.
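The load-balancing argument in the abstract above can be illustrated with a small sketch. It is not the paper's model or experiment: the task durations, the two-worker setup, the equal-block static split, and the function names `static_makespan` and `pool_makespan` are all assumptions chosen for illustration, and communication costs are ignored.

```python
import heapq

def static_makespan(task_times, workers):
    """Static split (assumed stand-in for a series-parallel mapping):
    tasks are divided into equal-sized contiguous blocks, one per worker."""
    block = -(-len(task_times) // workers)  # ceiling division
    loads = [sum(task_times[i:i + block])
             for i in range(0, len(task_times), block)]
    return max(loads)

def pool_makespan(task_times, workers):
    """Pool of tasks: each task goes to whichever worker becomes free first."""
    free_at = [0] * workers  # min-heap of times at which each worker is free
    heapq.heapify(free_at)
    for t in task_times:
        start = heapq.heappop(free_at)      # earliest available worker
        heapq.heappush(free_at, start + t)  # worker is busy until start + t
    return max(free_at)

# Hypothetical workload: one long task creates a load imbalance.
tasks = [9, 1, 1, 1, 1, 1, 1, 1]
print("static split makespan:", static_makespan(tasks, 2))
print("pool-of-tasks makespan:", pool_makespan(tasks, 2))
```

In this toy workload the static split leaves one worker with the long task plus three short ones, while the pool lets the other worker absorb all the short tasks, so the pool finishes earlier.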
Frequency analysis using the DFT (discrete Fourier transform) or its faster computational technique, the FFT, is an obvious choice across the image and signal processing domain, where spectral leakage and the picket fence effect are major problems. Earlier works describe software and ROM-based implementations of windowing functions to overcome these problems during spectral analysis. In this work we propose a CORDIC (co-ordinate rotation digital computer)-based unified windowing architecture to reduce spectral leakage, the picket fence effect and resolution problems, with different tradeoffs between mainlobe and sidelobe in the frequency domain. A parallel-pipelined architecture has been adopted for the present design to ensure high throughput for real-time applications, with a latency equal to twice the CORDIC length plus three extra cycles. This unified architecture combines linear CORDIC and circular CORDIC with a FIFO and a few multiplexers, where the choice of window and its length are user defined. We have synthesised this architecture in 0.18 μm CMOS technology using Synopsys Design Analyser. The total estimated dynamic power was found to be 350 mW at an operating frequency of 125 MHz, with a total cell area of approximately 11 mm².
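As a software illustration of why windowing helps with spectral leakage (not the CORDIC hardware architecture described above), the following sketch compares the DFT magnitude far from the signal frequency with and without a Hann window. The signal length, the 10.5-cycle test tone, the choice of the Hann window, and the helper names `hann` and `dft_magnitude` are assumptions made for this example.

```python
import cmath
import math

def hann(n_samples):
    """Periodic Hann window: w[n] = 0.5 - 0.5*cos(2*pi*n/N)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * n / n_samples)
            for n in range(n_samples)]

def dft_magnitude(x, k):
    """Magnitude of DFT bin k, computed directly from the definition."""
    n_samples = len(x)
    acc = sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
              for n in range(n_samples))
    return abs(acc)

N = 64
# 10.5 cycles in the record: the tone falls between bins, so it leaks.
signal = [math.sin(2 * math.pi * 10.5 * n / N) for n in range(N)]
windowed = [s * w for s, w in zip(signal, hann(N))]

far_bin = 25  # a bin well away from the tone
leak_rect = dft_magnitude(signal, far_bin)
leak_hann = dft_magnitude(windowed, far_bin)
print("leakage without window:", leak_rect)
print("leakage with Hann window:", leak_hann)
```

The Hann window trades a wider mainlobe for much lower sidelobes, which is the kind of mainlobe/sidelobe tradeoff the abstract refers to.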
Four basic algorithms for implementing distributed shared memory are compared. Conceptually, these algorithms extend local virtual address spaces to span multiple hosts connected by a local area network, and some of them can easily be integrated with the hosts' virtual memory systems. The merits of distributed shared memory and the assumptions made with respect to the environment in which the shared memory algorithms are executed are described. The algorithms are then described, and a comparative analysis of their performance in relation to application-level access behavior is presented. It is shown that the correct choice of algorithm is determined largely by the memory access behavior of the applications. Two particularly interesting extensions of the basic algorithms are described, and some limitations of distributed shared memory are noted.
This paper investigates various processor management techniques for improving the performance of mesh-connected multicomputers. Unlike almost all prior work, where the focus was on improving the submesh recognition ability of the processor allocation algorithms, this research examines other alternatives to improve system performance beyond what is achievable with the usually assumed first-come-first-served (FCFS) scheduling and any allocation. First, we use the smallest job first (SJF) policy to improve the spatial parallelism in a mesh. Next, we introduce a generic processor management scheme called multitasking and multiprogramming (M²). Then, an M² policy for mesh-connected multicomputers called virtual mesh (VM) is proposed and analyzed. The proposed VM scheme allows multiprogramming of jobs on several VMs. Finally, a novel approach called limit allocation is used for job allocation. With this scheme, a job (submesh) size is reduced if the job cannot be allocated. The objective here is to reduce the job waiting time and hence improve the overall performance. While all three approaches are viable alternatives for reducing the average job response time under various workloads, the VM and limit allocation techniques are especially attractive because they provide some additional features. The VM scheme brings in the concept of time-sharing execution for better efficiency, and limit allocation shows how job size restriction can be beneficial for performance and fault tolerance in a mesh topology. Moreover, the limit allocation scheme using even the simplest allocation policy can outperform any other approach. © 2001 Published by Elsevier Science B.V.
The applications of discrete-time signal-processing techniques, such as windowing and filtering, for the purpose of implementing accurate excitation schemes in the finite-difference time-domain (FDTD) method are demonstrated. The effects of smoothing windows of various lengths and digital lowpass filters of various bandwidths and characteristics on finite-source excitations of the FDTD computational domain are investigated. Both single-frequency sinusoidal signals and multifrequency arbitrary signals are considered.
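A minimal sketch of the idea of smoothing a sinusoidal source with a window is given below. The abstract does not say which windows were studied, so the raised-cosine turn-on ramp, the function name `ramped_sine`, and the numerical values (1 GHz tone, 10 ps time step, 200-step ramp) are assumptions for illustration only.

```python
import math

def ramped_sine(n_steps, freq, dt, ramp_len):
    """Sinusoidal source samples with a raised-cosine turn-on of ramp_len steps.

    The envelope rises smoothly from 0 to 1 over the first ramp_len steps
    and stays at 1 afterwards, avoiding an abrupt switch-on of the source.
    """
    samples = []
    for n in range(n_steps):
        if n < ramp_len:
            envelope = 0.5 * (1.0 - math.cos(math.pi * n / ramp_len))
        else:
            envelope = 1.0
        samples.append(envelope * math.sin(2.0 * math.pi * freq * n * dt))
    return samples

# Hypothetical parameters: 100 samples per period, ramp lasting two periods.
source = ramped_sine(n_steps=400, freq=1e9, dt=1e-11, ramp_len=200)
print(source[:5])
```

A longer ramp confines the excitation spectrum more tightly around the carrier at the cost of a longer start-up transient, which is one reason the window length matters.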
In the classical scheduling theory it is widely assumed that any task requires for its processing only one processor at a time. In this paper the problem of deterministic scheduling of tasks requiring for their processing more than one processor at a time, i.e., a constant set of dedicated processors, is analyzed. Schedule length is assumed to be a performance measure. Tasks are assumed to be preemptable and independent. Low order polynomial algorithms for simple cases of the problem are given. Then a method to solve the general version of the problem for a limited number of processors is presented, while the case of an arbitrary number of processors is known to be NP-hard. Finally, a version of the problem, where besides processors every task can also require additional resources, is considered.
Suggests that transport protocols which use simple, low-cost mechanisms provide fast general-purpose cluster communication over protected virtual networks, while highlighting a design of one such protocol. What is needed for a cluster protocol; Details on the design of a cluster protocol; Information on virtual networks.
Technological improvements in reconfigurable hardware, together with applications' increasing demand for flexibility and speed, make run-time reconfigurable devices an interesting piece in the design of computing systems. Operating systems (OS) have to be extended with functionality that allows them to manage field-programmable gate arrays (FPGAs) efficiently. A constant-complexity algorithm is presented that would be a part of such an extended OS. The algorithm decides the scheduling and placement of arriving tasks with real-time constraints in the FPGA device. It divides the FPGA area into four partitions of different sizes. Each partition has an associated queue where the hardware manager places each arriving task depending on its size, shape and real-time parameters such as deadline requirements. The algorithm may change the queue selection policy, the partition strategies and the sizes or number of partitions at run-time in order to adapt itself to special characteristics of task profiles, taking into account task sizes and execution times for the tasks in each queue. The authors present experimental results showing that the algorithm is as competitive as other, more complex 'area-greedy' algorithms for real applications with real-time deadline constraints.