An investigation into the performance evaluation of sequential and parallel computing has been carried out. Performance metrics, on the basis of maximum efficiency, have been proposed for parallel architectures. These apply to both homogeneous and heterogeneous architectures and are consistent with those of traditional architectures. They have been verified through implementation of several algorithms on uni- and multi-processor architectures. Based on the proposed concept of speedup, a task allocation strategy for heterogeneous architectures has been developed. It has been demonstrated that, with such a strategy, the efficiency achieved with a heterogeneous architecture is close to its maximum value. Moreover, it has been shown that to achieve maximum efficiency a large proportion of tasks must be allocated to the faster processor of the architecture. However, due to the disparity in capabilities of the processors, communication overhead becomes a dominant factor in the implementation. Thus, to obtain a better task allocation and minimum communication overhead, high-performance processors must be selected carefully. Compiler efficiency and code optimisation have been investigated, showing that these affect the performance of the processors in real-time applications. The code optimisation experiments have also shown that the regularity or irregularity of the algorithm, as well as the code itself, affects the performance of the processor. It has accordingly been demonstrated that different processor capabilities, communication overhead and an inappropriate task allocation can dramatically affect the performance of the application. On the other hand, poor processor performance can result from the regularity or irregularity of the application, the compiler, and the compiler's optimisation levels. The applications considered have varying computing requirements due to their different characteristics and different sizes. The heterogeneity present in these architectures
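The observation that most tasks should go to the faster processor can be illustrated with a minimal sketch. It assumes each processor has a fixed speed in tasks per second, that tasks are identical and independent, and that communication overhead is ignored; the function names `allocate_tasks` and `makespan` are hypothetical and this is not necessarily the allocation strategy proposed in the abstract.

```python
def allocate_tasks(total_tasks, speeds):
    """Split total_tasks across processors in proportion to their speeds.

    speeds: assumed relative speeds (tasks per second), one per processor.
    Returns a list of integer task counts that sums to total_tasks.
    """
    total_speed = sum(speeds)
    exact = [total_tasks * s / total_speed for s in speeds]
    counts = [int(x) for x in exact]
    leftover = total_tasks - sum(counts)
    # Hand out the remaining tasks to the largest fractional shares.
    order = sorted(range(len(speeds)),
                   key=lambda i: exact[i] - counts[i],
                   reverse=True)
    for i in order[:leftover]:
        counts[i] += 1
    return counts


def makespan(counts, speeds):
    """Completion time: the slowest-finishing processor determines it."""
    return max(n / s for n, s in zip(counts, speeds))


speeds = [3.0, 1.0]                      # one fast and one slow processor
proportional = allocate_tasks(100, speeds)
even = [50, 50]
print(proportional, makespan(proportional, speeds))
print(even, makespan(even, speeds))
```

Under these assumptions the proportional split finishes in half the time of the even split, because the slow processor is no longer the bottleneck. Real systems would also need to account for the communication overhead the abstract highlights.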
The paper presents an interactive segmentation system that uses a parallel processing architecture. Poor contrast, variable tissue properties and complex-shaped structures make the isolation of meaningful regions of interest difficult. The interactive approach uses the human user's knowledge base to assist in image segmentation. The measurement of regions of interest enables the resultant output image to be quantified for clinical purposes. The graphical user interface, developed under Microsoft Windows, incorporates a mouse-driven interactive display. A transputer-based parallel processing engine is provided for the computationally intensive tasks of the system. These modules of the system communicate with each other using the Windows Dynamic Data Exchange (DDE) model.
This paper discusses the possibilities for parallel processing of the Full- and Limited-Memory BFGS training algorithms, two powerful second-order optimization techniques used to train Multilayer Perceptrons. The step size and gradient calculations are identified as the critical components in both. The matrix calculations in the Full-Memory algorithm are also shown to be significant for larger problems. Various strategies are considered for parallelisation, the best of which is implemented on PVM and transputer-based architectures. The generation of a neural predictive model for a nonlinear chemical plant is used as a control case study to assess parallel performance in terms of achievable speed-up. The transputer implementation is found to give excellent speed-ups, but the size of problem that can be trained is limited by memory constraints. On the other hand, speed-ups achievable with the PVM implementation are much poorer because of inefficient communication, but memory does not pose a problem.
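One common way to parallelise a gradient calculation is to partition the training data, compute a partial gradient per worker, and sum the results. The sketch below shows that pattern only; it assumes a one-weight least-squares model as a stand-in for a Multilayer Perceptron, and the names `partial_gradient` and `parallel_gradient` are hypothetical. It is not claimed to be the strategy the paper selected.

```python
from concurrent.futures import ThreadPoolExecutor


def partial_gradient(w, chunk):
    """Gradient of sum(0.5 * (w*x - y)**2) over one data chunk, scalar weight w."""
    return sum((w * x - y) * x for x, y in chunk)


def parallel_gradient(w, data, n_workers):
    """Split data into n_workers chunks and sum the partial gradients."""
    chunks = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        parts = list(pool.map(lambda c: partial_gradient(w, c), chunks))
    return sum(parts)


data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
print(parallel_gradient(1.0, data, n_workers=2))
print(partial_gradient(1.0, data))  # serial reference value
```

Python threads are used here only to show the structure; they would not give a real speed-up for this arithmetic. On a message-passing system each chunk would live on its own processor, and the final sum is the communication step whose cost the abstract identifies as the limiting factor for PVM.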
The nature of several algorithms from the active vibration control domain is discussed and the issues of algorithm parallelisation and mapping are introduced. The algorithms have been implemented on a network of C40s and their real-time performance evaluated. In all cases, a single C40 has shown enough processing power for the real-time implementation of the algorithms. The multiprocessor implementation did not offer very impressive performance compared with the ideal speedup, owing to communication overhead, run-time memory management and some other unpredictable features of the architecture. However, the real-time performance obtained with the uni-processor and multi-processor architectures is quite reasonable.
This paper describes a fully decentralized, transputer-based architecture for data fusion problems. This architecture takes the form of a network of sensor nodes, each with its own processing facility, which together do not require any central processor or any central communication facility. In this architecture, computation is performed locally and communication occurs between any two nodes. Such an architecture has many desirable properties, including robustness to sensor failure and flexibility with respect to the addition or loss of one or more sensors. We first describe the decentralized data fusion algorithm and some of its consequences. We then describe a number of implementations of this algorithm: on a vision-based surveillance network, on a large process control rig comprising some 150 sensors, and on a modular mobile robot.
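The idea of fusing without a central processor can be sketched in a few lines. The example assumes a fully connected network in which every node measures the same scalar quantity with known noise variance and exchanges its measurement in information form (value divided by variance, and inverse variance). This is a standard textbook formulation used for illustration, not necessarily the algorithm described in the paper, and the names `Node` and `run_network` are hypothetical.

```python
class Node:
    """One sensor node holding a local scalar measurement in information form."""

    def __init__(self, measurement, variance):
        self.info_vec = measurement / variance  # information "state"
        self.info_mat = 1.0 / variance          # information "matrix" (scalar here)
        self.inbox = []

    def broadcast(self, peers):
        """Send the local information contribution to every other node."""
        for peer in peers:
            if peer is not self:
                peer.inbox.append((self.info_vec, self.info_mat))

    def fuse(self):
        """Add local and received contributions; return (estimate, variance)."""
        y = self.info_vec + sum(v for v, _ in self.inbox)
        big_y = self.info_mat + sum(m for _, m in self.inbox)
        return y / big_y, 1.0 / big_y


def run_network(readings):
    """readings: list of (measurement, variance). Returns each node's fused result."""
    nodes = [Node(z, r) for z, r in readings]
    for node in nodes:
        node.broadcast(nodes)
    return [node.fuse() for node in nodes]


print(run_network([(10.0, 1.0), (12.0, 1.0), (11.0, 0.5)]))
# Losing one sensor only removes its contribution; the rest still agree.
print(run_network([(10.0, 1.0), (12.0, 1.0)]))
```

Because fusion in this form is a plain sum, every node reaches the same estimate independently, and adding or dropping a sensor changes only which terms appear in the sum.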
The proceedings contain 10 papers. Some of the specific topics discussed are: animation using accelerated ray tracing on a hypercube; SPARC-GAP, a parallel genetic algorithm processing; a transputer-based parallel DSP (digital signal processing) environment; and parallel architecture for real-time control.
A parallel color image processing system for grading fresh agricultural products is described in this paper. The system uses a transputer array to process data at a rate of 80 objects per second. It is also provided with a debug buffer, which allows uninterrupted system operation. Some of the system's other features are a data magazine module, a chaincoder module, a task allocation module and a postal service module.
A distributed computer containing DSP devices as the computational elements is presented. The computer is based on multiple processor modules (TMS320CXX series), interconnected via a time-division-multiplexed transputer link channel to a central control unit. Some simulation performance results for the communication mechanism are also presented.
The authors present a method of setting up image processing chains for applications with strict temporal constraints. The method is based on the concept of a modular parallel vision machine, also called an evaluation ...