ISBN: 3540641408 (print)
This paper deals with currently used algorithms for the reconstruction of functional images, which can run for 60 hours or more on a single workstation and process hundreds of megabytes of data. A parallel implementation of a sophisticated iterative algorithm, with high efficiency and almost linear speedup, is given, and its applicability to other reconstruction methods is shown. Whereas running this application on a high-performance parallel computer is straightforward, more issues arise under production conditions as enforced by the daily routine in a clinic. We address fault-tolerant parallelization and batch queuing of programs that are typically written in a high-level language such as IDL or MATLAB, and show how load balancing can preserve the ownership of workstations in a network of workstations (NOW) that is used for distributed computing during office hours.
ISBN: 0818692162 (print)
This work reports the development of a versatile framework for characterizing and analyzing computer vision techniques and their applications to biological shapes, with attention focused on neural cells. The proposed framework has been implemented within the Σynergos system, a powerful imaging laboratory that includes, among other features, tools for performance assessment of computer vision techniques, image databases, real-time processing using distributed systems, and an interface with the Internet. The motivations for developing such a framework are: (i) the importance of biological shape analysis; (ii) its potential as an effective tool for the systematic assessment of image processing and analysis techniques; and (iii) the possibility of conducting extensive characterizations of biological shapes. The paper describes an experiment to assess multiscale shape features for complexity characterization, which have been adopted for the classification of two types of ganglion neural cells (cat), namely alpha and beta. This experiment involves: (1) a training stage in which the k-means clustering algorithm learns the prototypes of each class from the database; (2) classification of the neurons in the database; (3) comparison of the classification results with the original classes; and (4) determination of the number of misclassifications. A genetic algorithm is used as a means of effectively investigating the N-dimensional spaces defined by the parameter configurations.
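The four experiment steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the two-dimensional feature vectors, the deterministic farthest-point initialisation, and the majority-vote mapping from clusters to classes are all assumptions made for the sketch, and the data are invented.

```python
def dist2(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def init_centroids(points, k):
    """Deterministic farthest-point initialisation (an assumption of this sketch)."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    return centroids

def assign(points, centroids):
    """Step 2: label each point with the index of its nearest prototype."""
    return [min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))
            for p in points]

def kmeans(points, k, iters=100):
    """Step 1: learn k class prototypes (centroids) from the database."""
    centroids = init_centroids(points, k)
    for _ in range(iters):
        labels = assign(points, centroids)
        updated = []
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                updated.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                updated.append(centroids[c])  # keep an empty cluster's prototype
        if updated == centroids:
            break
        centroids = updated
    return centroids

def count_misclassifications(labels, truth, k):
    """Steps 3 and 4: map each cluster to its majority class and count disagreements."""
    errors = 0
    for c in range(k):
        classes = [t for l, t in zip(labels, truth) if l == c]
        if classes:
            majority = max(sorted(set(classes)), key=classes.count)
            errors += sum(1 for t in classes if t != majority)
    return errors

# Hypothetical shape-feature vectors; the last "alpha" cell lies near the beta group.
features = [(1.0, 1.2), (1.1, 0.9), (0.9, 1.0), (1.2, 1.1),
            (3.0, 3.1), (2.9, 3.3), (3.2, 2.8), (3.1, 3.0), (2.8, 2.9)]
classes = ["alpha"] * 4 + ["beta"] * 4 + ["alpha"]

prototypes = kmeans(features, k=2)
predicted = assign(features, prototypes)
n_errors = count_misclassifications(predicted, classes, k=2)
```

Because k-means assigns arbitrary cluster indices, some mapping from clusters to the original classes is needed before errors can be counted; the majority vote used here is one simple choice.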
The rapid growth of both the size of remote sensing data and the number of users in this field requires systems that are easy to use, platform independent, and powerful. Currently, many users are not able to process or even access data the way they would like to. Utilizing upcoming technologies such as the WWW, Java, and CORBA, we propose a distributed system that connects users, databases, and method bases. The method bases help users find an appropriate sequence of methods for processing and, by incorporating a broker, schedule execution onto fast remote processing units (back-ends). We discuss design considerations concerning the interaction of the back-end with other system components, and strategies for effective job distribution.
This paper deals with the parallel implementation of reconstruction algorithms for functional imaging on a network of workstations (NOW). The algorithms that provide the best image quality are not used in clinical routine because they have a runtime of up to 60 hours on real clinical data sets of several hundred megabytes. After giving an overview of currently used image reconstruction algorithms, we describe a general parallel implementation of these algorithms with almost linear speedup and high efficiency, which cuts the runtime down to a feasible limit. The high load caused by the parallel application conflicts with the predominantly interactive usage of clinical workstations. We therefore address load balancing with an application-oriented, adaptive mechanism in order to preserve the ownership of workstations. Furthermore, we explain how MATLAB- and IDL-based applications can be integrated with a conventional distributed queuing system (DQS) and why this significantly improves usage in clinical routine.
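To make the idea of ownership-preserving load balancing concrete, one possible policy is sketched below. It is an assumption for illustration only and is not the adaptive mechanism of the paper: work chunks are shared among workstations in proportion to their spare capacity, and any workstation whose interactive load exceeds a threshold receives nothing. The host names, the percentage load scale, and the 50% threshold are hypothetical.

```python
def distribute_chunks(n_chunks, load_percent, max_load=50):
    """Split n_chunks among hosts in proportion to spare capacity.

    load_percent maps a host name to its current interactive load (0 to 100).
    Hosts at or above max_load are left entirely to their owners.
    Integer arithmetic keeps the split exact (largest-remainder method).
    """
    spare = {host: max(0, max_load - load) for host, load in load_percent.items()}
    total = sum(spare.values())
    if total == 0:
        # Every workstation is busy with interactive work: schedule nothing.
        return {host: 0 for host in load_percent}
    shares = {host: n_chunks * s // total for host, s in spare.items()}
    leftover = n_chunks - sum(shares.values())
    # Hand the remaining chunks to the hosts with the largest remainders.
    by_remainder = sorted(spare, key=lambda h: (n_chunks * spare[h]) % total,
                          reverse=True)
    for host in by_remainder[:leftover]:
        shares[host] += 1
    return shares

# Hypothetical snapshot of three clinical workstations.
plan = distribute_chunks(10, {"ws1": 10, "ws2": 30, "ws3": 90})
```

An adaptive scheme would re-run such a split whenever the measured loads change, so that a workstation reclaimed by its owner stops receiving new chunks.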
This paper describes the architecture and operating system of NEC's new parallel computer, Cenju-4, and gives an evaluation of it. The major features of Cenju-4 are: a) A parallel memory architecture that encompasses distributed shared memory and user-level inter-processor communication. b) A scalable system from 8 nodes to 1,024 nodes. Using the powerful RISC processor VR10000 (200 MHz) from MIPS Technologies, Inc., a Cenju-4 system can be configured with 8 to 1,024 nodes and extended flexibly as demand arises. c) A flexible micro-kernel operating system. Since the system adopts a micro-kernel based operating system (MACH), it can be configured into several software environments, such as UNIX server systems and single system image systems. The key components of the system are two 1 M gate arrays that implement memory control, inter-processor communication control, and network communication control. The programming environment provides de facto standard libraries and high-level programming languages, such as MPI (Message Passing Interface), PVM (Parallel Virtual Machine), and HPF (High Performance Fortran). The operating system and the inter-processor communication libraries fully exploit the functionality of the hardware to achieve an inter-processor communication latency of 4.5 µs and a throughput of 169 MB/s at the user program level.
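The two figures quoted above can be combined in the common linear cost model for message passing, in which sending n bytes takes the startup latency plus n divided by the peak throughput. The abstract does not state this model, so the sketch below is an assumption, as is the reading of 1 MB as 10^6 bytes; the function names are hypothetical.

```python
LATENCY_S = 4.5e-6        # 4.5 microseconds, from the abstract
BANDWIDTH_BPS = 169e6     # 169 MB/s, assuming 1 MB = 10**6 bytes

def transfer_time(n_bytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Estimated time in seconds to send one message of n_bytes."""
    return latency + n_bytes / bandwidth

def effective_bandwidth(n_bytes, latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Achieved bytes per second for one message, including startup latency."""
    return n_bytes / transfer_time(n_bytes, latency, bandwidth)

def half_performance_length(latency=LATENCY_S, bandwidth=BANDWIDTH_BPS):
    """Message size at which the effective bandwidth is half of the peak."""
    return latency * bandwidth

n_half = half_performance_length()  # roughly 760 bytes under this model
```

Under these assumptions, messages much shorter than about 760 bytes are dominated by latency, while much longer ones approach the peak throughput.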
Instruction scheduling methods based on the construction of state diagrams (or automata) have been used for architectures involving deeply pipelined function units. However, the size of the state diagram is prohibitively large, resulting in high execution time and space requirements. We present a simple method for reducing the size of the state diagram by recognizing unique paths of a state diagram. Our experiments show that the number of paths in the reduced state diagram is significantly lower, by 1 to 3 orders of magnitude, than the number of paths in the original state diagram. Using the reduced MS-state diagrams, we develop an efficient software pipelining method. The proposed software pipelining algorithm produced efficient schedules and performed better than R.A. Huff's (1993) Slack Scheduling method and the original Co-scheduling method, in terms of both the initiation interval (II) and the time taken to construct the schedule.
ISBN: 0780351487 (print)
The residue-to-binary conversion is the crucial step for residue arithmetic. The traditional methods are the Chinese remainder theorem (CRT) and the mixed radix conversion. This paper presents new Chinese remainder theorems I, II, and III for the residue-to-binary conversion, with the following detailed results. (1) The big weights in the original CRT are reduced to a matrix of numbers less than the moduli $P_i$. (2) The new Chinese remainder theorem I is a parallel algorithm in mixed radix format. The delay is reduced from $O(n)$ to $O(\log n)$. (3) The new Chinese remainder theorem II reduces the modulo operation from the size $M$ to a size less than $\sqrt{M}$. (4) The new Chinese remainder theorem II can be easily extended to the new Chinese remainder theorem III for non-prime moduli sets. (5) A summary of a long list of references on residue-to-binary conversion is also presented.
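For reference, the two traditional methods named above can be sketched in a few lines of Python. This illustrates the classic CRT and the mixed radix conversion only, not the paper's new theorems; the function names and the example moduli set (3, 5, 7) are assumptions chosen for the sketch, and the moduli must be pairwise coprime.

```python
from math import prod

def crt_to_binary(residues, moduli):
    """Classic CRT: X = sum(r_i * M_i * inv(M_i) mod m_i) mod M, with M_i = M / m_i."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(x, -1, m) is the modular inverse
    return total % M

def mixed_radix_digits(residues, moduli):
    """Mixed radix conversion: digits a_i with X = a_0 + a_1*m_0 + a_2*m_0*m_1 + ..."""
    rs = list(residues)
    digits = []
    for i, m in enumerate(moduli):
        a = rs[i] % m
        digits.append(a)
        # Subtract the digit and divide by m in every remaining residue channel.
        for j in range(i + 1, len(moduli)):
            rs[j] = ((rs[j] - a) * pow(m, -1, moduli[j])) % moduli[j]
    return digits

def from_mixed_radix(digits, moduli):
    """Evaluate the mixed radix digits back into an ordinary integer."""
    value, weight = 0, 1
    for a, m in zip(digits, moduli):
        value += a * weight
        weight *= m
    return value

moduli = (3, 5, 7)
x = 52
residues = tuple(x % m for m in moduli)          # (1, 2, 3)
via_crt = crt_to_binary(residues, moduli)
via_mrc = from_mixed_radix(mixed_radix_digits(residues, moduli), moduli)
```

The CRT sum works with weights as large as M and needs a final reduction modulo M, whereas the mixed radix conversion uses only small modular operations but computes its digits sequentially; these are the costs that results (1) to (3) above address.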
Parallelism has been perceived as the way for a computer vision system to achieve the required speedup in practice with existing algorithms and computing resources. It is known that methods of task division in such sy...
ISBN: 0819425885 (print)
We describe a general purpose environment for the development of parallel image processing and computer vision algorithms: PRIME (Parallel Image Media processing Environment). "General purpose" here means that the environment is designed to be used on a variety of multi-processor systems, ranging from tightly coupled to loosely coupled computers. The key point of the system is that it provides an architecture-independent programming environment for image processing and computer vision. We outline PRIME, its implementation, and its preliminary performance evaluation.
ISBN: 0819425885 (print)
This paper describes two different parallel computing approaches for image processing problems on a Pentium-based multiprocessor system. These multiprocessor computers are often used as network servers. We demonstrate the use of one of these machines, equipped with four Intel Pentium processors, for a parallel image processing task. A parallel computation of motion vector fields based on correlation techniques is discussed to show the possible acceleration. The computational results show that high efficiency can be reached, and that even linear speedup is possible under certain conditions. Besides the correlation technique mentioned, there are various image processing problems that can easily be evaluated in parallel. Although massively parallel systems and special purpose systems are much faster, off-line image processing can be accelerated by using these broadly available low-cost machines.
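The terms speedup and efficiency used above have standard definitions, which the short sketch below spells out together with one simple way of dividing an image among four processors. The row-strip decomposition and the timing values are assumptions made for illustration; they are not taken from the paper.

```python
def speedup(t_serial, t_parallel):
    """Speedup S = T_1 / T_p."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, n_procs):
    """Efficiency E = S / p; E = 1.0 corresponds to linear speedup."""
    return speedup(t_serial, t_parallel) / n_procs

def split_rows(n_rows, n_workers):
    """Partition image rows into contiguous (start, stop) strips, one per worker."""
    base, extra = divmod(n_rows, n_workers)
    strips, start = [], 0
    for w in range(n_workers):
        stop = start + base + (1 if w < extra else 0)
        strips.append((start, stop))
        start = stop
    return strips

# Hypothetical timings in seconds for a four-processor run.
s = speedup(100.0, 26.0)
e = efficiency(100.0, 26.0, 4)
strips = split_rows(10, 4)
```

Strips of nearly equal height keep the processors evenly loaded when the per-pixel cost is uniform, which is one of the conditions under which close to linear speedup can be expected.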