Device model evaluation, an essential part of a circuit simulator, is a compute-intensive task. A multiprocessor-based circuit simulator that ignores the parallelization of model equation formulation (LOAD) and parallelizes only the solution (SOLVE) of the equations will suffer seriously degraded simulation performance. This paper describes methods of parallelizing the LOAD part of a circuit simulator on PACE (Parallel Architecture for Circuit Evaluation), a distributed memory multiprocessor designed at AT&T Bell Laboratories. This is integrated with the parallel SOLVE algorithms given in our earlier work. Load balancing and minimization of interprocessor communication are the primary objectives of the parallel LOAD heuristics studied. Performance results on benchmark circuits, obtained with the prototype PACE system, show the feasibility of our approach.
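The abstract does not spell out the LOAD heuristics, so the following is only an illustrative sketch of the load-balancing objective it names. It uses the generic greedy longest-processing-time (LPT) rule, which is a swapped-in textbook technique and not the authors' method. The function name `balance_load`, the device names, and the per-device cost estimates are all hypothetical, and the sketch ignores the second objective (interprocessor communication) entirely.

```python
import heapq


def balance_load(device_costs, num_processors):
    """Assign device evaluations to processors with the greedy LPT rule.

    device_costs: dict mapping a (hypothetical) device name to an
    estimated model-evaluation cost. Returns (assignment, loads).
    """
    # Min-heap of (accumulated load, processor index).
    heap = [(0.0, p) for p in range(num_processors)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_processors)}

    # Place the most expensive devices first, each on the
    # currently least-loaded processor.
    by_cost = sorted(device_costs.items(), key=lambda kv: kv[1], reverse=True)
    for name, cost in by_cost:
        load, p = heapq.heappop(heap)
        assignment[p].append(name)
        heapq.heappush(heap, (load + cost, p))

    loads = {p: sum(device_costs[n] for n in names)
             for p, names in assignment.items()}
    return assignment, loads


if __name__ == "__main__":
    # Hypothetical costs: transistors are costlier than passives.
    costs = {"M1": 5, "M2": 4, "M3": 3, "R1": 1, "R2": 1, "C1": 2}
    assignment, loads = balance_load(costs, 2)
    print(assignment, loads)
```

A heuristic that also accounted for communication would need to know which devices share circuit nodes, which this sketch does not model.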
In this paper we study parallel architectures in which the only means of communication are buses. These promising architectures can use the power of bus technologies, providing a viable way to interconnect ...
ISBN: 085296613X (print)
The pedestrian monitoring functions investigated in this project illustrate the feasibility of automatic pedestrian data collection, even in crowded scenes. Real-time performance has been achieved for the density measuring routine using a small network of three transputers operating in parallel. A system of this nature would be valuable in many surveys requiring measurements of crowd density. The automatic counting of individual pedestrians has also been demonstrated successfully. This software has not, so far, been optimized to operate in real time. However, the results obtained show that the presence and direction of movement of pedestrians are measurable. It may also be concluded that successful pedestrian tracking can be achieved using image processing technology. This type of system would be useful in road safety studies, for example, where the relative movements of road vehicles and pedestrians are of interest.
In most distributed memory MIMD multiprocessors, processors are connected by a point-to-point interconnection network. Since interprocessor communication frequently constitutes serious bottlenecks, several architectur...
The implementation of image analysis applications consisting of many tasks on parallel machines raises problems regarding the compatibility of data representations between tasks and the allocation of available resources. These problems arise from the integration of different parallel algorithms in one system and are not raised when parallel algorithms are developed for each task independently. The harmonization of the data-representation requirements of different parallel algorithms and the partitioning of resources among them are problems that must be addressed at the level of the application and not at the level of individual tasks. In this paper the problems are formulated, components of a system-based approach to these problems are presented, and solutions are suggested that facilitate the development of complex industrial image analysis applications on parallel machines. The concrete example is a text-recognition application.
Discusses an alternate strategy for data-path synthesis. In this approach a Very Long Instruction Word (VLIW) processor structure consisting of a consolidated register file interconnected with functional units is used as the underlying architecture. The functional units are described using ultra fine-grain templates which detail the functionality at the component level. During scheduling the architectural organization of the VLIW is relaxed, allowing a Percolation-based Scheduler to modify the templates so that parallelism in the application dictates architectural modification. The experiments demonstrate performance improvements on standard benchmarks, as well as improved memory port utilization for the synthesized VLIW architectures.
Épidaure, an Actor-based programming environment, is presented. The Actor programming approach is combined with the distributed shared memory (DSM) abstraction. Rather than using processes as compounding structur...
This paper presents a flexible communication module for low-level as well as high-level image processing operations. It allows a good separation of data communication and data processing and thereby reduces the amount of work needed to implement parallel image processing algorithms. It supports heterogeneous processor systems. It has been successfully used for the parallel implementation of a hierarchical image transition and for its symbolic analysis on a 9-node transputer image processing system. Experimental results in the field of traffic sign detection are discussed.
Scalable parallel computer architectures provide the computational performance demanded by advanced biological computing problems. NIH has developed a number of parallel algorithms and techniques useful in determining biological structure and function. These applications include processing electron micrographs to determine the three-dimensional structure of viruses, calculating the solvent accessible surface area of proteins to predict the three-dimensional conformation of these molecules from their primary structure, and searching for homologous DNA sequences in large genetic databases. Timing results demonstrate substantial performance improvements with parallel implementations compared with conventional sequential systems.
Matching is an important part of a model-based object recognition system. Matching is a difficult task for a number of reasons. First, in a number of recognition systems matching is formulated as a combinatorial problem with exponential worst-case complexity. Thus, heuristics are needed to reduce the complexity by pruning the search space. Second, images do not present perfect data: noise and occlusion greatly complicate the task. Finally, even at moderate image resolutions the amount of data to be handled is such that this task cannot be done in real time on supercomputers. Although no existing visual system can solve the general recognition problem, some existing approaches have obtained acceptable results for limited domains or simple scenes. Much less work has been done on parallel matching, despite the great need for speeding up the process. Parallel algorithms often have to be designed from scratch, and the recognition problem itself often requires reformulation, since many of the proposed sequential algorithms do not lend themselves naturally to efficient parallel implementations. In this paper, we survey some of the existing parallel matching algorithms for 2D and 3D objects. Some of these algorithms have been implemented on SIMD architectures such as the Connection Machine or MasPar, or MIMD machines such as the Intel Touchstone Delta; other algorithms have been developed for the PRAM model of computation.
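To make the combinatorial formulation and the role of pruning concrete, here is a minimal sequential sketch, assuming 2D point features and a pairwise-distance consistency test. It is a generic backtracking search and is not taken from any of the surveyed algorithms. The function name `match_features`, the tolerance `tol`, and the sample points are hypothetical.

```python
import math


def match_features(model_pts, image_pts, tol=0.1):
    """Find one assignment of model points to distinct image points
    whose pairwise distances agree within tol, or return None.

    The result is a list r where model point i maps to image point r[i].
    """
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    def extend(partial):
        i = len(partial)          # next model point to place
        if i == len(model_pts):
            return partial        # every model point is matched
        for j in range(len(image_pts)):
            if j in partial:
                continue          # image point already used
            # Prune: the candidate must be distance-consistent with
            # every pairing made so far, otherwise skip the subtree.
            consistent = all(
                abs(dist(model_pts[i], model_pts[k])
                    - dist(image_pts[j], image_pts[partial[k]])) <= tol
                for k in range(i)
            )
            if consistent:
                result = extend(partial + [j])
                if result is not None:
                    return result
        return None

    return extend([])


if __name__ == "__main__":
    model = [(0, 0), (1, 0), (0, 2)]
    # Model translated by (10, 10), reordered, plus one clutter point.
    image = [(50, 50), (10, 12), (10, 10), (11, 10)]
    print(match_features(model, image))
```

Without the consistency test the search would enumerate every ordered selection of image points; the test discards most branches early, but the worst case remains exponential, as the abstract notes. Because only distances are compared, this sketch also accepts mirrored matches.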