ISBN: 0819415413 (print)
3D silicon technology has been under development since 1980, primarily aimed at on-focal-plane signal processing to solve a variety of military sensor system problems. The thrust has been to bring more and more parallel analog and digital processing into the closest possible proximity to the detector array. At this time, on-focal-plane functionality includes preamplification, spatial and temporal matched filtering, nonuniformity correction, neural networks, analog-to-digital conversion, digital logic, and digital memory. Historically a custom-built specialty technology whose applicability was constrained by cost, 3D silicon has undergone a dual-use conversion to include high-volume, low-cost commercial computer electronics. 3D silicon is on the way to becoming the lowest-cost-per-gate technology available and, because of this, sensor system design and performance will be revolutionized.
This paper presents a flexible communication module for low-level as well as high-level image processing operations. It cleanly separates data communication from data processing and thereby reduces the work needed to implement parallel image processing algorithms. It supports heterogeneous processor systems. It has been successfully used for the parallel implementation of a hierarchical image transition and for its symbolic analysis on a 9-node transputer image processing system. Experimental results in the field of traffic sign detection are discussed.
This paper describes some techniques currently under research for exploring hardware-software tradeoffs during system development. We show that moving software operations to hardware can be further improved if the source code is modified to increase the overall parallelism of the system. We then show the limits of this approach and describe a new RISC architecture, under research, intended to overcome these limitations.
ISBN: 9783662466629 (print)
The proceedings contain 29 papers. The special focus in this conference is on Compiler Construction. The topics include: action transformations in the ACTRESS compiler generator; an overview of door attribute grammars; coupling evaluators for attribute coupled grammars; towards the global optimization of functional logic programs; efficient organization of control structures in distributed implementations; implementing 2DT on a multiprocessor; global code selection for directed acyclic graphs; compiling nested loops for limited connectivity VLIWs; delayed exceptions - speculative execution of trapping instructions; a suite of analysis tools based on a general purpose abstract interpreter; flow grammars - a flow analysis methodology; provable correctness of prototype interpreters in LDL; developing efficient interpreters based on formal language specifications; generating an efficient compiler for a data parallel language from a denotational specification; towards provably correct code generation for a hard real-time programming language; supporting array dependence testing for an optimizing/parallelizing C compiler; processing array statements and procedure interfaces in the PREPARE High Performance Fortran compiler; a practical approach to the symbolic debugging of parallelized code; reducing the cost of data flow analysis by congruence partitioning; interprocedural constant propagation using dependence graphs and a data-flow model; solving demand versions of interprocedural analysis problems; compile time instruction cache optimizations; and a framework for scheduling across basic blocks.
ISBN: 0819415472 (print)
The Naval Air Warfare Center, China Lake, has developed a real-time hardware and software system designed to implement and evaluate biologically inspired retinal and cortical models. The hardware is based on the Adaptive Solutions Inc. massively parallel CNAPS system COHO boards. Each COHO board is a standard-size 6U VME card featuring 256 fixed-point RISC processors running at 20 MHz in a SIMD configuration. Each COHO board has a companion board built to support a real-time VSB interface to an imaging seeker, to an NTSC camera, and to other COHO boards. The system is designed to have multiple SIMD machines, each performing different corticomorphic functions. System-level software has been developed that allows a high-level description of corticomorphic structures to be translated into the native microcode of the CNAPS chips. Corticomorphic structures are those neural structures with a form similar to that of the retina, the lateral geniculate nucleus, or the visual cortex. This real-time hardware system is designed to be shrunk into a volume compatible with air-launched tactical missiles. Initial versions of the software and hardware have been completed and are in the early stages of integration with a missile seeker.
Matching is an important part of a model-based object recognition system. Matching is a difficult task for a number of reasons. First, in a number of recognition systems matching is formulated as a combinatorial problem with exponential worst-case complexity. Thus, heuristics are needed to reduce the complexity by pruning the search space. Second, images do not present perfect data: noise and occlusion greatly complicate the task. Finally, even at moderate image resolutions the amount of data to be handled is such that this task cannot be done in real time on supercomputers. Although no existing visual system can solve the general recognition problem, some existing approaches have obtained acceptable results for limited domains or simple scenes. Much less work has been done on parallel matching, despite the great need for speeding up the process. Parallel algorithms often have to be designed from scratch, and the recognition problem itself often requires reformulation, since many of the proposed sequential algorithms do not lend themselves naturally to efficient parallel implementations. In this paper, we survey some of the existing parallel matching algorithms for 2D and 3D objects. Some of these algorithms have been implemented on SIMD architectures such as the Connection Machine or MasPar, or on MIMD machines such as the Intel Touchstone Delta; other algorithms have been developed for the PRAM model of computation.
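To make the combinatorial formulation concrete, the following is a minimal sequential sketch of matching as a depth-first search over assignments of model points to scene points, pruned by a pairwise-distance consistency check. It is an illustration only, not one of the surveyed algorithms: the function name `match_points`, the point representation, and the tolerance `tol` are assumptions introduced here.

```python
from math import dist, isclose

def match_points(model, scene, tol=1e-6):
    """Return every assignment of model points to distinct scene points
    that preserves all pairwise distances (hypothetical helper).

    An assignment is a tuple t where model[k] is matched to scene[t[k]].
    """
    results = []

    def extend(assign):
        k = len(assign)
        if k == len(model):
            results.append(tuple(assign))
            return
        for s in range(len(scene)):
            if s in assign:
                continue
            # Prune: the new pair must agree in distance with every
            # pair already assigned, otherwise this branch is abandoned.
            consistent = all(
                isclose(dist(model[k], model[j]),
                        dist(scene[s], scene[assign[j]]),
                        abs_tol=tol)
                for j in range(k)
            )
            if consistent:
                assign.append(s)
                extend(assign)
                assign.pop()

    extend([])
    return results

# A 3-4-5 triangle, and the same triangle translated into a cluttered scene.
model = [(0, 0), (3, 0), (0, 4)]
scene = [(10, 14), (1, 1), (10, 10), (13, 10)]
matches = match_points(model, scene)
```

Without the consistency check the search visits every ordered selection of scene points, which is where the exponential worst case comes from. Distance consistency alone also accepts mirror images, so a practical system would add further constraints.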
A novel reconfigurable architecture based on a multi-ring multiprocessor network is described. The reconfigurable architecture is shown to combine low network diameter with a low degree of connectivity for each node in the network. The mathematical properties of the network topology and the hardware for the reconfiguration switch are described. Primitive parallel operations on the network topology are described and analyzed. A large class of algorithms for the Boolean n-cube and the 2-D mesh is shown to map efficiently onto the proposed architecture without loss of performance. The architecture is shown to be well suited to a number of problems in computer vision.
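The diameter and node-degree tradeoff mentioned above can be measured for the two reference topologies the abstract names. The sketch below builds a Boolean n-cube and a 2-D mesh without wraparound and computes both quantities by breadth-first search. The multi-ring topology itself is not specified in the abstract, so it is not modeled; all function names here are illustrative assumptions.

```python
from collections import deque
from itertools import product

def hypercube(n):
    """Adjacency of the Boolean n-cube: neighbors differ in exactly one bit."""
    return {v: [v ^ (1 << b) for b in range(n)] for v in range(1 << n)}

def mesh_2d(rows, cols):
    """Adjacency of a rows x cols mesh with no wraparound links."""
    adj = {}
    for r, c in product(range(rows), range(cols)):
        candidates = [(r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)]
        adj[(r, c)] = [(x, y) for x, y in candidates
                       if 0 <= x < rows and 0 <= y < cols]
    return adj

def eccentricity(adj, src):
    """Longest shortest-path distance from src, found by BFS."""
    distance = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in distance:
                distance[v] = distance[u] + 1
                queue.append(v)
    return max(distance.values())

def diameter(adj):
    """Network diameter: the largest eccentricity over all nodes."""
    return max(eccentricity(adj, v) for v in adj)

def max_degree(adj):
    """Largest number of links at any single node."""
    return max(len(neighbors) for neighbors in adj.values())

# Two 16-node networks for comparison.
cube = hypercube(4)
mesh = mesh_2d(4, 4)
summary = {
    "cube": (diameter(cube), max_degree(cube)),
    "mesh": (diameter(mesh), max_degree(mesh)),
}
```

For the n-cube both diameter and degree equal n, so degree grows with machine size, while the mesh keeps degree at most 4 at the cost of a diameter that grows with the side length. The same two functions could be applied to any other adjacency map.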
Scalable parallel computer architectures provide the computational performance demanded by advanced biological computing problems. NIH has developed a number of parallel algorithms and techniques useful in determining biological structure and function. These applications include processing electron micrographs to determine the three-dimensional structure of viruses, calculating the solvent-accessible surface area of proteins to predict the three-dimensional conformation of these molecules from their primary structure, and searching for homologous DNA sequences in large genetic databases. Timing results demonstrate substantial performance improvements with parallel implementations compared with conventional sequential systems.
Segmentation and other image processing operations rely on convolution calculations with heavy computational and memory access demands. The article presents an analysis of a texture segmentation application containing a 96×96 convolution. Sequential execution required several hours on single-processor systems, with over 99% of the time spent performing the large convolution; 70% to 75% of execution time is attributable to cache misses within the convolution. We implemented the same application on CM-5, iPSC/860, and PVM distributed-memory multicomputers, tailoring the parallel algorithms to each machine's architecture. Parallelization significantly reduced execution time, to 49 seconds on a 512-node CM-5 and 6.5 minutes on a 32-node iPSC/860. The results indicate that, for large-kernel convolutions, the size and bandwidth of the fast memory store are more important than processor power or communication overhead.
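One common way to distribute a large-kernel convolution is to split the output rows into bands, giving each worker its band plus the extra input rows the kernel overlaps. The abstract does not say how the authors partitioned the work, so the row-band scheme below is an assumption, as are all function names. The sketch runs the bands one after another to show that the decomposition reproduces the single-pass result.

```python
def convolve_valid(image, kernel):
    """Direct 2-D convolution (kernel flipped), 'valid' region only.

    image and kernel are lists of equal-length rows of numbers.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = [[0] * out_w for _ in range(out_h)]
    for i in range(out_h):
        for j in range(out_w):
            acc = 0
            for u in range(kh):
                for v in range(kw):
                    acc += image[i + u][j + v] * kernel[kh - 1 - u][kw - 1 - v]
            out[i][j] = acc
    return out

def split_row_bands(image, kernel_height, workers):
    """Split the input into bands, one per worker (hypothetical scheme).

    Each band holds the rows for its share of output rows plus
    kernel_height - 1 overlapping rows, so bands are independent.
    """
    out_rows = len(image) - kernel_height + 1
    base, extra = divmod(out_rows, workers)
    bands, start = [], 0
    for w in range(workers):
        count = base + (1 if w < extra else 0)
        if count == 0:
            continue  # more workers than output rows
        bands.append(image[start:start + count + kernel_height - 1])
        start += count
    return bands

def convolve_in_bands(image, kernel, workers):
    """Convolve each band separately and stack the partial outputs."""
    result = []
    for band in split_row_bands(image, len(kernel), workers):
        result.extend(convolve_valid(band, kernel))
    return result

# Small integer example: 6x5 image, 3x2 kernel, 3 workers.
image = [[(3 * r + 2 * c) % 7 for c in range(5)] for r in range(6)]
kernel = [[1, 2], [0, -1], [3, 1]]
banded = convolve_in_bands(image, kernel, workers=3)
direct = convolve_valid(image, kernel)
```

With a 96-row kernel each band carries 95 overlapping input rows, so the data each worker must hold grows with kernel size as well as band size, which is consistent with the abstract's emphasis on fast-memory size and bandwidth.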
In this paper, we identify the computational requirements for structural pattern analysis, particularly for the operations of spatial grouping and matching. We describe two such algorithms that are in wide use here at USC and discuss approaches to reducing their execution times via parallel implementation. We provide brief descriptions and results of two research projects geared generally toward the parallel implementation of computer vision systems and specifically toward these algorithms.