This paper deals with the problem of computing projections of digital images. The novelty of our contribution is that we present algorithms which are suitable for implementation in general-purpose image processing and image analysis pipeline architectures. We also propose some new pipeline configurations which achieve a remarkable degree of parallelism in the computation of projection data and, in fact, of many other geometrical descriptors of digital images. In particular, random access memories and other dedicated hardware devices are not needed in our algorithms. The effectiveness of our approach and the feasibility of the proposed architectures are demonstrated by running our algorithms on commercially available short pipelines for image processing and analysis. Examples are shown of the use of projection data for machine vision applications.
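As a point of reference for what "projection data" means here, the following is a minimal sketch of horizontal and vertical projections (row sums and column sums) of a small gray-level image. It is a plain sequential illustration, not the pipeline algorithm of the paper; the function name `projections` and the sample image are assumptions made for this example.

```python
from typing import List, Tuple


def projections(image: List[List[int]]) -> Tuple[List[int], List[int]]:
    """Return (row_sums, column_sums) of a rectangular gray-level image.

    Hypothetical helper for illustration only; the paper computes such
    data on pipeline hardware without random access memory.
    """
    if not image:
        return [], []
    width = len(image[0])
    if any(len(row) != width for row in image):
        raise ValueError("image rows must all have the same length")
    row_sums = [sum(row) for row in image]          # projection onto the vertical axis
    col_sums = [sum(col) for col in zip(*image)]    # projection onto the horizontal axis
    return row_sums, col_sums


# Small binary test image (assumed data): 1 marks an object pixel.
sample_image = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
]
row_projection, column_projection = projections(sample_image)
```

For a binary image, the two projections count object pixels per row and per column, which is the kind of geometrical descriptor often used to locate an object in machine vision.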
This demonstration paper presents a multicore Real-Time Operating System (RTOS) that schedules a parameterized dataflow Model of Computation (MoC) onto a multicore Digital Signal Processor (DSP) at runtime. This RTOS, called Synchronous Parameterized and Interfaced Dataflow Embedded Runtime (SPIDER), exploits the Parameterized and Interfaced Synchronous Dataflow (PiSDF) MoC and its features at runtime to identify locally static regions and to optimize their execution on multicore platforms. The RTOS is used to dispatch the tasks of a stereo matching algorithm with a varying range of disparities. The platform used for this demonstration is a Texas Instruments Keystone II Multiprocessor System-on-Chip (MPSoC) device composed of 8 DSP cores, 4 ARM cores, a shared memory subsystem, the Multicore Navigator, and multiple dedicated accelerators.
In this paper, we deal with the problem of detecting and segmenting objects in textured darkfield digital imagery for automated visual inspection applications. The technique we follow is based on a sequential application of local operators, which serves the purpose of clustering the object and the background gray levels. This procedure can be considered an extension of average-thresholding techniques. The algorithm has fast implementations in general-purpose image processing pipeline architectures and is therefore appealing for real-time computer vision applications. Computational examples showing the effectiveness of the segmentation technique are discussed.
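For orientation, the baseline that the abstract says it extends, average thresholding, can be sketched as follows: the threshold is the mean gray level of the image, and pixels brighter than it are labeled as object (as expected in darkfield imagery, where objects appear bright on a dark background). This is only the simple baseline under that assumption, not the sequential local-operator procedure of the paper; `mean_threshold` and the sample data are hypothetical.

```python
from typing import List, Tuple


def mean_threshold(image: List[List[int]]) -> Tuple[float, List[List[int]]]:
    """Segment an image by thresholding at its mean gray level.

    Returns the threshold and a binary mask (1 = object, 0 = background).
    Illustrative baseline only; assumes bright objects on a dark background.
    """
    pixels = [p for row in image for p in row]
    if not pixels:
        raise ValueError("image must contain at least one pixel")
    threshold = sum(pixels) / len(pixels)
    mask = [[1 if p > threshold else 0 for p in row] for row in image]
    return threshold, mask


# Assumed darkfield-like data: a few bright object pixels on a dark background.
darkfield = [
    [10, 12, 11],
    [10, 200, 210],
    [11, 205, 12],
]
threshold, object_mask = mean_threshold(darkfield)
```

On textured backgrounds a single global mean is often too crude, which is presumably why the paper first clusters object and background gray levels with local operators.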
Matrix inversion is a computationally intensive basic block of many digital signal processing algorithms. To decrease the cost of their implementations, programmers often prefer fixed-point arithmetic. This arithmetic requires fewer resources and runs faster than floating-point arithmetic, but all the arithmetical details must be handled by the programmer. In this article, we overcome this drawback by presenting an automated approach to synthesize fixed-point code for matrix inversion based on Cholesky decomposition. First, we rigorously define the square root and division operators, especially in terms of rounding error, and we implement them in the CGPE library. This allows us to provide accuracy certificates for the generated code. Second, we propose a workflow based on Cholesky decomposition that carefully uses these operators to produce accurate code for the basic blocks of matrix inversion. Finally, we illustrate the efficiency of our approach on some benchmarks, and show how it allows us to synthesize accurate code in a few seconds and thus to reduce the development time of fixed-point matrix inversion.
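To make the role of the square root and division operators concrete, here is a minimal floating-point sketch of Cholesky-based inversion of a symmetric positive definite matrix: factor A = L Lᵀ, invert the triangular factor, then form A⁻¹ = L⁻ᵀ L⁻¹. It is not the fixed-point code synthesized by the article, and it does not use the CGPE library; all function names are assumptions for this example.

```python
import math
from typing import List

Matrix = List[List[float]]


def cholesky(a: Matrix) -> Matrix:
    """Return lower-triangular L with A = L * L^T (A symmetric positive definite)."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                l[i][j] = math.sqrt(s)      # the only square roots are on the diagonal
            else:
                l[i][j] = s / l[j][j]       # divisions by diagonal entries
    return l


def invert_lower(l: Matrix) -> Matrix:
    """Invert a lower-triangular matrix by forward substitution, column by column."""
    n = len(l)
    inv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        inv[j][j] = 1.0 / l[j][j]
        for i in range(j + 1, n):
            s = sum(l[i][k] * inv[k][j] for k in range(j, i))
            inv[i][j] = -s / l[i][i]
    return inv


def invert_spd(a: Matrix) -> Matrix:
    """Invert a symmetric positive definite matrix via A^{-1} = L^{-T} * L^{-1}."""
    n = len(a)
    l_inv = invert_lower(cholesky(a))
    # l_inv is lower triangular, so only terms with k >= max(i, j) contribute.
    return [
        [sum(l_inv[k][i] * l_inv[k][j] for k in range(max(i, j), n)) for j in range(n)]
        for i in range(n)
    ]


spd_inverse = invert_spd([[4.0, 2.0], [2.0, 3.0]])
```

In a fixed-point setting, each `math.sqrt` and each `/` above is a point where rounding error and value ranges must be tracked, which is the bookkeeping the article proposes to automate.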
This paper describes the architecture and design of a general-purpose parallel image processor chip called the SliM-II Image Processor. The chip has a linear array of 64 processing elements (PEs), operates at 30 MHz in the worst-case simulation, and delivers 1.92 GIPS. SliM-II can greatly reduce the inter-PE communication overhead thanks to the idea of sliding, which overlaps inter-PE communication with computation. In contrast to existing array processors, each PE has a multiplier that is quite effective for convolution, template matching, etc. The instruction set can execute an ALU operation, data I/O, and inter-PE communication simultaneously in one instruction cycle. In addition, during the ALU/multiplier operation, SliM-II provides parallel load/store between the register file and on-chip memory, as in DSP chips. The bandwidth of data I/O and inter-PE communication increases due to bit-parallel paths. We developed VHDL models and performed logic synthesis using the COMPASS™ CAD tool. We used the COMPASS™ 3.3 V, 0.6 μm standard cell library (v8r4.9.1). The total number of transistors is about 1.5 million. The SliM-II chip is being fabricated at LG Semiconductor Co., Ltd. The performance estimation shows a significant improvement for algorithms requiring multiplications compared with existing array processors.
This paper describes the architecture and design of a linear array processor chip called the SliM-II Image Processor. The chip has a linear array of 64 processing elements (PEs). In contrast to existing array processors, each PE has a multiplier that is quite effective for convolution, template matching, etc. The instruction set can execute an ALU operation, a data I/O operation, and an inter-PE communication operation simultaneously in one instruction cycle. In addition, during the ALU/multiplier operation, SliM-II provides parallel data load/store between the register file and on-chip memory, as in DSP chips. The SliM-II contains about 1.5 million transistors in a 13.2 × 13.0 mm² core, and the package type is 208-pin PQ2. The performance estimation shows a significant improvement for algorithms requiring multiplications compared with existing array processors.