ISBN: 9784901122207 (print)
3D pose estimation from a monocular camera can be applied in fields such as human-computer interaction and human action recognition. As a two-stage 3D pose estimator, VideoPose3D achieves state-of-the-art accuracy. However, two-stage processing loses part of the image information when 2D poses are mapped to 3D space, which limits the final accuracy. This paper proposes an image-assisting pose estimation model and a back-projection-based offset generating module. The image-assisting pose estimation model consists of a 2D pose processing branch and an image processing branch. Image information is processed to generate an offset that refines the intermediate 3D pose produced by the 2D pose processing network. The back-projection-based offset generating module projects the intermediate 3D poses to 2D space and calculates the error between the projection and the input 2D pose. From this error, combined with the extracted image features, the neural network generates an offset that reduces the error. In evaluation on the Human3.6M dataset, accuracy improves by an average of 0.9 mm across actions over the VideoPose3D baseline.
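The back-projection step described above can be illustrated with a minimal sketch. It assumes a simple pinhole camera with known focal lengths and principal point, and joints given in camera coordinates; the function names `project_to_2d` and `reprojection_error` are hypothetical and are not taken from the paper. The learned network that turns this error into a 3D offset is not shown.

```python
from typing import List, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]


def project_to_2d(joints_3d: List[Point3D],
                  focal: Point2D,
                  center: Point2D) -> List[Point2D]:
    """Pinhole projection of camera-space joints (x, y, z) to pixel coordinates.

    Assumes every joint lies in front of the camera (z > 0).
    """
    fx, fy = focal
    cx, cy = center
    return [(fx * x / z + cx, fy * y / z + cy) for x, y, z in joints_3d]


def reprojection_error(joints_3d: List[Point3D],
                       joints_2d: List[Point2D],
                       focal: Point2D,
                       center: Point2D) -> List[Point2D]:
    """Per-joint 2D error: input 2D pose minus the projected intermediate 3D pose."""
    projected = project_to_2d(joints_3d, focal, center)
    return [(u - pu, v - pv)
            for (u, v), (pu, pv) in zip(joints_2d, projected)]


if __name__ == "__main__":
    # Illustrative values only, not data from the paper.
    pose_3d = [(0.0, 0.0, 2.0), (0.5, -0.5, 2.0)]
    pose_2d = [(500.0, 500.0), (752.0, 249.0)]
    print(reprojection_error(pose_3d, pose_2d, (1000.0, 1000.0), (500.0, 500.0)))
```

In a model of the kind the abstract describes, an error vector like this would be one input, together with image features, to the network that predicts the refining offset.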
ISBN: 9780819486813 (print)
In recent years, Time-of-Flight (TOF) sensors have had a significant impact on research fields in machine vision. Compared with stereo vision systems and laser range scanners, they combine the advantages of active sensors, which provide accurate distance measurements, and camera-based systems, which record a 2D matrix at a high frame rate. Moreover, low-cost 3D imaging has the potential to open a wide field of additional applications and solutions in markets such as consumer electronics, multimedia, digital photography, robotics and medical technologies. This paper focuses on the 4-phase-shift algorithm currently implemented in this type of sensor. The most time-critical operation of the phase-shift algorithm is the arctangent function. In this paper, a novel hardware implementation of the arctangent function using a reconfigurable processor system is presented and benchmarked against the state-of-the-art CORDIC arctangent algorithm. Experimental results show that the proposed algorithm is well suited for real-time processing of the range images of TOF cameras.
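To make the role of the arctangent concrete, here is a minimal software sketch of a 4-phase-shift distance calculation that uses a CORDIC arctangent in vectoring mode, i.e. the baseline the paper benchmarks against, not the paper's proposed hardware method. It assumes four correlation samples taken at 0°, 90°, 180° and 270° with the convention a_k = B + A·cos(φ + k·90°); sample ordering and sign conventions differ between sensors, so treat that as an assumption. The names `cordic_atan2` and `tof_distance` are hypothetical.

```python
import math

SPEED_OF_LIGHT = 299_792_458.0  # m/s
ITERATIONS = 24

# In hardware this would be a small lookup table; here it is precomputed once.
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(ITERATIONS)]


def cordic_atan2(y: float, x: float) -> float:
    """Approximate atan2(y, x) with CORDIC in vectoring mode.

    Each iteration rotates the vector toward the x-axis by atan(2^-i),
    which needs only shifts and adds in fixed-point hardware.
    """
    if x == 0.0 and y == 0.0:
        return 0.0
    angle = 0.0
    # Pre-rotate by 180 degrees into the right half-plane, where CORDIC converges.
    if x < 0.0:
        angle = math.pi if y >= 0.0 else -math.pi
        x, y = -x, -y
    for i in range(ITERATIONS):
        shift = 2.0 ** -i
        if y > 0.0:
            x, y = x + y * shift, y - x * shift  # rotate clockwise
            angle += ATAN_TABLE[i]
        else:
            x, y = x - y * shift, y + x * shift  # rotate counter-clockwise
            angle -= ATAN_TABLE[i]
    return angle


def tof_distance(a0: float, a1: float, a2: float, a3: float,
                 f_mod: float) -> float:
    """Distance in metres from four phase samples and modulation frequency f_mod (Hz)."""
    phase = cordic_atan2(a3 - a1, a0 - a2)
    if phase < 0.0:
        phase += 2.0 * math.pi  # map to [0, 2*pi)
    return SPEED_OF_LIGHT * phase / (4.0 * math.pi * f_mod)
```

The differences `a3 - a1` and `a0 - a2` cancel the constant offset B, so only the arctangent remains as the costly per-pixel operation, which is why its implementation dominates the runtime.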
The normalized cross-correlation coefficient is a prevalent pattern-matching algorithm in machine vision for industrial inspection. Despite its common use, it has problems in practical applications. For instance, false alarms occur because it is highly sensitive to changes in the environment or the inspection equipment, and it also requires complex calculations. This paper proposes the partial information correlation coefficient (PICC) method to improve on the traditional normalized cross-correlation coefficient (TNCCC). The PICC uses significant points to calculate the correlation coefficient. An experiment is also conducted to demonstrate the application on many image samples from the IC industry, such as PCBs, BGAs, and ICs. The results show that the PICC can effectively reduce false alarms in defect detection.
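As a rough illustration, the sketch below computes the normalized cross-correlation coefficient over flattened pixel lists and optionally restricts it to a subset of pixel indices. The abstract does not say how PICC selects its significant points, so the `points` argument is a hypothetical stand-in supplied by the caller, and this is not the authors' method.

```python
import math
from typing import Optional, Sequence


def ncc(template: Sequence[float],
        window: Sequence[float],
        points: Optional[Sequence[int]] = None) -> float:
    """Normalized cross-correlation coefficient of two equally sized pixel lists.

    If `points` is given, only those pixel indices enter the computation
    (an assumed stand-in for a "significant points" selection).
    Returns 0.0 when either input has zero variance.
    """
    if len(template) != len(window):
        raise ValueError("template and window must have the same length")
    indices = range(len(template)) if points is None else points
    t = [template[i] for i in indices]
    w = [window[i] for i in indices]
    if not t:
        raise ValueError("at least one pixel is required")
    mean_t = sum(t) / len(t)
    mean_w = sum(w) / len(w)
    numerator = sum((a - mean_t) * (b - mean_w) for a, b in zip(t, w))
    denominator = math.sqrt(sum((a - mean_t) ** 2 for a in t)
                            * sum((b - mean_w) ** 2 for b in w))
    return numerator / denominator if denominator > 0.0 else 0.0
```

Using fewer points reduces the number of multiplications per candidate position and keeps pixels outside the selected set from affecting the score.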
ISBN: 0819426377 (print)
Machine vision is an enabling technology for many applications, but "alignment" is arguably the most useful application class. Alignment is the task of "finding the position of a landmark or work piece in the electronic image" so that it can be tracked, moved, followed, or otherwise adjusted. Many early alignment applications were in aerospace and defense. The visual "landmark" they used was a star, a constellation or a laser-designated target. These applications made possible highly stable satellite platforms, accurate antenna aiming, and accurate military ordnance that are simply not possible with any other technology. These "aiming" applications were extensions of traditional gunsighting techniques and nautical navigation. In factory automation, vision-based alignment continues to play a key role in the semiconductor and electronics manufacturing revolution. Robotic machinery requires precision guidance to mate work pieces (dice and printed wiring boards) with process machinery (bonders, saws, and robots). Machine vision technology arrived just in time to make this possible, and new developments continue to improve precision and productivity in this area. New alignment applications are emerging in unexpected areas, such as the automotive service garage. This paper describes a new automotive service application for vehicle wheel alignment. Two machine vision cameras measure the position and attitude of four wheel-mounted targets as the vehicle rolls and is steered. Six axes of rotation are used to define the locations and orientations of the axles in three-dimensional space. Their values are visibly inferred and measured, and their geometric relationships computed. The measurements are compared against the vehicle's ideal design tolerances for adjustment and repair purposes.
ISBN: 0819439835 (print)
Traditional image processing techniques require considerable computational effort because the data for each pixel are computed sequentially and the information path passes through an A/D converter. The delay accumulated in this process is unacceptable in real-time image processing because of the high information flow handled in typical vision tasks (e.g. automatic industrial inspection, vision problems in robotics, pattern analysis, etc.). A massively parallel architecture working with analog signals avoids these problems. This is the basic idea of Cellular Neural Networks (CNNs): an array of analogic dynamic processors whose cells interact directly within a finite local neighborhood. The local connectivity of CNNs allows their realization as VLSI chips that can operate at very high speed and complexity. CNN architectures implemented as VLSI chips now show extremely high speed compared with traditional digital image processing tools. The proliferation of increasingly sophisticated CNN architectures, and the growing effort to build practical systems based on CNN chips, make it important to develop analog algorithms that perform complex image processing tasks in many different fields, e.g. industrial applications, robotic systems and pattern recognition. The objective of this work is to generate a learning machine capable of finding solutions to complex image processing tasks with CNNs. First, a general machine for automatic analog algorithm design, independent of the problem to be solved, is created; this is accomplished through an evolutionary strategy that is an extension of genetic programming. Second, this work introduces a suite of sub-mechanisms that increase the power of genetic programming and help reduce the enormous search space so that the search is fruitful. Some concepts in this section are related to AI theory, in such a way that this work lies at the intersection of AI and Imag
ISBN: 9781510674219; 9781510674202 (print)
Deep learning (DL) based approaches, such as Convolutional Neural Nets (CNNs), have shown high performance in classifying the contents of images. CNNs, however, have the notable drawbacks of potentially high computing costs, poor explainability, and wide performance variance if the underlying imagery data deviates from the training baseline. As advanced image processing capabilities mature, the on-board detection of objects in space-based imagery is increasingly proposed. On-board satellite processing applications, which may be resource-limited, can drive the need for simpler models that reduce the computing burden for edge computing applications. This raises the question of how well classic computer vision techniques can compete with more modern approaches. This paper characterizes and compares the performance of multiple computer vision models for distinguishing maritime vessels from typical clutter in commercial electro-optical (EO) satellite imagery. A Support Vector Machine (SVM) model using manually curated features is compared to multiple DL-based models spanning a range of model sizes, with the goal of determining whether classical approaches can compete favorably with DL when computational resources are taken into consideration. Differences in performance and processing resources between the approaches are characterized. Findings include that, for classifying images of clouds in satellite EO imagery, the SVM-based model may approach the accuracy of some of the smaller CNN-based models. However, even the smallest DL-based models, which take about the same computational resources as the SVM-based model, generally outperform the SVM-based model. This finding may have implications for the operational use of on-board processing techniques for satellite payloads.
ISBN: 0819424935 (print)
This paper concentrates on how to construct wavelets according to the practical needs of computer vision and image processing. First, a theory for the construction of dyadic wavelets is established. The resulting dyadic wavelets are zero-symmetric or zero-antisymmetric and can be made to decay rapidly, so they are suitable for edge detection. Then an algebraic approach for the construction of orthogonal wavelets is proposed. It facilitates the selection of the best wavelet for a given image processing task, such as image compression.
ISBN: 0819445630 (print)
The Image Intensifier Tube (IIT) is the most critical component within a night vision device. Acquisition, production, test and evaluation of image intensifier tubes can be greatly enhanced by the application of machine vision technology. The Navy, Air Force and Army have invested over $2,000,000 in the development of a machine vision-based test set known as the Automated Intensifier Measurement System (AIMS). This paper will describe methodologies employed in the AIMS to measure Modulation Transfer Function (MTF), Dark Spots, Bright Spots, Shear Distortion, and Gross Distortion.
ISBN: 0819407518 (print)
The conference materials contain 29 papers dealing with machine vision imaging. The main topics covered are illumination and sensing methods and systems, image and instrument models, optical processing, data manipulation, and three-dimensional imaging techniques and systems.
Sparse representation based on dictionary learning has been widely used in many applications over the past decade. In this article, a new method is proposed for removing noise from video images using sparse representa...