ISBN: 3540225390 (print)
Video retrieval compares multimedia queries to items in a video collection along multiple dimensions and combines the similarity scores into a final retrieval ranking. Although text is the most reliable feature for video retrieval, features from other modalities can provide complementary information. A reranking framework for video retrieval is presented that augments text-feature-based retrieval with other evidence. A boosted reranking algorithm called co-retrieval is then introduced, which combines a boosting-type learning algorithm with a noisy-label prediction scheme to automatically select the most useful (weak) features from multiple modalities. The proposed approach is evaluated with queries and video from the 65-hour test collection of the 2003 NIST TRECVID evaluation, and it achieves considerable improvement over several baseline retrieval algorithms.
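The reranking idea in this abstract can be illustrated with a minimal sketch. It is not the co-retrieval algorithm: it uses a fixed weighted sum (simple late fusion) instead of boosting, and the function name `rerank`, the feature name `color`, and all scores and weights are hypothetical.

```python
from typing import Dict, List


def rerank(text_scores: Dict[str, float],
           extra_scores: Dict[str, Dict[str, float]],
           weights: Dict[str, float],
           top_k: int) -> List[str]:
    """Rerank the top_k text results using weighted scores from other modalities."""
    # Step 1: shortlist the shots by text score alone.
    shortlist = sorted(text_scores, key=text_scores.get, reverse=True)[:top_k]

    # Step 2: add weighted evidence from the other features (missing score = 0).
    def fused(shot: str) -> float:
        bonus = sum(w * extra_scores[name].get(shot, 0.0)
                    for name, w in weights.items())
        return text_scores[shot] + bonus

    # Step 3: reorder only the shortlist by the fused score.
    return sorted(shortlist, key=fused, reverse=True)


# Hypothetical scores: shot "b" overtakes "a" thanks to its color evidence.
text = {"a": 0.9, "b": 0.8, "c": 0.1}
extra = {"color": {"b": 0.5, "c": 1.0}}
ranking = rerank(text, extra, {"color": 0.5}, top_k=2)
```

Only the text shortlist is reordered, so a shot with weak text evidence (here "c") cannot enter the ranking through the other modalities alone.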
Progress in the automatic detection and identification of humans in video, given a minimal number of labelled faces as training data, is described. This is an extremely challenging problem owing to the many sources of variation in a person's imaged appearance: pose variation, scale, facial expression, illumination, partial occlusion, motion blur, etc. The method developed in this work combines approaches from computer vision, for detection and pose estimation, with those from machine learning for classification. A 'generative' model of a person's head is defined consisting of a coarse 3-D model and multiple texture maps. This allows faces to be rendered with a variety of facial expressions and at poses differing from those of the training data. It is shown that the identity of a target face can then be determined by first proposing faces with similar pose, and then classifying the target face as one of the proposed faces or not. Furthermore, the texture maps of the model can be automatically updated as new poses and expressions are detected. Results of detecting three characters in a TV situation comedy are demonstrated.
ISBN: 0769523935 (print)
A 3-D visualization system of the cranium based on reconstruction from X-rays is presented. Because X-ray images are penetrating projections, objects in them have no definite surface. To solve this problem, lead granules are attached to the patient's face and the face is reconstructed through correlated vision. The 3-D cranium model is then built by subtracting the thickness of the soft tissue from the face model. The whole system consists of image pre-processing, feature point recognition, matching, texture mapping, animation, and 3-D measurement. Only the X-ray machine, adhesive tapes with lead granules and Multrasonograph are needed. The experiment demonstrates that this approach is effective and of high precision. It can help doctors examine and measure the face and cranium of patients by computer. It is hoped that this product can one day be developed further and applied in clinical practice.
Reliable image matching is important to many problems in computer vision, image processing and pattern recognition. Hausdorff distance and many of its variations have been employed successfully for image matching. In this paper we propose an improved image matching method based on a modified Hausdorff distance with a normalized gradient consistency measure. The proposed image matching algorithm integrates the geometric Hausdorff distance with photometric intensity gradient information to obtain a better image similarity measure. To show the improvement of the proposed algorithm, we compare it with some previous image matching methods on the problem of face recognition under lighting changes. Experimental results show that the proposed method produces more accurate face recognition than the previous methods.
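As a rough illustration of the geometric part of such a measure, the sketch below computes one well-known variant, the modified Hausdorff distance of Dubuisson and Jain, between two sets of edge points. Whether the paper uses exactly this variant is an assumption, and the normalized gradient consistency term it proposes is not modelled here.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def directed_mhd(a: List[Point], b: List[Point]) -> float:
    """Mean, over points in a, of the distance to the nearest point in b."""
    return sum(min(math.dist(p, q) for q in b) for p in a) / len(a)


def modified_hausdorff(a: List[Point], b: List[Point]) -> float:
    """Symmetric modified Hausdorff distance: the larger directed mean."""
    return max(directed_mhd(a, b), directed_mhd(b, a))


# Hypothetical edge-point sets extracted from two images.
model = [(0.0, 0.0), (2.0, 0.0)]
scene = [(0.0, 0.0)]
distance = modified_hausdorff(model, scene)
```

Averaging the nearest-neighbour distances, rather than taking their maximum as the classical Hausdorff distance does, makes the measure less sensitive to a few outlying points.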
ISBN: 0819460737 (print)
As an alternative to human inspection, this study presents the development of a machine vision inspection system (MVIS) designed specifically for car seat frames. The proposed MVIS was designed to meet the demands, features and specifications of car seat frame manufacturers striving for higher throughput and better quality. This computer-based MVIS performs quality checks by detecting holes, nuts and welding spots on every car seat frame in real time and ensuring that these portions are intact, precise and in the proper place. In this study, the NI Vision Builder software for Automatic Inspection was used to configure the intended quality measurements. The software offers measurement techniques such as edge detection and pattern matching, which can identify the boundaries or edges of an object and analyze the pixel values along a profile to detect significant intensity changes. Either technique can gauge sizes, detect missing portions and check the alignment of parts. The visual inspection techniques were optimized through qualitative analysis and simulation of human tolerance in inspecting car seat frames. Furthermore, this study demonstrates the incorporation of the optimized vision inspection environment into the pre-inspection and post-inspection subsystems. Human participation in the proposed MVIS for car seat frames is ideally reduced to feeding and sorting.
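The profile-based edge detection mentioned in this abstract can be sketched generically as follows. This is not the NI software's API: the function `find_edges`, the threshold and the sample profile are hypothetical, and the rule used (a jump between neighbouring pixels of at least a threshold) is only one simple way to flag a significant intensity change.

```python
from typing import List


def find_edges(profile: List[float], threshold: float) -> List[int]:
    """Return indices where intensity jumps by at least threshold from the previous pixel."""
    edges = []
    for i in range(1, len(profile)):
        if abs(profile[i] - profile[i - 1]) >= threshold:
            edges.append(i)
    return edges


# Hypothetical grey levels sampled along a line crossing a bright feature.
profile = [10, 10, 10, 200, 200, 200, 15, 15]
edges = find_edges(profile, threshold=50)

# Gauging: the feature width in pixels is the gap between the two edges.
width = edges[1] - edges[0]
```

A missing feature would show up as an empty edge list, and a misaligned one as edge positions outside an expected range.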
We have lately been witnessing a convergence among mathematical morphology and other nonlinear fields, such as curve evolution, PDE-based geometrical image processing, and scale-spaces. An obvious benefit of such a convergence is a cross-fertilization of concepts and techniques among these fields. The concept of adjunction, however, so fundamental in mathematical morphology, is not yet shared by other disciplines. The aim of this paper is to show that other areas in image processing can possibly benefit from the use of adjunctions. In particular, a strong relationship between pyramids and adjunctions is presented. We show how this relationship may help in analyzing existing pyramids and in constructing new ones. Moreover, it will be explained that adjunctions based on a curve evolution scheme can provide idempotent shape filters. This idea is illustrated in this paper by means of a simple affine-invariant polygonal flow. Finally, the use of adjunctions in scale-space theory is also addressed.
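For readers unfamiliar with the term, the standard example of an adjunction in mathematical morphology is the pair (erosion, dilation): dilate(f) <= g holds pointwise exactly when f <= erode(g), and the composition dilate(erode(f)), the opening, is idempotent. The sketch below shows this on 1-D signals with a flat structuring element. It does not reproduce the paper's pyramids or polygonal flow, and the function names, the offsets and the sample signals are assumptions chosen for illustration.

```python
from typing import List, Sequence

OFFSETS = (-1, 0, 1)  # flat structuring element (assumed for this sketch)


def dilate(f: Sequence[int], offsets=OFFSETS) -> List[int]:
    """Dilation: maximum of f over in-range positions i - b."""
    n = len(f)
    return [max(f[i - b] for b in offsets if 0 <= i - b < n) for i in range(n)]


def erode(f: Sequence[int], offsets=OFFSETS) -> List[int]:
    """Erosion: minimum of f over in-range positions i + b."""
    n = len(f)
    return [min(f[i + b] for b in offsets if 0 <= i + b < n) for i in range(n)]


def leq(f: Sequence[int], g: Sequence[int]) -> bool:
    """Pointwise ordering f <= g."""
    return all(x <= y for x, y in zip(f, g))


def opening(f: Sequence[int]) -> List[int]:
    """Erosion followed by its adjoint dilation."""
    return dilate(erode(f))


signal = [0, 5, 0, 3, 3, 3, 0]
opened = opening(signal)  # the one-sample spike is removed, the plateau kept
other = [1, 5, 5, 3, 4, 3, 1]
adjunction_holds = leq(dilate(signal), other) == leq(signal, erode(other))
```

Applying `opening` a second time leaves `opened` unchanged, which is the idempotence property the abstract refers to for filters derived from adjunctions.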
In this paper, an intellectual property protection mechanism realized on a watermarking scheme is proposed. The embedding technique we adopted in this paper is based on the modular operation. The modulus is a threshol...
We propose image enhancement, edge detection, and segmentation models for the multi-channel case, motivated by the philosophy of processing images as surfaces, and generalizing the Mumford-Shah functional. Refer to ht...
ISBN: 3540425233 (print)
The proceedings contain 42 papers. The special focus in this conference is on Probabilistic Models and Estimation. The topics include: a double-loop algorithm to minimize the Bethe free energy; a variational approach to maximum a posteriori estimation for image denoising; maximum likelihood estimation of the template of a rigid moving object; an application to shape retrieval; a fast MAP algorithm for 3D ultrasound; designing the minimal structure of hidden Markov model by bisimulation; relaxing symmetric multiple windows stereo using Markov random fields; camera calibration for 3-D surface reconstruction; a hierarchical Markov random field model for figure-ground segregation; articulated object tracking via a genetic algorithm; learning matrix space image representations; supervised texture segmentation by maximising conditional likelihood; optimization of paintbrush rendering of images by dynamic MCMC methods; illumination invariant recognition of color texture using correlation and covariance functions; path based pairwise data clustering with application to texture segmentation; a maximum likelihood framework for grouping and segmentation; image labeling and grouping by minimizing linear functionals over cones; grouping with directed relationships; segmentations of spatio-temporal images by spatio-temporal Markov random field model; highlight and shading invariant color image segmentation using simulated annealing; edge based probabilistic relaxation for sub-pixel contour extraction; two variational models for multispectral image classification; an experimental comparison of min-cut/max-flow algorithms for energy minimization in vision; a discrete/continuous minimization method in interferometric image processing; and a transformation approach.
Multi-modality image registration and fusion are essential steps in building 3D models from remote sensing data. In this paper, we present a neural network technique for the registration and fusion of multi-modality r...