ISBN: 081944281X (print)
The virtual studio concept replaces real background sets with computer-generated synthetic scenes. In this paper, we discuss an optical landmark-based camera tracking method for virtual studio applications. Studio cameras are zoom-lens imaging devices, so both the internal and external parameters of the camera may vary from image to image. Optical tracking, which uses pattern recognition, eliminates the need for painstaking calibration of the lens system and can be used with any camera mount, including hand-held cameras. Here we address the problem of accurately tracking the 3D motion and focus of a monocular camera in a known 3D environment and dynamically estimating the 3D location and focus of the camera. We use fully automated landmark-based camera calibration to initialize the motion estimation and employ extended Kalman filter techniques to track landmarks and to estimate the camera location and focus. Our implementation has proven efficient and robust, and the system tracks in real time at approximately 25 Hz. This paper describes several years of work at the NUDT Multimedia Laboratory to develop optical pattern recognition and tracking systems for use in virtual studios.
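As a rough illustration of the extended Kalman filter step mentioned above, the following one-dimensional toy estimates only the camera-to-landmark distance from the apparent image width of a landmark pattern. It is a sketch under stated assumptions, not the paper's filter: the focal length `f`, pattern width `width`, noise settings `q` and `r`, and the static-camera motion model are all hypothetical, and the measurements are noise-free. The paper's state also includes the full 3D pose and the focus.

```python
def ekf_depth(measurements, z0, p0, f, width, q=0.01, r=1.0):
    """Scalar EKF sketch: estimate distance z from observed image widths.

    Measurement model (assumed pinhole camera): h(z) = f * width / z.
    """
    z, p = z0, p0
    for u in measurements:
        # Predict: assumed static camera, so the state is unchanged and
        # only the variance grows by the process noise q.
        p = p + q
        # Update: linearise h(z) around the current estimate.
        h = f * width / z
        H = -f * width / (z * z)   # derivative dh/dz
        s = H * p * H + r          # innovation variance
        k = p * H / s              # Kalman gain
        z = z + k * (u - h)
        p = (1.0 - k * H) * p
    return z, p


# Hypothetical values chosen only for illustration.
f, width, true_z = 1000.0, 0.5, 5.0
measurements = [f * width / true_z] * 20   # noise-free observations
z_est, p_est = ekf_depth(measurements, z0=4.0, p0=1.0, f=f, width=width)
```

Starting from an initial guess of 4.0, the estimate moves toward the true distance of 5.0 as measurements arrive. The nonlinearity of `h` is what makes the linearisation step necessary.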
We present a scalable object tracking framework that is capable of tracking the contour of rigid and non-rigid objects in the presence of occlusion. The method adaptively divides the object contour into subcontours and employs several low-level features, such as color edge, color segmentation, motion models, motion segmentation, and shape continuity information, in a feedback loop to track each subcontour. We also introduce novel performance measures to evaluate the quality of the segmentation and tracking. The results of these measures are used in a feedback loop to adjust the weights assigned to each low-level feature for each subcontour at each frame. The framework is scalable because it can be adapted both to rough tracking of simple objects in real time and to pixel-accurate tracking of more complex objects in off-line mode. The proposed method does not depend on any single motion or shape model and does not need training. Experimental results demonstrate that the algorithm is able to track object boundaries accurately under significant occlusion and background clutter.
Because of the different characteristics of Arabic language and Romance and Anglo Saxon languages, recognition of documents written in hybrid of these languages requires that the language of the text to be identified ...
A new scheme for learning a similarity measure is proposed for content-based image retrieval (CBIR). It learns a boundary that separates the images in the database into two parts. Images on the positive side of the boundary are ranked by their Euclidean distances to the query. The scheme is called the restricted similarity measure (RSM); it not only takes into consideration the perceptual similarity between images, but also significantly improves retrieval performance over the Euclidean distance measure alone. Two techniques, support vector machines and AdaBoost, are used to learn the boundary and are compared with respect to their performance in boundary learning. The positive and negative examples used to learn the boundary are provided by the user through relevance feedback. The RSM metric is evaluated on a large database of 10,009 natural images with an accurate ground truth. Experimental results demonstrate the usefulness and effectiveness of the proposed similarity measure for image retrieval.
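The ranking rule described above (keep only images on the positive side of the learned boundary, then sort them by Euclidean distance to the query) can be sketched as follows. This is a minimal illustration, not the authors' implementation: `linear_boundary` is a hypothetical stand-in for a boundary learned by a support vector machine or AdaBoost, and the two-dimensional feature vectors are invented for the example.

```python
import math


def rsm_rank(query, database, decision):
    """Rank images on the positive side of the boundary, nearest first.

    `decision(feature)` returns a signed score; a positive score means
    the image lies on the positive side of the boundary.
    """
    positives = [i for i, feat in enumerate(database) if decision(feat) > 0]
    return sorted(positives, key=lambda i: math.dist(query, database[i]))


def linear_boundary(w, b):
    """Hypothetical stand-in for a learned boundary: score = w . x + b."""
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


# Invented 2-D feature vectors; index 1 is near the query but falls on
# the negative side of the boundary, so it is excluded from the ranking.
database = [(0.1, 0.9), (0.6, 0.4), (0.5, 0.6), (0.9, 0.9)]
decide = linear_boundary(w=(1.0, 1.0), b=-1.05)
ranking = rsm_rank(query=(0.6, 0.6), database=database, decision=decide)
```

The point of the restriction is visible in the example: plain Euclidean ranking would place image 1 ahead of image 3, whereas the boundary removes it before distances are compared.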
This paper presents a novel solution for flow-based tracking and 3D reconstruction of deforming objects in monocular image sequences. A non-rigid 3D object undergoing rotation and deformation can be effectively approximated using a linear combination of 3D basis shapes. This puts a bound on the rank of the tracking matrix. The rank constraint is used to achieve robust and precise low-level optical flow estimation without prior knowledge of the 3D shape of the object. The bound on the rank is also exploited to handle occlusion at the tracking level leading to the possibility of recovering the complete trajectories of occluded/disoccluded points. Following the same low-rank principle, the resulting flow matrix can be factored to get the 3D pose, configuration coefficients, and 3D basis shapes. The flow matrix is factored in an iterative manner, looping between solving for pose, configuration, and basis shapes. The flow-based tracking is applied to several video sequences and provides the input to the 3D non-rigid reconstruction task. Additional results on synthetic data and comparisons to ground truth complete the experiments.
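The rank bound mentioned above can be written out explicitly. The derivation below is a sketch under assumptions the abstract does not state: an orthographic (or weak-perspective) camera, image coordinates with the per-frame translation already subtracted, and the symbols $K$ (number of basis shapes), $P$ (number of points), $F$ (number of frames), $l_{t,i}$ (configuration coefficients), and $R_t$ (first two rows of the rotation) introduced here for illustration.

```latex
% Shape in frame t as a linear combination of K basis shapes S_i (each 3 x P)
S_t = \sum_{i=1}^{K} l_{t,i}\, S_i , \qquad S_i \in \mathbb{R}^{3 \times P}

% Image coordinates in frame t (2 x P), with R_t \in \mathbb{R}^{2 \times 3}
W_t = R_t S_t = \sum_{i=1}^{K} l_{t,i}\, R_t S_i

% Stacking all F frames gives the 2F x P tracking matrix
W =
\begin{bmatrix}
  l_{1,1} R_1 & \cdots & l_{1,K} R_1 \\
  \vdots      & \ddots & \vdots      \\
  l_{F,1} R_F & \cdots & l_{F,K} R_F
\end{bmatrix}
\begin{bmatrix}
  S_1 \\ \vdots \\ S_K
\end{bmatrix}
= Q\,B , \qquad Q \in \mathbb{R}^{2F \times 3K},\; B \in \mathbb{R}^{3K \times P}

% Hence the rank bound
\operatorname{rank}(W) \le 3K
```

Under these assumptions, the left factor $Q$ carries pose and configuration coefficients and the right factor $B$ carries the basis shapes, which is why factoring the matrix can recover all three.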
We define and present a new model for semantic video compression and searching, and apply this model to several video genres, with special emphasis on instructional videos. In this model, a semantic distance based on the video genre is computed between adjacent frames in a dynamic video buffer of predetermined size, and redundant "unkey" frames leak from the buffer interior into a data structure, the "leakage history directed acyclic graph" (LHDAG), which records the relative importance of these frames. What exits the buffer forms a highly compressed video stream consisting of only the most semantically significant video frames, whereas the LHDAG permits efficient semantic exploration of the video interior. This novel hierarchical but context-sensitive data structure permits searching of the video at display rates that are proportional to visual significance, at levels of semantic density selectable by the user. The data structure is simple to create and query, and appears to be more psychologically plausible than more straightforward fixed-sampling indexing schemes. We empirically display and mathematically analyze the relationships among the video buffer's leaking rate, buffer size, and exit delay, and demonstrate its performance on several extended videos. The flexibility of the method is indicated by its demonstration on two very different definitions of semantic significance: color similarity and content ("ink pixel") similarity.
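A simplified version of the leaking buffer described above might look like the sketch below. Several details are assumptions, not taken from the paper: the leak rule (drop the interior frame that is closest to one of its neighbours whenever the buffer overflows), the use of a flat list of `(leaked_frame, nearest_neighbour)` pairs in place of the LHDAG, and the omission of the exit stage, so that the frames left in the buffer at the end stand in for the compressed stream. Frames are represented by plain numbers and the distance function is invented for the example.

```python
def leaky_buffer(frames, capacity, distance):
    """Keep at most `capacity` frames; leak the most redundant interior one.

    Returns (kept_frames, leaked), where `leaked` is a list of
    (leaked_frame, neighbour_it_was_closest_to) pairs.
    Requires capacity >= 2 so that an interior frame always exists.
    """
    buffer, leaked = [], []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) <= capacity:
            continue
        best = None  # (distance, index, neighbour)
        for i in range(1, len(buffer) - 1):
            d_prev = distance(buffer[i - 1], buffer[i])
            d_next = distance(buffer[i], buffer[i + 1])
            d = min(d_prev, d_next)
            if best is None or d < best[0]:
                neighbour = buffer[i - 1] if d_prev <= d_next else buffer[i + 1]
                best = (d, i, neighbour)
        _, index, neighbour = best
        leaked.append((buffer.pop(index), neighbour))
    return buffer, leaked


# Invented scalar "frames": three clusters of similar content.
frames = [0, 1, 2, 10, 11, 12, 30]
kept, leaked = leaky_buffer(frames, capacity=3,
                            distance=lambda a, b: abs(a - b))
```

In this toy run the buffer ends up holding one frame from each cluster, while the leak records show which retained or later frame each dropped frame was most similar to.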
ISBN: 0819441937 (print)
In this paper, methods of detecting a vehicle in an image are explored. Digital images are taken from a monocular camera. Image processing techniques are applied to each single-frame picture to create the feature vector. Finally, the resulting features are used to classify whether or not there is a car in the picture using support vector machines. The results are compared to those obtained using a neural network. A discussion of techniques to enhance the feature vector and the results from both learning machines is included.
ISBN: 0780372115 (print)
Early diagnosis and removal of colonic polyps is effective in the elimination of subsequent carcinoma. This paper presents a new approach for computer-aided detection of polyps. The approach mimics the way radiologists view abdominal CT images and uses several geometric attributes obtained from many triples of mutually orthogonal planes. The histogram of the attributes obtained from a sufficiently large number of perpendicular random images serves as a robust signature to represent the shape. We combine the new 3-D pattern recognition with a support vector machine classifier, and show that the number of false positive detections in the initial polyp detection studies can be substantially reduced. One of the main contributions of this study is the thorough analysis of planar geometrical attributes. When an appropriate combination of planar attributes is used, the false positive rate is reduced by 87 percent beyond that of the initial stage detector, while maintaining a sensitivity level of 95 percent. Using such methods, radiologists should be able to view CTC data much more efficiently and accurately than without CAD.
ISBN: 3540422161 (print)
The proceedings contain 54 papers. The special focus in this conference is on Face as Biometrics and Face Image Processing. The topics include: face identification and verification via ECOC; pose-independent face identification from video sequences; face recognition using independent Gabor wavelet features; face recognition from 2D and 3D images; face recognition using support vector machines with the feature set extracted by genetic algorithms; comparative performance evaluation of gray-scale and color information for face recognition tasks; evidence on skill differences of women and men concerning face recognition; face recognition by auto-associative radial basis function network; face recognition using independent component analysis and support vector machines; a comparison of face/non-face classifiers; using mixture covariance matrices to improve face and facial expression recognitions; real-time face detection using edge-orientation matching; directional properties of colour co-occurrence features for lip location and segmentation; robust face detection using the Hausdorff distance; multiple landmark feature point mapping for robust face recognition; face detection on still images using HIT maps; lip recognition using morphological pattern spectrum; a face location algorithm robust to complex lighting conditions; automatic facial feature extraction and facial expression recognition; fusion of audio-visual information for integrated speech processing; would a speaker verification system foil him?; speaker discriminative weighting method for VQ-based speaker identification; a physiological or behavioural biometric?; an HMM-based subband processing approach to speaker identification; affine-invariant visual features contain supplementary information to enhance speech recognition; recent advances in fingerprint verification (invited); and fast and accurate fingerprint verification (extended abstract).
In the presence of false matches and moving objects, image registration is challenging, as outlier rejection, matching and registration become interdependent. In this paper, we present an efficient and robust method, 4D tensor voting, to estimate epipolar geometries for non-static scenes, and identify matching points due to salient and independent motions. Unlike other optimization techniques, data communication in 4D tensor voting does not involve any iterative search. Thus, initialization, local optimum, convergence, and dimensionality of parameter space are not problematic. Like the 8D counterpart, the only assumption we make is the pinhole camera model. Two advancements are made in this work. First, we reduce the dimensionality, and the 4D joint image space is an isotropic and orthogonal one, validating the general assumptions of tensor voting. This improvement is evidenced by the facts that only two passes are needed, and that 4D tensor voting can tolerate an even larger noise/signal ratio (up to a ratio of five). Second, instead of discarding motion pixels as outliers, we successively extract the epipolar geometries contributed by the static background and by the matching points due to salient motions. Only two frames are needed, and no simplifying assumption (such as affine camera model or homographic model between images) is made. Our 4D algorithm consists of two stages: local continuity constraint propagation to remove outliers, and global consistency checking to localize a 4D topological point cone. Results on challenging datasets are presented.