In this paper, we developed a color model to cancel the dependency between color channels, which enables us to separate spectral processing from spatial processing. We introduced an Independent Component Analysis (ICA) transformation in the wavelet domain to decorrelate the subband color joint statistics. The decorrelated joint color conditional histograms display scaling of variance. A Gaussian Scale Mixture (GSM) was used to model the subband color statistics, and a normalization scheme was adapted to cancel the pair-wise color subband statistical dependency. This color model was combined with the Portilla/Simoncelli texture model to construct the color texture model. Based on this model, features were extracted and the corresponding color texture synthesis scheme was developed.
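As a rough illustration of the pipeline sketched above (not the authors' implementation), the following fragment decomposes each color channel with a wavelet transform, applies ICA across the three color channels of each detail subband, and then applies a GSM-style divisive normalization; the wavelet choice, the FastICA call and the simple multiplier estimate are all illustrative assumptions.

```python
# Illustrative sketch, assuming numpy, PyWavelets and scikit-learn.
import numpy as np
import pywt
from sklearn.decomposition import FastICA

def subband_color_coeffs(rgb_image, wavelet="db2", level=2):
    """Stack the same (level, orientation) detail subband of each color channel
    into an (n_coeffs, 3) matrix; one matrix per subband."""
    per_channel = [pywt.wavedec2(rgb_image[..., c], wavelet, level=level)
                   for c in range(3)]
    subbands = []
    for lev in range(1, level + 1):              # detail levels
        for ori in range(3):                     # LH, HL, HH orientations
            cols = [per_channel[c][lev][ori].ravel() for c in range(3)]
            subbands.append(np.stack(cols, axis=1))
    return subbands

def decorrelate_and_normalize(coeffs, eps=1e-6):
    """ICA across the color channels of one subband, followed by a GSM-style
    divisive normalization that cancels the shared variance scaling."""
    ica = FastICA(n_components=3, random_state=0)
    s = ica.fit_transform(coeffs)                # independent color components
    # Crude estimate of the hidden GSM multiplier from the three components.
    multiplier = np.sqrt(np.mean(s ** 2, axis=1, keepdims=True) + eps)
    return s / multiplier
```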
Image-based interpolation creates smooth and photorealistic views between two viewpoints. The concept of joint view triangulation (JVT) has been proven to be an efficient multi-view representation for handling visibility issues. However, the existing JVT, built only on a regular sampling grid, often produces undesirable artifacts for artificial objects. To tackle these problems, a new edge-constrained joint view triangulation is developed in this paper to integrate contour points and artificial rectilinear objects as triangulation constraints. A super-sampling technique is also introduced to refine visible boundaries. The new algorithm is successfully demonstrated on many real image pairs.
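For readers unfamiliar with constrained triangulation, the sketch below shows one way the idea can be realized with the Python bindings to Shewchuk's Triangle: contour points are appended to the regular sampling grid and their consecutive edges are passed as segment constraints. The grid spacing and the example contour are invented for the demo, and this is not the paper's JVT code.

```python
# Illustrative sketch, assuming numpy and the `triangle` package
# (Python bindings to Shewchuk's Triangle).
import numpy as np
import triangle

def edge_constrained_triangulation(width, height, step, contour_points):
    # Regular sampling grid, as in the original JVT.
    xs, ys = np.meshgrid(np.arange(0, width + 1, step),
                         np.arange(0, height + 1, step))
    grid = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(float)

    # Append contour points and force their consecutive edges into the mesh.
    start = len(grid)
    vertices = np.vstack([grid, contour_points])
    segments = np.array([[start + i, start + i + 1]
                         for i in range(len(contour_points) - 1)])

    # 'p' = triangulate a planar straight-line graph, preserving the segments.
    return triangle.triangulate({"vertices": vertices, "segments": segments}, "p")

# Example: a short polyline standing in for a detected object edge.
mesh = edge_constrained_triangulation(
    64, 64, 16, np.array([[5.0, 5.0], [30.0, 12.0], [55.0, 20.0]]))
print(len(mesh["triangles"]), "triangles")
```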
ISBN (print): 078036466X
We present contour-based techniques for automatic object recognition that avoid the difficulties that arise as a consequence of translation, rotation and scaling. Our techniques do not require the use of any representation of shape. The first technique uses the scale-space filtered coordinate functions of the contours of the objects to be recognized, as well as what we define as the "largest diameter" of the contour. The second technique uses the Hotelling transform of the vector representations of the points of the contours. We describe and use to advantage some interesting properties of this transform that make it an important tool for image processing.
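The Hotelling transform here is the principal-component (Karhunen-Loeve) transform of the contour points. A minimal sketch of how it yields translation- and rotation-normalized contour coordinates is given below; the function is illustrative, not the authors' implementation.

```python
# Illustrative sketch, assuming numpy; contour_xy is an (N, 2) array of points.
import numpy as np

def hotelling_transform(contour_xy):
    centered = contour_xy - contour_xy.mean(axis=0)       # removes translation
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)                # ascending eigenvalues
    order = np.argsort(eigvals)[::-1]                     # major axis first
    return centered @ eigvecs[:, order]                   # removes orientation

# The "largest diameter" used by the first technique can then be read off as the
# extent of the transformed points along the major principal axis.
```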
ISBN (print): 0769506623
The analysis of human action captured in video sequences has been a topic of considerable interest in computer vision. Much of the previous work has focused on the problem of action or activity recognition, but ignored the problem of detecting action boundaries in a video sequence containing unfamiliar and arbitrary visual actions. This paper presents an approach to this problem based on detecting temporal discontinuities of the spatial pattern of image motion that captures the action. We represent frame-to-frame optical flow in terms of the coefficients of the most significant principal components computed from all the flow fields within a given video sequence. We then detect the discontinuities in the temporal trajectories of these coefficients based on three different measures. We compare our segment boundaries against those detected by human observers on the same sequences in a recent independent psychological study of human perception of visual events. We show experimental results on the two sequences that were used in this study. Our experimental results are promising, both from visual evaluation and when compared against the results of the psychological study.
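A hedged sketch of the described pipeline follows: per-frame flow fields are projected onto their leading principal components, and frames where the coefficient trajectories change abruptly are flagged as candidate action boundaries. The simple second-difference measure stands in for the three measures used in the paper.

```python
# Illustrative sketch, assuming numpy; flows is a (T, H, W, 2) array of
# frame-to-frame optical-flow fields for one video sequence.
import numpy as np

def flow_pca_coefficients(flows, n_components=10):
    X = flows.reshape(len(flows), -1)
    X = X - X.mean(axis=0)
    # SVD of the data matrix gives the principal components of the flow fields.
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    return X @ Vt[:n_components].T              # (T, n_components) trajectories

def boundary_frames(coeffs, z_thresh=2.5):
    """Flag frames where the trajectory's second difference is unusually large."""
    accel = np.linalg.norm(np.diff(coeffs, n=2, axis=0), axis=1)
    z = (accel - accel.mean()) / (accel.std() + 1e-9)
    return np.where(z > z_thresh)[0] + 1        # +1: diff(n=2) drops two frames
```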
Authors: Tieu, K.; Viola, P.
MIT Artificial Intelligence Lab, Cambridge, MA 02139, USA
ISBN (print): 0769506623
We present an approach for image retrieval using a very large number of highly selective features and efficient online learning. Our approach is predicated on the assumption that each image is generated by a sparse set of visual "causes" and that images which are visually similar share causes. We propose a mechanism for computing a very large number of highly selective features which capture some aspects of this causal structure (in our implementation there are over 45,000 highly selective features). At query time a user selects a few example images, and a technique known as "boosting" is used to learn a classification function in this feature space. By construction, the boosting procedure learns a simple classifier which relies on only 20 of the features. As a result, a very large database of images can be scanned rapidly, perhaps a million images per second. Finally, we describe a set of experiments performed using our retrieval system on a database of 3,000 images.
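The query-time loop can be pictured roughly as follows: the user's example images serve as positives, a random sample of the database serves as negatives, and boosting over depth-1 stumps selects about 20 features. The feature extraction itself (the >45,000 selective features) is abstracted into a precomputed matrix, and the scikit-learn classes are stand-ins for the paper's boosting procedure.

```python
# Illustrative sketch, assuming numpy and scikit-learn; `features` is a
# precomputed (n_images, n_features) matrix of the selective features.
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.tree import DecisionTreeClassifier

def boosted_query(features, positive_ids, n_weak=20, n_negatives=100, seed=0):
    rng = np.random.default_rng(seed)
    negatives = rng.choice(len(features), size=n_negatives, replace=False)
    negatives = np.setdiff1d(negatives, positive_ids)   # keep the sets disjoint
    X = np.vstack([features[positive_ids], features[negatives]])
    y = np.concatenate([np.ones(len(positive_ids)), np.zeros(len(negatives))])

    # Depth-1 stumps: each boosting round commits to a single feature, so the
    # final classifier consults only n_weak features per database image.
    clf = AdaBoostClassifier(estimator=DecisionTreeClassifier(max_depth=1),
                             n_estimators=n_weak, random_state=seed)
    clf.fit(X, y)

    scores = clf.decision_function(features)
    return np.argsort(scores)[::-1]                     # ranked by relevance
```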
A novel approach for estimating articulated body posture and motion from monocular video sequences is proposed. Human pose is defined as the instantaneous two-dimensional configuration (i.e., the projection onto the image plane) of a single articulated body in terms of the position of a predetermined set of joints. First, statistical segmentation of the human bodies from the background is performed and low-level visual features are found given the segmented body shape. The goal is to be able to map these, generally low-level, visual features to body configurations. The system estimates different mappings, each one associated with a specific cluster in the visual feature space. Given a set of body motion sequences for training, unsupervised clustering is obtained via the Expectation Maximization algorithm. For each of the clusters, a function is estimated to build the mapping from low-level features to 2D pose. Given new visual features, a mapping from each cluster is performed to yield a set of possible poses. From this set, the system selects the most likely pose given the learned probability distribution and the visual feature similarity between hypothesis and input. Performance of the proposed approach is characterized using real and artificially generated body postures, showing promising results.
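A compact sketch of this architecture, with linear ridge maps standing in for whatever per-cluster mapping the system actually learns, might look like the following; class and parameter names are illustrative.

```python
# Illustrative sketch, assuming numpy and scikit-learn, and assuming every
# cluster receives some training samples.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.linear_model import Ridge

class ClusteredPoseMapper:
    def __init__(self, n_clusters=8):
        self.gmm = GaussianMixture(n_components=n_clusters, covariance_type="full")
        self.maps = []

    def fit(self, features, poses):
        """features: (N, d) visual features; poses: (N, 2*J) 2D joint positions."""
        labels = self.gmm.fit_predict(features)          # EM clustering
        self.maps = [Ridge(alpha=1.0).fit(features[labels == k], poses[labels == k])
                     for k in range(self.gmm.n_components)]
        return self

    def predict(self, feature):
        """Each cluster proposes a pose; keep the one from the most likely cluster."""
        feature = feature.reshape(1, -1)
        resp = self.gmm.predict_proba(feature)[0]        # cluster posteriors
        hypotheses = [m.predict(feature)[0] for m in self.maps]
        return hypotheses[int(np.argmax(resp))], hypotheses
```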
ISBN (print): 0769506623
This paper describes a new method for tracking a human body in 3D motion by using constraints imposed on the body by the scene. An image-based approach to tracking relies exclusively on a geometrical model of the human body. Since the model usually has a large number of degrees of freedom (DOF), the chance of being corrupted by noise increases during the tracking process, and the tracking may become an ill-posed problem. To cope with this problem, we exploit the fact that a human body cannot move freely and usually receives some constraints from the scene. The new method uses constraints imposed on the position, velocity and acceleration of parts of the body by the scene. These constraints can reduce the DOF of the model. This reduction guarantees that the tracking problem is well-posed and prevents tracking errors caused by noise. Experiments with real image sequences demonstrate precise tracking of the body.
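One generic way to realize "constraints reduce the DOF", offered here only as an illustration and not as the paper's exact formulation, is to project each tracking update onto the nullspace of the constraint Jacobian:

```python
# Illustrative sketch, assuming numpy.
import numpy as np

def constrained_update(theta, delta, C):
    """theta: (n,) model parameters; delta: (n,) unconstrained update from the
    image-based tracker; C: (m, n) Jacobian of the m scene constraints."""
    # Nullspace projector P = I - C^+ C (pseudo-inverse form).
    P = np.eye(len(theta)) - np.linalg.pinv(C) @ C
    return theta + P @ delta      # only the remaining n - rank(C) DOF can move

# Example: 6 parameters with 2 constrained directions -> 4 effective DOF.
C = np.array([[1.0, 0, 0, 0, 0, 0],
              [0, 1.0, 0, 0, 0, 0]])
print(constrained_update(np.zeros(6), np.ones(6), C))   # first two stay fixed
```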
The engineering of computer vision systems that meet application-specific computational and accuracy requirements is crucial to the deployment of real-life computer vision systems. This paper illustrates how past work on a systematic engineering methodology for vision systems performance characterization can be used to develop a real-time people detection and zooming system that meets given application requirements. We illustrate that by judiciously choosing the system modules and performing a careful analysis of the influence of various tuning parameters on the system, it is possible to perform proper statistical inference, automatically set control parameters and quantify the limits of a dual-camera real-time video surveillance system. The goal of the system is to continuously provide a high-resolution zoomed-in image of a person's head at any location of the monitored area. Omni-directional camera video is processed to detect people and to precisely control a high-resolution foveal camera, which has pan, tilt and zoom capabilities. The pan and tilt parameters of the foveal camera and their uncertainties are shown to be functions of the underlying geometry, lighting conditions, background color/contrast, relative position of the person with respect to both cameras, as well as sensor noise and calibration errors. The uncertainty in the estimates is used to adaptively estimate the zoom parameter that guarantees, with a user-specified probability alpha, that the detected person's face is contained and zoomed within the image.
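The zoom-selection step can be illustrated as follows, under the assumption of a Gaussian pointing error: a chi-square quantile gives the angular radius that contains the error with probability alpha, and the widest zoom whose half field of view covers that radius plus the face is chosen. All numbers and function names below are illustrative.

```python
# Illustrative sketch, assuming numpy and scipy.
import numpy as np
from scipy.stats import chi2

def required_half_fov(sigma_pan_deg, sigma_tilt_deg, face_half_angle_deg, alpha=0.95):
    """Smallest half field of view (degrees) containing the face with prob. alpha."""
    sigma = max(sigma_pan_deg, sigma_tilt_deg)           # conservative isotropic bound
    error_radius = sigma * np.sqrt(chi2.ppf(alpha, df=2))
    return error_radius + face_half_angle_deg

def zoom_setting(half_fov_deg, max_half_fov_deg=24.0):
    """Zoom factor relative to the widest setting of the foveal camera."""
    return max_half_fov_deg / max(half_fov_deg, 1e-3)

half_fov = required_half_fov(sigma_pan_deg=1.5, sigma_tilt_deg=1.0,
                             face_half_angle_deg=2.0, alpha=0.95)
print(f"half FOV {half_fov:.1f} deg -> zoom x{zoom_setting(half_fov):.1f}")
```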
ISBN (print): 0769506623
We use cluster analysis as a unifying principle for problems from low, middle and high level vision. The clustering problem is viewed as graph partitioning, where nodes represent data elements and the weights of the edges represent pairwise similarities. Our algorithm generates samples of cuts in this graph by using David Karger's contraction algorithm, and computes an "average" cut which provides the basis for our solution to the clustering problem. The stochastic nature of our method makes it robust against noise, including accidental edges and small spurious clusters. The complexity of our algorithm is very low: O(N log² N) for N objects and a fixed accuracy level. Without additional computational cost, our algorithm provides a hierarchy of nested partitions. We demonstrate the superiority of our method for image segmentation on a few real color images. Our second application includes the concatenation of edges in a cluttered scene (perceptual grouping), where we show that the same clustering algorithm achieves as good a grouping as, if not better than, more specialized methods.
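A simplified sketch of the sampling idea (not the paper's exact "average cut" computation): Karger's random contraction is run many times on the weighted similarity graph, the frequency with which each pair of nodes is separated is recorded, and nodes that are rarely separated are grouped together. A connected similarity graph is assumed, and the 0.5 threshold is an illustrative stand-in.

```python
# Illustrative sketch, plain Python standard library only.
import random
from collections import defaultdict

def karger_cut(nodes, edges):
    """edges: list of (u, v, weight). Returns a 2-way partition as {node: side}."""
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    groups = len(nodes)
    while groups > 2:
        # Pick an edge with probability proportional to its weight, then contract it.
        u, v, _ = random.choices(edges, weights=[w for _, _, w in edges])[0]
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            groups -= 1
    return {n: find(n) for n in nodes}

def average_cut_clusters(nodes, edges, n_samples=200):
    sep = defaultdict(int)
    for _ in range(n_samples):
        side = karger_cut(nodes, edges)
        for i, u in enumerate(nodes):
            for v in nodes[i + 1:]:
                sep[(u, v)] += side[u] != side[v]
    # Link nodes that were separated in fewer than half of the sampled cuts.
    parent = {n: n for n in nodes}
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for (u, v), count in sep.items():
        if count < n_samples / 2:
            parent[find(u)] = find(v)
    clusters = defaultdict(list)
    for n in nodes:
        clusters[find(n)].append(n)
    return list(clusters.values())
```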
ISBN (print): 0769506623
Combining learning with vision techniques in interactive image retrieval has been an active research topic during the past few years. However, existing learning techniques either are based on heuristics or fail to analyze their working conditions. Furthermore, there is almost no in-depth study of how to effectively learn from the users when there are multiple visual features in the retrieval system. To address these limitations, in this paper we present a rigorous optimization formulation of the learning process and solve the problem in a principled way. By using Lagrange multipliers, we have derived explicit solutions, which are both optimal and fast to compute. Extensive comparisons against state-of-the-art techniques have been performed. Experiments were carried out on a large heterogeneous image collection consisting of 17,000 images. Retrieval performance was tested under a wide range of conditions. Various evaluation criteria, including precision-recall curves and a rank measure, have demonstrated the effectiveness and robustness of the proposed technique.
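As a hedged illustration of the kind of closed-form solution a Lagrange-multiplier formulation yields (a textbook instance, not necessarily the paper's exact objective): minimizing sum_j u_j f_j, where f_j is the total distance of the positive examples to the query under feature j, subject to sum_j 1/u_j = 1, gives u_j = (sum_k sqrt(f_k)) / sqrt(f_j), so features on which the positives agree closely receive large weights.

```python
# Illustrative sketch, assuming numpy.
import numpy as np

def optimal_feature_weights(distances):
    """distances: (n_positive, n_features) distance of each positive example to
    the query, one column per visual feature."""
    f = distances.sum(axis=0) + 1e-12            # per-feature total distance
    return np.sqrt(f).sum() / np.sqrt(f)         # closed-form Lagrange solution

def combined_score(query_dists, weights):
    """Rank a database image by the weighted sum of its per-feature distances."""
    return float(np.dot(weights, query_dists))

weights = optimal_feature_weights(np.array([[0.2, 1.0, 0.5],
                                            [0.3, 1.2, 0.4]]))
print(weights)   # the feature with the smallest total distance gets the largest weight
```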