Positioning a robot with respect to objects by using data provided by a camera is a well-known technique called visual servoing. In order to perform a task, the object must exhibit visual features which can be extracted from different points of view. Visual servoing is therefore object-dependent, since it relies on the object's appearance. Consequently, the positioning task cannot be performed in the presence of nontextured objects or objects for which extracting visual features is too complex or too costly. This paper proposes a solution to tackle this limitation inherent to current visual servoing techniques. Our proposal is based on the coded structured light approach as a reliable and fast way to solve the correspondence problem. A coded light pattern is projected onto the object, providing robust visual features independently of the object's appearance.
In this paper, we present a general guideline to establish the relation between a distribution model and its corresponding similarity estimation. A rich set of distance metrics, such as harmonic distance and geometric distance, is derived according to Maximum Likelihood theory. These metrics can provide a more accurate feature model than the conventional Euclidean distance (SSD) and Manhattan distance (SAD). Because the feature elements come from heterogeneous sources and may influence similarity estimation differently, the assumption of a single isotropic distribution model is often inappropriate. We propose a novel boosted distance metric that not only finds the distance metric that best fits the distribution of the underlying elements but also selects the most important feature elements with respect to similarity. We experiment with different distance metrics for similarity estimation and compute the accuracy of different methods in two applications: stereo matching and motion tracking in video sequences. The boosted distance metric is tested on fifteen benchmark data sets from the UCI repository and two image retrieval applications. In all the experiments, robust results are obtained based on the proposed methods.
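The link between a distribution model and a distance metric can be illustrated with the two conventional metrics the abstract names. The sketch below is not the paper's method: it only shows the textbook facts that SSD is, up to scale and an additive constant, the negative log-likelihood under independent Gaussian noise, and that SAD plays the same role under Laplacian noise. The function names, the sample vectors, and the noise parameters `sigma` and `b` are assumptions chosen for illustration; the harmonic, geometric, and boosted metrics are not reproduced here.

```python
import math


def ssd(x, y):
    """Sum of squared differences (squared Euclidean distance)."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sad(x, y):
    """Sum of absolute differences (Manhattan distance)."""
    return sum(abs(a - b) for a, b in zip(x, y))


def gaussian_nll(x, y, sigma=1.0):
    """Negative log-likelihood of the residuals x - y under i.i.d.
    zero-mean Gaussian noise; sigma is an assumed noise scale."""
    n = len(x)
    return ssd(x, y) / (2 * sigma ** 2) + n * math.log(sigma * math.sqrt(2 * math.pi))


def laplacian_nll(x, y, b=1.0):
    """Negative log-likelihood of the residuals x - y under i.i.d.
    zero-mean Laplacian noise; b is an assumed noise scale."""
    n = len(x)
    return sad(x, y) / b + n * math.log(2 * b)


# Hypothetical feature vectors: one candidate has a single large outlier,
# the other has moderate error spread over every element.
query = [1.0, 2.0, 3.0]
outlier_candidate = [1.0, 2.0, 6.0]
spread_candidate = [2.0, 3.5, 4.5]

# Pick the best match under each noise model.
candidates = [outlier_candidate, spread_candidate]
best_gaussian = min(candidates, key=lambda c: gaussian_nll(query, c))
best_laplacian = min(candidates, key=lambda c: laplacian_nll(query, c))
```

With these vectors the two noise models disagree: the Gaussian model (SSD) prefers the candidate with spread error, while the Laplacian model (SAD) prefers the candidate with one outlier. This is the sense in which the choice of metric encodes an assumption about the underlying distribution.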
We present a 3-level hierarchical model for localizing human bodies in still images from arbitrary viewpoints. We first fit a simple tree-structured model defined on a small landmark set along the body contours by Dynamic Programming (DP). The output is a series of proposal maps that encode the probabilities of partial body configurations. Next, we fit a mixture of view-dependent models by Sequential Monte Carlo (SMC), which handles self-occlusion, anthropometric constraints, and large viewpoint changes. DP and SMC are designed to search in opposite directions such that the DP proposals are utilized effectively to initialize and guide the SMC inference. This hybrid strategy of combining deterministic and stochastic search ensures both the robustness and efficiency of DP, and the accuracy of SMC. Finally, we fit an expanded mixture model with increased landmark density through local optimization. The model hierarchy is trained on a large number of gait images. Extensive tests on cluttered images with varying poses including walking, dancing and various types of sports activities demonstrate the feasibility of the proposed approach.
In scenes containing specular objects, the image motion observed by a moving camera may be an intermixed combination of optical flow resulting from diffuse reflectance (diffuse flow) and specular reflection (specular flow). Here, with few assumptions, we formalize the notion of specular flow, show how it relates to the 3D structure of the world, and develop an algorithm for estimating scene structure from 2D image motion. Unlike previous work on isolated specular highlights, we use two image frames and estimate the semi-dense flow arising from the specular reflections of textured scenes. We parametrically model the image motion of a quadratic surface patch viewed from a moving camera. The flow is modeled as a probabilistic mixture of diffuse and specular components, and the 3D shape is recovered using an Expectation-Maximization algorithm. Rather than treating specular reflections as noise to be removed or ignored, we show that the specular flow provides additional constraints on scene geometry that improve estimation of 3D structure when compared with reconstruction from diffuse flow alone. We demonstrate this for a set of synthetic and real sequences of mixed specular-diffuse objects.
In multi-target tracking, maintaining the correct identity of each target is challenging. In the presented tracking method, accurate target identification is achieved by incorporating the appearance information of the spatial and temporal context of each target. The spatial context of a target involves the local background and nearby targets. The first contribution of the paper is a new discriminative model for multi-target tracking with embedded classification of each target against its context. As a result, the tracker not only searches for the image region similar to the target but also avoids latching onto nearby targets or a background region. The temporal context of a target includes its appearances seen earlier during tracking. These past appearances are used to train a probabilistic PCA that serves as the measurement model of the target at the present time. As the second contribution, we develop a new incremental scheme for probabilistic PCA. It accurately updates the full set of parameters, including a noise parameter that is still ignored in the related literature. The experiments show robust tracking performance under severe clutter, occlusions, and pose changes.
We present a new method for training deformable models. Assume that we have training images where part locations have been labeled. Typically, one fits a model by maximizing the likelihood of the part labels. Alternatively, one could fit a model such that, when the model is run on the training images, it finds the parts. We do this by maximizing the conditional likelihood of the training data. We formulate model-learning as parameter estimation in a conditional random field (CRF). Initializing parameters with their maximum likelihood estimates, we reach the global optimum by gradient ascent. We present a learning algorithm that searches exhaustively over all part locations in an image without relying on feature detectors. This provides millions of examples of training data, and seems to avoid the over-fitting issues known with CRFs. Results for part localization are relatively scarce in the community. We present results on three established datasets: Caltech motorbikes [8], USC people [19], and Weizmann horses [3]. On the Caltech set we significantly outperform the state of the art [6]. For the challenging people dataset, we present results that are comparable to [19], but are obtained using a significantly more generic model (devoid of a face or skin detector). Our model is general enough to find other articulated objects; we use it to recover poses of horses in the challenging Weizmann database.
In this paper we present a novel embedded platform, dedicated especially to the surveillance of remote locations under harsh environmental conditions, featuring various video and audio compression algorithms as well as support for local systems and devices. The presented solution follows a radically decentralized approach and is able to act as an autonomous video server. Using up to three Texas Instruments™ TMS320C6414 DSPs, it is possible to run high-level computer vision algorithms in real time in order to extract from the video stream the information that is relevant to the surveillance task. The focus of this paper is on the task of vehicle detection and tracking in images. In particular, we discuss the issues specific to embedded systems, and we describe how they influenced our work. We give a detailed description of several algorithms and justify their use in our implementation. The power of our approach is shown on two real-world applications, namely vehicle detection on highways and license plate detection on urban traffic videos.
A long-text-input keystroke biometric system was developed for applications such as identifying perpetrators of inappropriate e-mail or fraudulent Internet activity. A Java applet collected raw keystroke data over the Internet, appropriate long-text-input features were extracted, and a pattern classifier made identification decisions. Experiments were conducted on a total of 118 subjects using two input modes (copy and free-text input) and two keyboard types (desktop and laptop keyboards). Results indicate that the keystroke biometric can accurately identify an individual who sends inappropriate e-mail (free text) if sufficient enrollment samples are available and if the same type of keyboard is used to produce the enrollment and questioned samples. For laptop keyboards we obtained 99.5% accuracy on 36 users, which decreased to 97.9% on a larger population of 47 users. For desktop keyboards we obtained 98.3% accuracy on 36 users, which decreased to 93.3% on a larger population of 93 users. Accuracy decreases significantly when subjects use different keyboard types or different input modes for enrollment and testing.
We present a novel feature-based non-rigid image registration algorithm using a small number of automatically extracted points and their associated local salient region features. Our automatic registration is a hybrid approach co-optimizing point-based and image-based terms. Motivated by the paradigm of the TPS-RPM algorithm [6], we develop the RHDM (Robust Hybrid Deformable Matching) algorithm by alternately optimizing correspondences and transformations for registration. The local salient region features and the geometric features, together with the softassign and deterministic annealing techniques, are used for solving correspondences. Thin-plate splines are used for generating a smooth non-rigid spatial transformation. Our algorithm is built to be extremely robust to feature extraction errors. A new dynamic outlier rejection mechanism is described for rejecting outliers and generating accurate spatial mappings. A local refinement technique is used for correcting inexactly matched correspondences arising from image noise and irregular deformations. In contrast with the TPS-RPM algorithm, which can handle outliers in only one point set, our algorithm is able to handle a considerable number of outliers in both point sets. The experimental results demonstrate the robustness and accuracy of our algorithm.
Interest strength assignment to image points is important for selecting good features. Strength assignments using spatial information aim to detect interest points that are repeatable across different image/illumination transformations, and have been widely adopted in many interest point detectors. Recently, strength assignment schemes using discriminant information have received attention, and studies have shown the superiority of discriminant strength. In this paper, we introduce a strength assignment scheme integrating spatial and discriminant information, with the motivation that strong spatial information can help improve the robustness of the discriminant strength estimation, e.g., in an undersampled training scenario. Our integrated strength uses a new discriminant strength assignment, the so-called locality oriented Fisher criterion score. The integrated strength leads to new methods for feature selection and weighted linear dimensionality reduction. Experimental results in two case studies (embryo developmental stage classification and face recognition) show the favorable performance of the proposed methods.
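As background for the discriminant strength mentioned above, the following is a minimal sketch of the classical two-class Fisher criterion score for a single feature: the squared difference of class means divided by the sum of class variances. It is not the locality oriented variant proposed in the paper, and it does not include the spatial term. The function name and the sample values are assumptions made for illustration.

```python
def fisher_score(values_a, values_b):
    """Classical two-class Fisher criterion for one scalar feature:
    (mean_a - mean_b)^2 / (var_a + var_b), using population variances."""
    mean_a = sum(values_a) / len(values_a)
    mean_b = sum(values_b) / len(values_b)
    var_a = sum((v - mean_a) ** 2 for v in values_a) / len(values_a)
    var_b = sum((v - mean_b) ** 2 for v in values_b) / len(values_b)
    return (mean_a - mean_b) ** 2 / (var_a + var_b)


# Hypothetical measurements of two features over two classes.
# Feature 0 separates the classes well; feature 1 barely does.
feature0_score = fisher_score([1.0, 2.0, 3.0], [7.0, 8.0, 9.0])
feature1_score = fisher_score([1.0, 5.0, 9.0], [2.0, 6.0, 10.0])

# Rank features by score, highest (most discriminative) first.
ranking = sorted([(feature0_score, 0), (feature1_score, 1)], reverse=True)
```

A higher score marks a feature whose class means are far apart relative to the within-class spread, which is why such scores are commonly used to rank or weight features.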