In this study, we propose a new integrated computer vision system designed to track multiple people and extract their silhouettes with a pan-tilt stereo camera, so that it can assist gesture and gait recognition in the field of Human-Robot Interaction (HRI). The proposed system consists of three modules: detection, tracking and silhouette extraction. These modules are robust to camera movements, and they work interactively in near real-time. Detection is performed by camera ego-motion compensation and disparity segmentation. For tracking, we present an efficient mean-shift-based tracking method in which the tracked objects are characterized by disparity-weighted color histograms. The silhouette is obtained by two-step segmentation: a trimap is estimated in advance and then incorporated into the graph-cut framework for fine segmentation. The proposed system was evaluated against ground truth data, and it was shown to detect and track multiple people and to produce high-quality silhouettes.
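The abstract does not say how disparity enters the color histogram, so the following is only a minimal sketch under stated assumptions: each pixel votes for its hue bin with a Gaussian weight on the distance between its disparity and the target's disparity. The function names, the Gaussian weighting, and the `sigma` parameter are hypothetical. The Bhattacharyya coefficient is included because it is a common similarity measure in mean-shift tracking, not because the source confirms its use.

```python
import math


def disparity_weighted_histogram(hues, disparities, target_disparity,
                                 n_bins=8, sigma=2.0):
    """Normalized hue histogram in which each pixel is weighted by disparity.

    Assumption (not from the source): the weight is a Gaussian of the
    difference between the pixel disparity and the target disparity.
    Hues are floats in [0, 1).
    """
    hist = [0.0] * n_bins
    for hue, d in zip(hues, disparities):
        weight = math.exp(-((d - target_disparity) ** 2) / (2.0 * sigma ** 2))
        bin_index = min(int(hue * n_bins), n_bins - 1)
        hist[bin_index] += weight
    total = sum(hist)
    return [h / total for h in hist] if total > 0 else hist


def bhattacharyya(p, q):
    """Bhattacharyya coefficient between two normalized histograms."""
    return sum(math.sqrt(a * b) for a, b in zip(p, q))


# Two foreground pixels (disparity near 20) and two background pixels.
hues = [0.05, 0.05, 0.55, 0.55]
disparities = [20.0, 21.0, 5.0, 4.0]
model = disparity_weighted_histogram(hues, disparities, target_disparity=20.0)
```

In this toy input the background pixels lie far from the target disparity, so nearly all histogram mass falls in the foreground hue bin. This is the effect a disparity weighting is meant to achieve when foreground and background colors would otherwise mix.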
ISBN: 9780769538167 (print)
Automatic reconstruction of unknown 3-D objects is of great importance in computer vision, pattern recognition and visualization. In this paper, a new view planning approach for generating 3-D models automatically is proposed. The new algorithm combines the visual region of the vision system with the limit visual surface of the unknown model. The limit visual surface is modeled from knowledge of the object model acquired so far. The spatial positions of efficient viewpoints are then determined from the visual region of the system and the limit visual surface. The next best view (NBV) is the efficient viewpoint from which the largest unknown surface area can be observed. The experimental results show that the proposed method is effective in practical implementation.
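The selection rule in the last step can be sketched as follows. This is only an illustration of "pick the candidate that sees the most unknown surface". It does not reproduce the paper's limit visual surface model, and the visibility sets, patch areas, and function name are hypothetical inputs assumed to come from an earlier stage.

```python
def next_best_view(candidates, patch_area, known_patches):
    """Return the candidate viewpoint that sees the largest unknown area.

    candidates:    dict mapping viewpoint name -> set of visible patch ids
    patch_area:    dict mapping patch id -> surface area
    known_patches: set of patch ids that are already reconstructed
    """
    def unknown_area(view):
        return sum(patch_area[p] for p in candidates[view] - known_patches)

    return max(candidates, key=unknown_area)


candidates = {"A": {1, 2}, "B": {2, 3, 4}, "C": {1, 4}}
patch_area = {1: 5.0, 2: 1.0, 3: 1.5, 4: 2.0}
best = next_best_view(candidates, patch_area, known_patches={1})  # "B"
```

Once patch 1 is known, viewpoint B exposes 4.5 units of unknown area against 1.0 for A and 2.0 for C, so B is selected.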
Facial expression synthesis methods based on statistical models are robust and can easily be used in real environments. However, human facial expressions are highly varied, and how to represent and synthesize expressions that are not included in the training set remains an unresolved problem in statistical-model-based research. In this paper, we propose a two-step method. First, we propose a statistical appearance model, the facial component model (FCM), to represent faces. The model divides the face into seven components and constructs one global shape model and seven local texture models separately. The motivation for the global shape plus local texture strategy is that combining different components can generate more types of expression than the training set contains, while the global shape guarantees a "legal" result. Second, a neighbor reconstruction framework is proposed to synthesize expressions. The framework estimates the target expression vector as a linear combination of the expression vectors of neighboring subjects. This paper makes three main contributions. First, the proposed method can synthesize a wider range of expressions than the training set contains. Second, experiments demonstrate that FCM is better than the standard AAM for face representation. Third, the neighbor reconstruction framework is very flexible: it can be used in multi-sample, multi-target applications as well as single-sample, single-target applications.
This paper extends the work reported in [Shet et al., 2007] on detecting complex objects in aerial images. Such objects, e.g. surface-to-air missile launcher sites, are highly variable in appearance and can only be characterized by their functional design and surrounding context, such as the physical arrangement of access structures. Constraints on acquiring sufficient annotated data for learning make it challenging for purely data-driven approaches to generalize adequately. In this work, structure arising from functional requirements and surrounding context is encoded using predicate-logic-based grammars. Observation and model uncertainties are integrated within the bilattice framework. The paper also presents a method to automatically optimize the weights associated with logical rules. Automated learning of logical rule weights is an important aspect of applying such systems in the computer vision domain. The proposed approach casts the instantiated inference tree as a knowledge-based neural net, interprets rule uncertainties as link weights in the network, and applies a constrained back-propagation (BP) algorithm to converge upon a set of weights for optimal performance. The BP algorithm is modified to compute local gradients over the bilattice-specific inference operation and to respect constraints specific to vision applications. Both extensions have been evaluated on real and simulated data with favorable results.
We propose a novel probabilistic framework for learning visual models of 3D object categories by combining appearance information and geometric constraints. Objects are represented as a coherent ensemble of parts that are consistent under 3D viewpoint transformations. Each part is a collection of salient image features. A generative framework is used for learning a model that captures the relative position of parts within each of the discretized viewpoints. In contrast to most existing mixture-of-viewpoints models, our model establishes explicit correspondences of parts across different viewpoints of the object class. Given a new image, detection and classification are achieved by determining the position and viewpoint of the model that maximize recognition scores of the candidate objects. Our approach is among the first to propose a generative probabilistic framework for 3D object categorization. We test our algorithm on the detection task and the viewpoint classification task using the “car” category from both the Savarese et al. 2007 and PASCAL VOC 2006 datasets. We show promising results on both tasks on these two challenging datasets.
A state-of-the-art approach to measure the similarity of two images is to model each image by a continuous distribution, generally a Gaussian mixture model (GMM), and to compute a probabilistic similarity between the GMMs. One limitation of traditional measures such as the Kullback-Leibler (KL) divergence and the probability product kernel (PPK) is that they measure a global match of distributions. This paper introduces a novel image representation. We propose to approximate an image, modeled by a GMM, as a convex combination of K reference image GMMs, and then to describe the image as the K-dimensional vector of mixture weights. The computed weights encode a similarity that favors local matches (i.e. matches of individual Gaussians) and is therefore fundamentally different from the KL or PPK. Although the computation of the mixture weights is a convex optimization problem, its direct optimization is difficult. We propose two approximate optimization algorithms: the first one based on traditional sampling methods, the second one based on a variational bound approximation of the true objective function. We apply this novel representation to the image categorization problem and compare its performance to traditional kernel-based methods. We demonstrate on the PASCAL VOC 2007 dataset a consistent increase in classification accuracy.
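The abstract gives no details of the sampling-based algorithm, so the following is a sketch under stated assumptions rather than the authors' method. The expectation under the image model is replaced by an average over samples, and the convex weights are updated with the standard EM iteration for mixture weights. As a simplification, each reference image is a single 1-D Gaussian instead of a full GMM, and all function and variable names are hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def fit_reference_weights(samples, references, n_iter=200):
    """Estimate convex weights w that maximize sum_t log sum_k w_k q_k(x_t).

    samples:    points assumed to be drawn from the image model
    references: list of (mean, std) pairs, one 1-D Gaussian per reference
                image (a simplification of the reference GMMs)
    """
    k = len(references)
    weights = [1.0 / k] * k
    # Reference densities at the samples stay fixed across iterations.
    dens = [[gaussian_pdf(x, m, s) for (m, s) in references] for x in samples]
    for _ in range(n_iter):
        new_weights = [0.0] * k
        for row in dens:
            mix = sum(w * q for w, q in zip(weights, row))
            if mix <= 0.0:
                continue  # sample has no support under any reference
            for j in range(k):
                new_weights[j] += weights[j] * row[j] / mix
        total = sum(new_weights)
        if total == 0.0:
            break
        weights = [w / total for w in new_weights]
    return weights


samples = [-0.5, -0.2, 0.0, 0.1, 0.4, 4.9, 5.1]
references = [(0.0, 1.0), (5.0, 1.0), (10.0, 1.0)]
weights = fit_reference_weights(samples, references)
```

The resulting vector lies on the simplex and puts most of its mass on the first reference and the remainder on the second, because each sample is explained by the reference closest to it. This illustrates the local-match behavior described above: a reference gains weight in proportion to the part of the image it explains.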
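Finding reliable corresponding points between two images is a fundamental problem in computer vision, especially with the development of L vision frameworks. This paper introduces the manifold of correspondences and proposes a novel scheme that rejects outliers by learning this manifold. The proposed scheme is independent of the parametric models that must be estimated in published work, and it overcomes the following limitations of available methods: efficiency drops severely as the percentage of outliers and the number of estimated model parameters increase, and outlier rejection is coupled with model selection and model evaluation. Experiments on real image pairs show the excellent performance of the proposed scheme.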
In recent years, feature-based object detection has attracted increasing attention in computer vision research ***, to the best of our knowledge, no previous work has focused on utilizing local binary patterns (LBP) for vehicle detection in Intelligent Transportation Systems (ITS) *** this paper, we develop a novel traffic monitoring system based on the N-LBP algorithm, a new LBP texture descriptor *** approach includes three steps: firstly, the general critical ingredients (GCI for short) are selected from LBP features through training to indicate *** GCI are extracted from the region of interest (ROI) in the new image for object detection and *** Kalman filter is employed for feature-based tracking *** results demonstrate the superiority of the N-LBP feature over the basic LBP feature, and the performance of the new system is more stable and reliable.
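For reference, the basic LBP descriptor that serves as the baseline above can be sketched as follows. This is the standard 8-neighbor LBP only. The N-LBP variant and the GCI selection are not described in enough detail to reproduce. The bit ordering (clockwise from the top-left neighbor) is one common convention chosen here as an assumption, and the function names are hypothetical.

```python
def lbp_code(image, row, col):
    """Basic 8-neighbor LBP code for an interior pixel of a 2-D gray image."""
    center = image[row][col]
    # Neighbors in clockwise order, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # A neighbor at least as bright as the center sets its bit.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist


image = [[6, 5, 2],
         [7, 6, 1],
         [9, 8, 7]]
code = lbp_code(image, 1, 1)  # 241
```

In the 3x3 example the top-left, bottom-right, bottom, bottom-left and left neighbors are at least as bright as the center, which sets bits 0, 4, 5, 6 and 7 and gives 1 + 16 + 32 + 64 + 128 = 241. A histogram of such codes over a region of interest is the usual texture feature passed to a classifier.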
What is necessary to enhance the command of complex technical systems in order to reach the ease and naturalness of human communication? How can we endow technical devices with the necessary cognitive abilities to support humans at a high level of semantic interaction offering true flexibility by virtue of adaptivity, self-organization, and learning? These are the guiding questions of the Excellence Cluster (EC) "Cognitive Interaction Technology", established in November 2007 at Bielefeld University. It focuses the efforts of computer scientists, psychologists, linguists, physicists and biologists on the goal of establishing cognitive interfaces that facilitate the use of complex technical systems by providing a high level of semantic interaction. This combines interdisciplinary, basic, and applied research beyond the classical confines of artificial intelligence. It aims at a thorough understanding of the processes and functional constituents of cognitive interaction in order to replicate them in technical systems, including the development of evaluation methodologies and tool-kits for such systems. The research agenda of the EC is organized around four central topic areas: Motion Intelligence, Attentive Systems, Situated Communication, and Memory and Learning. In addition, the EC offers a cross-disciplinary education program.