We propose a generative modeling framework - namely, Dynamic Tree Structured Belief Networks (DTSBNs) and a novel Structured Variational Approximation (SVA) inference algorithm for DTSBNs - as a viable solution to obj...
This paper presents a method for learning decision-theoretic models of facial expressions and gestures from video data. We consider that the meaning of a facial display or gesture to an observer is contained in its relationship to context, actions, and outcomes. An agent wishing to capitalize on these relationships must distinguish facial displays and gestures according to how they help the agent to maximize utility. This paper demonstrates how an agent can learn relationships between unlabeled observations of a person's face and gestures, the context, and its own actions and utility function. The agent needs no prior knowledge about the number or the structure of the gestures and facial displays that are valuable to distinguish. The agent discovers classes of human non-verbal behaviors, as well as which of them are important for choosing actions that optimize over the utility of possible outcomes. This value-directed model learning allows an agent to focus resources on recognizing only those behaviors that are useful to distinguish. We show results in a simple gestural robotic control problem and in a simple card game played by two human players.
In this paper, we introduce the notion of a programmable imaging system. Such an imaging system provides a human user or a vision system significant control over the radiometric and geometric characteristics of the system. This flexibility is achieved using a programmable array of micro-mirrors. The orientations of the mirrors of the array can be controlled with high precision over space and time. This enables the system to select and modulate rays from the light field based on the needs of the application at hand. We have implemented a programmable imaging system that uses a digital micro-mirror device (DMD), which is used in digital light processing. Although the mirrors of this device can only be positioned in one of two states, we show that our system can be used to implement a wide variety of imaging functions, including high dynamic range imaging, feature detection, and object recognition. We conclude with a discussion of how a micro-mirror array can be used to efficiently control field of view without the use of moving parts.
Speech reading, also known as lip reading, is aimed at extracting visual cues of lip and facial movements to aid in recognition of speech. The main hurdle for speech reading is that visual measurements of lip and facial motion lack information-rich features like the Mel frequency cepstral coefficients (MFCCs) widely used in acoustic speech recognition. MFCCs are used with hidden Markov models (HMMs) in most speech recognition systems at present. Speech reading could greatly benefit from automatic selection and formation of informative features from measurements in the visual domain. These new features can then be used with HMMs to capture the dynamics of lip movement and, eventually, to recognize lip shapes. Towards this end, we use AdaBoost methods for automatic visual feature formation. Specifically, we design an asymmetric variant of the AdaBoost.M2 algorithm to deal with the ill-posed multi-class sample distribution inherent in our problem. Our experiments show that the boosted HMM approach outperforms conventional AdaBoost and HMM classifiers. Our primary contributions are in the design of (a) boosted HMM and (b) asymmetric multi-class boosting.
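As a rough illustration of the boosting idea that the abstract above builds on, the following is a minimal sketch of standard binary AdaBoost with one-dimensional threshold stumps. It is not the asymmetric AdaBoost.M2 variant or the boosted HMM described in the abstract, whose details are not given here; the toy data and all function names are hypothetical.

```python
import math

def stump_predict(x, threshold, polarity):
    """Weak classifier: returns +polarity when x >= threshold, else -polarity."""
    return polarity if x >= threshold else -polarity

def train_stump(xs, ys, weights):
    """Exhaustively pick the stump with the lowest weighted error."""
    best = None
    for threshold in sorted(set(xs)):
        for polarity in (1, -1):
            err = sum(w for x, y, w in zip(xs, ys, weights)
                      if stump_predict(x, threshold, polarity) != y)
            if best is None or err < best[0]:
                best = (err, threshold, polarity)
    return best

def adaboost(xs, ys, rounds):
    """Standard binary AdaBoost; labels in ys must be +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        err, threshold, polarity = train_stump(xs, ys, weights)
        err = min(max(err, 1e-10), 1 - 1e-10)  # guard against log(0)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, threshold, polarity))
        # Raise the weight of misclassified samples, lower the rest.
        weights = [w * math.exp(-alpha * y * stump_predict(x, threshold, polarity))
                   for x, y, w in zip(xs, ys, weights)]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all weak classifiers."""
    score = sum(alpha * stump_predict(x, threshold, polarity)
                for alpha, threshold, polarity in ensemble)
    return 1 if score >= 0 else -1

# Hypothetical toy data: no single threshold separates the two classes.
xs = [1, 2, 3, 4, 5, 6]
ys = [1, 1, -1, -1, 1, 1]
ensemble = adaboost(xs, ys, rounds=3)
predictions = [predict(ensemble, x) for x in xs]
```

On this toy set no single stump fits the labels, but the weighted vote of three stumps does, which is the property that boosted feature formation relies on.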
This paper describes the development of a hand-held environment discovery tool for the blind. The final device will be composed of a laser-based range sensor and of an onboard processor. As the user swings the hand-he...
In this paper, we propose a new method, video repairing, to robustly infer missing static background and moving foreground due to severe damage or occlusion from a video. To recover background pixels, we extend the image repairing method, where layer segmentation and homography blending are used to preserve temporal coherence and avoid flickering. By exploiting the constraint imposed by periodic motion and a subclass of camera and object motions, we adopt a two-phase approach to repair moving foreground pixels. In the sampling phase, motion data are sampled and regularized by 3D tensor voting to maintain temporal coherence and motion periodicity. In the alignment phase, missing moving foreground pixels are inferred by spatial and temporal alignment of the sampled motion data at multiple scales. We tested our system on several difficult examples in which the camera is either stationary or in motion.
In this paper we present a novel method for performing robust illumination-tolerant face recognition. We show that this method works well even when presented with partial test faces that are also captured under variable illumination, and that it outperforms other competing face recognition algorithms. Our method is a hybrid PCA-correlation filter that links the best of two major approaches in face recognition: Principal Component Analysis (PCA) for capturing the variability in a set of training images, and advanced correlation filters, which have attractive features such as illumination tolerance, shift-invariance, and the ability to handle occlusions. We examine how these filters work and why our proposed method is able to perform better. We call our method 'Corefaces' as it seeks to model the 'core' face representation that remains relatively invariant to illumination variations. We show comparative results using the illumination subset of the CMU-PIE database, consisting of 65 people, and the Yale-B illumination database, and compare with other standard methods such as the illumination subspace method and Fisherfaces.
In this paper, we present a method for face recognition using boosted Gabor feature based classifiers. Weak classifiers are constructed based on both magnitude and phase features derived from Gabor filters [Quadrature...
In this paper, we propose an efficient method that estimates the motion parameters of a human head from a video sequence by using a three-layer linear iterative process. In the innermost layer, we estimate the motion of each input face image in a video sequence based on a generic face model and a small set of feature points. A fast iterative least-squares method is used to recover these motion parameters. In the middle layer, we then iteratively estimate three model scaling factors using multiple frames with the recovered poses. Finally, in the outermost layer, we update the 3D coordinates of the feature points on the generic face model. Since all iterative processes can be solved linearly, the computational cost is low. Tests on synthetic data under noisy conditions and two real video sequences have been performed. Experimental results show that the proposed method is robust and has good performance.
A new algorithm for the segmentation of objects from 3D images using deformable models is presented. This algorithm relies on learned shape and appearance models for the objects of interest. The main innovation over s...