The local space-time feature is an effective way to represent video data and achieves state-of-the-art performance in action recognition. However, in the majority of cases, it only captures the static or dynamic cues of t...
ISBN: 9781467369657 (print)
One key challenge of facial trait recognition is the large non-rigid appearance variation caused by irrelevant real-world factors, such as viewpoint and expression changes. In this paper, we explore how shape information, i.e. facial landmark positions, can be explicitly incorporated into the popular Convolutional Neural Network (CNN) architecture to disentangle such irrelevant non-rigid appearance variations. First, instead of using fixed kernels, we propose a kernel adaptation method that dynamically determines the convolutional kernels according to the spatial distribution of facial landmarks, which helps learn more robust features. Second, motivated by the intuition that different local facial regions may demand different adaptation functions, we further propose a tree-structured convolutional architecture that hierarchically fuses multiple local adaptive CNN subnetworks. Comprehensive experiments on the WebFace, Morph II and MultiPIE databases validate the effectiveness of the proposed kernel adaptation method and tree-structured convolutional architecture for facial trait recognition tasks, including identity, age and gender recognition. On all tasks, the proposed architecture consistently achieves state-of-the-art performance.
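The idea of deriving convolutional kernels from landmark positions can be illustrated with a small sketch. The abstract does not specify the adaptation function, so the linear mapping below, the function names `adapt_kernel` and `conv2d_valid`, and all numeric values are assumptions chosen purely for illustration, not the authors' method.

```python
def adapt_kernel(landmarks, weights, bias):
    """Map flattened landmark coordinates to a k x k kernel.

    Hypothetical adaptation function: a single linear layer.
    weights has k*k rows, each with len(landmarks) entries; bias has k*k entries.
    """
    flat = [sum(w * x for w, x in zip(row, landmarks)) + b
            for row, b in zip(weights, bias)]
    k = int(round(len(flat) ** 0.5))
    return [flat[r * k:(r + 1) * k] for r in range(k)]


def conv2d_valid(image, kernel):
    """Plain 'valid' cross-correlation, as used in CNN layers (no kernel flip)."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_w)]
            for i in range(out_h)]


# Toy input: one landmark (x, y) and a 3 x 3 image patch.
landmarks = [1.0, 2.0]
weights = [[1, 0], [0, 1], [0, 0], [1, 1]]  # illustrative values only
bias = [0, 0, 1, 0]

kernel = adapt_kernel(landmarks, weights, bias)   # kernel depends on the landmarks
image = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
features = conv2d_valid(image, kernel)
```

Because the kernel is recomputed from the landmarks of each face, two faces with different landmark layouts are filtered with different kernels, which is the property the abstract attributes to kernel adaptation.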
To classify objects in natural images, a model combining color constancy and a principal component analysis network (PCANet) is proposed. The new color constancy model imitates the functional properties of the HV...
ISBN: 9781510812055 (print)
In this paper, we propose a novel ordinal regression method called minimum class variance support vector ordinal regression (MCVSVOR). MCVSVOR is derived from the minimum class variance support vector machine (MCVSVM), a variant of SVM, and so inherits the latter's characteristics, such as taking the distribution of the categories into consideration and good generalization performance. Experimental results validate the effectiveness of MCVSVOR and indicate its superior generalization performance over SVOR.
In this work, a kernel principal component analysis network (KPCANet) is proposed for classifying facial expressions in unconstrained images. It comprises only the very basic data processing components: ca...
Sparse Representation based Classification (SRC) and its potential in object tracking have been explored in recent years. However, the trade-off between the discriminative ability of the overly emphasized sparse representation and the lack of insight into the correlation of visual information has raised questions about the general applicability of such methods in object tracking. In addition, the need to optimize a series of ℓ1-regularized least-squares problems increases the computational complexity, thereby limiting their use in real-time applications. In this paper, a novel approach to robust object tracking is proposed. First, the variations in the appearance of the tracked target are modelled using PCA basis vectors, and an ℓ2-regularized least-squares method is used to solve the proposed representation model. To improve the robustness of feature representation in object tracking applications, weights are associated with multiple trackers, each formulated using a different feature, and adapted via an online learning scheme. Finally, a decision fusion criterion is imposed to generate an optimized output through the weighted combination of the different tracking results. Experiments on challenging video sequences demonstrate the superior accuracy and robustness of the proposed method in comparison to thirteen other state-of-the-art baselines.
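One reason an ℓ2-regularized model is cheaper than an ℓ1-regularized one is that it has a closed-form solution. The sketch below is a minimal illustration under stated assumptions: the exact representation model is not given in the abstract, so it simply solves min_c ||y - Uc||² + λ||c||² for an orthonormal PCA basis U, where the solution (UᵀU + λI)⁻¹Uᵀy reduces to Uᵀy / (1 + λ). The names `ridge_coefficients` and `reconstruction_error` and the toy data are hypothetical.

```python
def ridge_coefficients(basis, y, lam):
    """Closed-form l2-regularized least squares for an orthonormal basis.

    basis: list of basis vectors (assumed orthonormal, as PCA bases are).
    y:     observed candidate patch, flattened to a list of floats.
    lam:   regularization strength (hypothetical parameter name).
    """
    return [sum(u_i * y_i for u_i, y_i in zip(u, y)) / (1.0 + lam)
            for u in basis]


def reconstruction_error(basis, y, coeffs):
    """Squared error between y and its reconstruction from the basis."""
    recon = [0.0] * len(y)
    for u, c in zip(basis, coeffs):
        for i, u_i in enumerate(u):
            recon[i] += c * u_i
    return sum((a - b) ** 2 for a, b in zip(y, recon))


# Toy example: two orthonormal basis vectors in a 3-dimensional feature space.
basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
candidate = [2.0, 4.0, 1.0]

coeffs = ridge_coefficients(basis, candidate, lam=1.0)
error = reconstruction_error(basis, candidate, coeffs)
```

In a tracker, a low reconstruction error would indicate that a candidate patch is well explained by the learned appearance basis. No iterative solver is needed, in contrast to the ℓ1 case.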
To obtain a robust supervised model with good generalization ability, a traditional supervised learning method has to be trained with sufficient well-labeled and uniformly distributed samples. However, in many ...
Segmenting the prostate from CT images is a critical step in the radiotherapy planning for prostate cancer. The segmentation accuracy could largely affect the efficacy of radiation treatment. However, due to the touch...
ISBN: 9781479974351 (print)
This paper applies a deep learning method, based on a deep belief network (DBN) structure, to the extraction of emotions from Chinese speech. Eight features, such as pitch and mel-frequency cepstral coefficients (MFCC), are extracted from Mandarin speech and used as network inputs, and a DBN classifier is used instead of traditional shallow learning methods to recognize emotions. Experimental studies show that its recognition rate is higher than that of the traditional back propagation (BP) method and the support vector machine (SVM) classifier.
ISBN: 9781467383264 (print)
Central nervous system dysfunction in infants may be manifested through inconsistent, rigid and abnormal limb movements. Detecting and quantifying these movements in infants from videos is hence desirable for providing useful information to clinicians. This could lead to computer-aided diagnosis of dysfunctions where early treatment may improve infant development. In this paper, we propose a scheme for detecting and quantifying qualitative aspects of limb movement through multiple tracking and state space motion modeling on videos. The main novelties of the paper include: (a) an enhanced detection method for effectively detecting small, weak marker points in video; (b) Bayesian estimation and nearest neighbor searching for selecting new observations in each individual tracker and for tracking marker trajectories on limbs; (c) a criterion for anomaly detection based on the frequency and duration of abrupt changes in limb movement, using window-averaged prominent residual powers. The proposed method has been tested on videos of neonates; the results show that it is promising for tracking and quantifying the movement of neonate limbs to aid medical diagnostics.
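Criterion (c) can be sketched roughly as follows. The abstract does not define how "prominent" residuals are selected or how the thresholds are set, so this simplified version averages the squared tracking residual over a sliding window and reports the number and length of episodes above a threshold. The function names, window size and threshold are assumptions for illustration only.

```python
def windowed_power(residuals, window):
    """Mean squared residual over each full sliding window of the given length."""
    return [sum(r * r for r in residuals[end - window:end]) / window
            for end in range(window, len(residuals) + 1)]


def abrupt_episodes(powers, threshold):
    """Return (start_index, length) for every run where power exceeds threshold."""
    episodes = []
    start = None
    for i, p in enumerate(powers):
        if p > threshold:
            if start is None:
                start = i          # a new episode begins
        elif start is not None:
            episodes.append((start, i - start))
            start = None
    if start is not None:          # episode still open at the end of the signal
        episodes.append((start, len(powers) - start))
    return episodes


# Toy residual sequence (difference between predicted and observed marker position).
residuals = [0, 0, 3, 3, 0, 0, 0, 4, 0, 0]
powers = windowed_power(residuals, window=2)
episodes = abrupt_episodes(powers, threshold=4.0)

frequency = len(episodes)                               # how often abrupt changes occur
total_duration = sum(length for _, length in episodes)  # how long they last, in windows
```

The two summary values, frequency and total duration, correspond to the quantities the abstract names as the basis of its anomaly criterion. How they are combined into a decision is not stated in the source.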