ISBN: (print) 9783319105994; 9783319105987
Facial landmark detection has long been impeded by the problems of occlusion and pose variation. Instead of treating the detection task as a single and independent problem, we investigate the possibility of improving detection robustness through multi-task learning. Specifically, we wish to optimize facial landmark detection together with heterogeneous but subtly correlated tasks, e.g. head pose estimation and facial attribute inference. This is non-trivial since different tasks have different learning difficulties and convergence rates. To address this problem, we formulate a novel tasks-constrained deep model, with task-wise early stopping to facilitate learning convergence. Extensive evaluations show that the proposed task-constrained learning (i) outperforms existing methods, especially in dealing with faces with severe occlusion and pose variation, and (ii) reduces model complexity drastically compared to the state-of-the-art method based on a cascaded deep model [21].
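The idea of task-wise early stopping can be illustrated with a minimal sketch. The abstract does not give the stopping criterion, so the patience rule below (an auxiliary task is dropped once its validation loss has failed to improve for a set number of checks) is an assumption, and the names `TaskMonitor` and `combined_loss` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaskMonitor:
    """Tracks one auxiliary task and switches it off when it stops improving.

    Assumption: a simple patience rule stands in for the paper's criterion.
    """
    name: str
    patience: int = 3
    best: float = float("inf")
    stale: int = 0
    active: bool = True

    def update(self, val_loss: float) -> bool:
        """Record a validation loss; return True while the task stays active."""
        if not self.active:
            return False
        if val_loss < self.best:
            self.best = val_loss
            self.stale = 0
        else:
            self.stale += 1
            if self.stale >= self.patience:
                self.active = False
        return self.active


def combined_loss(main_loss, aux_losses, monitors):
    """Main-task loss plus the losses of auxiliary tasks that are still active."""
    return main_loss + sum(
        loss for name, loss in aux_losses.items() if monitors[name].active
    )


# Usage: the (hypothetical) pose task stalls and is removed from the objective.
monitors = {"pose": TaskMonitor("pose", patience=2),
            "gender": TaskMonitor("gender", patience=2)}
for loss in (1.0, 0.9, 0.95, 0.97):
    monitors["pose"].update(loss)
total = combined_loss(1.0, {"pose": 0.5, "gender": 0.25}, monitors)
```

Once a task is stopped, its term no longer contributes to the objective, so the remaining training effort goes to the main task and to the auxiliary tasks that are still improving.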
From the perspective of actual production, the paper presents the advances in the application of image processing to fruit grading from several aspects, such as processing precision and processing speed of image processing...
ISBN: (print) 9783319117522; 9783319117515
Motions of organs or extremities are important features for clinical diagnosis. However, tracking and segmentation of complex, quickly changing motion patterns is challenging, especially in the presence of occlusions. Neither state-of-the-art tracking nor motion segmentation approaches are able to deal with such cases. Thus far, motion capture systems or the like have been needed, which are complicated to handle and which interfere with the movements. We propose a solution based on a single video camera that is not only far less intrusive, but also a lot cheaper. The limitations of tracking and motion segmentation are overcome by a new approach that integrates prior knowledge in the form of weak labeling into motion segmentation. Using the example of Cerebral Palsy detection, we segment motion patterns of infants into the different body parts by analyzing body movements. Our experimental results show that our approach outperforms current motion segmentation and tracking approaches.
ISBN: (print) 9783642543401
The proceedings contain 58 papers. The special focus in this conference is on Computer and Computing Technologies in Agriculture. The topics include: maize seed embryo and position inspection based on image processing; greenhouse irrigation optimization decision support system; agricultural field environment high-quality image remote acquisition; study on consultative agricultural knowledge service system; stochastic simulation and application of monthly rainfall and evaporation; elimination method study of ambiguous words in Chinese automatic indexing; analysis and evaluation of soil fertility status based on weighted K-means clustering algorithm; the classification of pavement crack image based on beamlet algorithm; research on the construction and implementation of soil fertility knowledge based on ontology; research on 3G terminal-based agricultural information service; study on the application of information technologies on suitability evaluation analysis in agriculture; study on the way of production, life and thinking of farmers in mobile internet era; research and design of peanut diseases diagnosis and prevention expert system; semantic-based reasoning for vegetable supply chain knowledge retrieval; research on agricultural products cold-chain logistics of mobile services application; three-dimensional reconstruction and characteristics computation of corn ears based on machine vision; effects of water and nutrition on photoassimilates partitioning coefficient variation; application of a logical reasoning approach based Petri net in agriculture expert system; research on text mining based on domain ontology; and research on the vegetable trade current situation and its trade competitiveness in China.
In pig production, food conversion ratio and profit can be evaluated by real-time detection of pig live weight. Traditional pig weight detection usually requires direct contact with pigs and is limited by its low ...
Based on the Momel algorithm, a set of acoustic parameters was analyzed automatically on Chinese emotional speech. Global prosodic features were calculated at the sentence level, which showed concordance with the usual pattern reported in the literature. Local constraints were also considered at the syllable level. An ANOVA showed that there were interaction effects among emotions, syllable positions and syllable tones on certain parameters. Furthermore, by examining the pitch movements, no significant difference was found between neutral speech and active emotional speech, which differs from the pattern observed in non-tonal languages. However, when the tonal influence was reduced by using utterances composed of only tone 1 syllables, this inverse effect disappeared. Hence we posit the interpretation that, due to the existence of lexical tone in Mandarin Chinese, the paralinguistic use of pitch movements is reduced.
ISBN: (print) 9783319105840; 9783319105833
This paper proposes a novel Affine Subspace Representation (ASR) descriptor to deal with affine distortions induced by viewpoint changes. Unlike traditional local descriptors such as SIFT, ASR inherently encodes local information of multi-view patches, making it robust to affine distortions while maintaining a high discriminative ability. To this end, PCA is used to represent affine-warped patches as PCA-patch vectors for its compactness and efficiency. Then, according to the subspace assumption, which implies that the PCA-patch vectors of various affine-warped patches of the same keypoint can be represented by a low-dimensional linear subspace, the ASR descriptor is obtained by using a simple subspace-to-point mapping. Such a linear subspace representation can accurately capture the underlying information of a keypoint (local structure) under multiple views without sacrificing its distinctiveness. To accelerate the computation of the ASR descriptor, a fast approximate algorithm is proposed by moving the most computationally expensive part (i.e., warping patches under various affine transformations) to an offline training stage. Experimental results show that ASR is not only better than the state-of-the-art descriptors under various image transformations, but also performs well without a dedicated affine invariant detector when dealing with viewpoint changes.
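A subspace-to-point mapping can be sketched as follows. The abstract does not spell out the mapping, so this example assumes one common choice: represent the subspace by its orthogonal projection matrix and flatten the upper triangle, scaling the diagonal by 1/sqrt(2) so that Euclidean distance between points tracks the Frobenius distance between projections. The PCA step is omitted, the input rows are hypothetical PCA-patch vectors taken to span the subspace, and the function names are illustrative.

```python
import math


def orthonormal_basis(vectors, eps=1e-12):
    """Modified Gram-Schmidt: orthonormal basis for the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            dot = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - dot * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > eps:  # skip vectors that are linearly dependent
            basis.append([wi / norm for wi in w])
    return basis


def subspace_to_point(vectors):
    """Map the subspace spanned by `vectors` to a single descriptor vector.

    Assumption: the point is the upper triangle of the projection matrix
    Q = sum(b b^T), with diagonal entries divided by sqrt(2).
    """
    basis = orthonormal_basis(vectors)
    n = len(vectors[0])
    q = [[sum(b[i] * b[j] for b in basis) for j in range(n)] for i in range(n)]
    point = []
    for i in range(n):
        for j in range(i, n):
            point.append(q[i][j] / math.sqrt(2.0) if i == j else q[i][j])
    return point


# Two different spanning sets of the same plane map to the same point.
p1 = subspace_to_point([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
p2 = subspace_to_point([[1.0, 1.0, 0.0], [0.0, 1.0, 0.0]])
```

Because the projection matrix depends only on the subspace and not on the basis chosen for it, the resulting point is a well-defined descriptor that can be compared with ordinary Euclidean distance.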
Three-dimensional shape descriptors of corn ears are important traits in corn breeding, genetic and genomics research; however, it is difficult to accurately and consistently measure 3D features of corn ears by hand or...
ISBN: (print) 9783319093307; 9783319093291
Image segmentation is of great importance in the fields of computer vision, face recognition, medical imaging, digital libraries, and video retrieval. This paper presents a novel method for image segmentation based on a hybrid particle swarm algorithm, which combines the advantages of swarm intelligence and the natural selection mechanism of the artificial bee colony algorithm. Experimental results show that the proposed method can achieve higher-quality segmentation, reduce CPU processing time, and prevent particles from becoming trapped in local minima.
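One way such a hybrid could look is sketched below. The abstract gives no algorithmic detail, so everything here is an assumption: segmentation is reduced to choosing a single gray-level threshold, fitness is Otsu's between-class variance, and the artificial bee colony contribution is modeled as a scout step that re-initializes a particle that has stopped improving. Function names and parameter values are hypothetical.

```python
import random


def between_class_variance(hist, t):
    """Otsu criterion for splitting gray levels into [0..t] and [t+1..]."""
    total = sum(hist)
    w0 = sum(hist[: t + 1])
    w1 = total - w0
    if w0 == 0 or w1 == 0:
        return 0.0
    m0 = sum(i * hist[i] for i in range(t + 1)) / w0
    m1 = sum(i * hist[i] for i in range(t + 1, len(hist))) / w1
    return (w0 / total) * (w1 / total) * (m0 - m1) ** 2


def hybrid_pso_threshold(hist, n_particles=8, iters=30, limit=5, seed=0):
    """PSO over one threshold, with an ABC-style scout reset (assumed design)."""
    rng = random.Random(seed)
    hi = len(hist) - 1

    def fit(x):
        return between_class_variance(hist, int(round(x)))

    # Evenly spaced start keeps the sketch reproducible.
    pos = [hi * (i + 0.5) / n_particles for i in range(n_particles)]
    vel = [0.0] * n_particles
    pbest = pos[:]
    pbest_fit = [fit(x) for x in pos]
    stale = [0] * n_particles
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g], pbest_fit[g]

    for _ in range(iters):
        for i in range(n_particles):
            vel[i] = (0.7 * vel[i]
                      + 1.5 * rng.random() * (pbest[i] - pos[i])
                      + 1.5 * rng.random() * (gbest - pos[i]))
            pos[i] = min(max(pos[i] + vel[i], 0.0), float(hi))
            f = fit(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i], stale[i] = pos[i], f, 0
            else:
                stale[i] += 1
            if f > gbest_fit:
                gbest, gbest_fit = pos[i], f
            if stale[i] > limit:
                # Scout step: abandon a stagnant position and explore anew.
                pos[i], vel[i], stale[i] = rng.uniform(0, hi), 0.0, 0
    return int(round(gbest))


# Usage on a synthetic bimodal histogram with modes at gray levels 50 and 200.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
threshold = hybrid_pso_threshold(hist)
```

The scout reset is what gives stagnant particles a chance to leave a local optimum; the `limit` parameter controls how long a particle may go without improvement before it is re-initialized.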
This paper introduces hierarchical stress generation for expressive speech synthesis. In a previous study, we proposed a novel hierarchical Mandarin stress modeling method, and the text-based stress prediction experiments demonstrated that a reliable stress assignment can be obtained from textual features. However, the stress model should be further verified to be an effective and efficient prosody model in a Text-to-Speech system. In this work, the Fujisaki model, known as an ideal global representation of prosody, is adopted to construct the pitch contours. To illustrate the effect of the stress model, the Fujisaki model parameters are automatically predicted from the textual features with and without stress information. The synthetic speech with stress modeling sounds more natural than that without. The RMSE of the pitch contour and the feature importance analysis also show that stress information can improve the pitch modeling. This work offers a promising method for accurate pitch modeling for Mandarin expressive speech synthesis.
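For readers unfamiliar with the Fujisaki model, the sketch below generates a pitch contour from phrase and accent commands and computes the RMSE between two contours. It follows the standard textbook form of the model (log F0 as a baseline plus phrase and accent components); the constants alpha = 3.0, beta = 20.0 and gamma = 0.9 are commonly quoted defaults, and the command values in the usage lines are made up for illustration, not taken from the paper.

```python
import math


def phrase_response(t, alpha=3.0):
    """Impulse response of the phrase control mechanism."""
    return alpha * alpha * t * math.exp(-alpha * t) if t >= 0 else 0.0


def accent_response(t, beta=20.0, gamma=0.9):
    """Step response of the accent control mechanism, capped at gamma."""
    if t < 0:
        return 0.0
    return min(1.0 - (1.0 + beta * t) * math.exp(-beta * t), gamma)


def fujisaki_f0(t, fb, phrases, accents, alpha=3.0, beta=20.0, gamma=0.9):
    """F0 in Hz at time t (seconds).

    fb: baseline frequency in Hz.
    phrases: list of (magnitude, onset_time) phrase commands.
    accents: list of (amplitude, onset_time, offset_time) accent commands.
    """
    log_f0 = math.log(fb)
    for ap, t0 in phrases:
        log_f0 += ap * phrase_response(t - t0, alpha)
    for aa, t1, t2 in accents:
        log_f0 += aa * (accent_response(t - t1, beta, gamma)
                        - accent_response(t - t2, beta, gamma))
    return math.exp(log_f0)


def rmse(a, b):
    """Root-mean-square error between two equally long contours."""
    if len(a) != len(b):
        raise ValueError("contours must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


# Usage: compare a contour with and without one (hypothetical) accent command.
times = [i * 0.01 for i in range(100)]
with_accent = [fujisaki_f0(t, 100.0, [(0.5, 0.0)], [(0.4, 0.2, 0.5)]) for t in times]
without_accent = [fujisaki_f0(t, 100.0, [(0.5, 0.0)], []) for t in times]
error = rmse(with_accent, without_accent)
```

In an evaluation like the one described above, the two contours passed to `rmse` would be the predicted and the reference pitch contours rather than two synthetic ones.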