ISBN: 9781424442966 (print)
This paper proposes a new generative model that handles rotational variations in data by extending Separable Lattice 2-D HMMs (SL2D-HMMs). In image recognition, geometric variations such as size, location, and rotation degrade performance, so normalization is required. SL2D-HMMs can perform elastic matching in both the horizontal and vertical directions; this makes it possible to model invariance to size and location. To deal with rotational variations, we introduce additional HMM states that represent shifts of the state alignments of the observation lines in a particular direction. Face recognition experiments show that the proposed method significantly improves performance on data with rotational variations.
ISBN: 9781424442966 (print)
This paper describes factor analyzed voice models for realizing various voice characteristics in HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of the PCA underlying the eigenvoice method is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models that represent the training data accurately. To overcome this problem, we propose a general speech model that directly generates speech utterances with various voice characteristics. In the proposed method, the HMM states, the factors representing voice characteristics, and the contextual decision trees are optimized simultaneously within a unified framework.
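The eigenvoice baseline mentioned in this abstract can be illustrated with a minimal sketch. It assumes each speaker-dependent HMM set has already been flattened into a fixed-length "supervector" of mean parameters; that representation, the function names, and the use of NumPy are assumptions made for illustration and are not taken from the paper. The sketch shows only the PCA-and-interpolate idea that the abstract contrasts against, not the proposed factor analyzed voice model.

```python
import numpy as np

def eigenvoices(supervectors, n_components):
    """PCA over speaker supervectors (hypothetical helper).

    supervectors: array of shape (n_speakers, dim), one row per
    speaker-dependent HMM set flattened into a vector (an assumption).
    Returns the mean supervector and the top principal directions.
    """
    X = np.asarray(supervectors, dtype=float)
    mean = X.mean(axis=0)
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, vt[:n_components]

def project(mean, basis, supervector):
    """Weights of one speaker's supervector in the eigenvoice basis."""
    return (np.asarray(supervector, dtype=float) - mean) @ basis.T

def synthesize_supervector(mean, basis, weights):
    """Build a new voice as the mean plus a weighted sum of eigenvoices."""
    return mean + np.asarray(weights, dtype=float) @ basis
```

Under these assumptions, averaging the weights of two speakers and passing the result to `synthesize_supervector` yields a voice halfway between them in the eigenvoice space. Note that the basis is chosen to reconstruct the supervectors well, which is exactly the objective the abstract says differs from modeling the training data itself.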
ISBN: 0780389123 (print)
In this paper, we propose a method for reconstructing human posture from insufficient input posture data, based on a human posture probability density constructed from long-term human motion capture data. Because long, continuous daily human motion data is high-dimensional and very large, the posture data must be compressed effectively. Long-term posture data is nonlinearly distributed in the posture space, since specific postures such as standing and sitting have different properties. The posture data is therefore allocated to several subspaces and compressed within each subspace using a Mixture of Probabilistic Principal Component Analyzers (MPPCA). MPPCA is improved by replacing the conventional EM algorithm with the deterministic annealing EM algorithm (DAEM) to avoid sensitivity to initial parameters. The posture probability density is constructed over those subspaces. An adequate human posture can be reconstructed from the insufficient data by introducing the posture probability density into the Sequential Monte Carlo framework. The experimental results show that robust human posture estimation can be realized, since this method does not estimate a single posture but instead estimates the proper posterior posture density using prior knowledge of posture.
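To make the idea of reconstructing a posture from insufficient data more concrete, the following is a simplified sketch under stated assumptions. It uses a single probabilistic PCA component rather than the full mixture, and it fills in the unobserved coordinates with the Gaussian conditional mean instead of running Sequential Monte Carlo, so it is not the method of the paper. The parameter names (`W`, `sigma2`, `mean`) and the function names are hypothetical.

```python
import numpy as np

def ppca_covariance(W, sigma2):
    """Covariance implied by one PPCA component: W W^T + sigma2 * I."""
    W = np.asarray(W, dtype=float)
    return W @ W.T + sigma2 * np.eye(W.shape[0])

def reconstruct_missing(x_obs, obs_idx, mean, W, sigma2):
    """Fill in unobserved posture coordinates with their conditional mean.

    x_obs:   observed coordinate values, in the order given by obs_idx
    obs_idx: indices of the observed coordinates
    mean, W, sigma2: parameters of a single PPCA component (assumed known)
    """
    mean = np.asarray(mean, dtype=float)
    x_obs = np.asarray(x_obs, dtype=float)
    obs_idx = np.asarray(obs_idx)
    mis_idx = np.setdiff1d(np.arange(mean.size), obs_idx)

    C = ppca_covariance(W, sigma2)
    C_oo = C[np.ix_(obs_idx, obs_idx)]  # observed-observed block
    C_mo = C[np.ix_(mis_idx, obs_idx)]  # missing-observed block

    # E[x_m | x_o] = mean_m + C_mo C_oo^{-1} (x_o - mean_o)
    delta = x_obs - mean[obs_idx]
    x_full = mean.copy()
    x_full[obs_idx] = x_obs
    x_full[mis_idx] = mean[mis_idx] + C_mo @ np.linalg.solve(C_oo, delta)
    return x_full
```

This returns only a point estimate. The abstract emphasizes estimating a posterior density over postures rather than a unique posture, which in the Gaussian case would also require the conditional covariance and, for a mixture, per-component weights.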
ISBN: 9781424442959 (print)
This paper describes factor analyzed voice models for realizing various voice characteristics in HMM-based speech synthesis. The eigenvoice method can synthesize speech with arbitrary voice characteristics by interpolating representative HMM sets. However, the objective of the PCA underlying the eigenvoice method is to accurately reconstruct each speaker-dependent HMM set, and this is not equivalent to estimating models that represent the training data accurately. To overcome this problem, we propose a general speech model that directly generates speech utterances with various voice characteristics. In the proposed method, the HMM states, the factors representing voice characteristics, and the contextual decision trees are optimized simultaneously within a unified framework.
ISBN: 9781424442959 (print)
This paper proposes a new generative model that handles rotational variations in data by extending Separable Lattice 2-D HMMs (SL2D-HMMs). In image recognition, geometric variations such as size, location, and rotation degrade performance, so normalization is required. SL2D-HMMs can perform elastic matching in both the horizontal and vertical directions; this makes it possible to model invariance to size and location. To deal with rotational variations, we introduce additional HMM states that represent shifts of the state alignments of the observation lines in a particular direction. Face recognition experiments show that the proposed method significantly improves performance on data with rotational variations.