Rapid technological progress in recent years has led to a sharp rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the spoken content. In this study, the authors designed and implemented a novel text-independent multimodal speaker identification system based on wavelet analysis and neural networks. The wavelet analysis comprises the discrete wavelet transform, the wavelet packet transform, wavelet sub-band coding and Mel-frequency cepstral coefficients (MFCCs). The learning module comprises general regressive, probabilistic and radial basis function neural networks, which form decisions through a majority voting scheme. The system was found to be competitive: it improved the identification rate by 15% compared with the classical MFCC approach, and it reduced the identification time by 40% compared with the back-propagation neural network, Gaussian mixture model and principal component analysis. Performance tests conducted using the GRID database corpora have shown that this approach has a faster identification time and greater accuracy than traditional approaches, and that it is applicable to real-time, text-independent speaker identification systems.
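The majority voting step described in the abstract can be illustrated with a minimal sketch. It assumes each of the three networks outputs a single speaker label; the function name `majority_vote`, the example labels and the tie-breaking rule (prefer the earliest classifier in the list) are hypothetical choices for illustration, not details taken from the paper.

```python
from collections import Counter


def majority_vote(labels):
    """Return the label chosen by the most classifiers.

    Assumption: ties are broken in favour of the label that appears
    first in `labels` (the paper does not state its tie-breaking rule).
    """
    if not labels:
        raise ValueError("at least one classifier decision is required")
    counts = Counter(labels)
    best = max(counts.values())
    # Scan in the original order so that ties resolve deterministically.
    for label in labels:
        if counts[label] == best:
            return label


# Hypothetical decisions from the three networks for one utterance.
decisions = ["speaker_07", "speaker_12", "speaker_07"]
print(majority_vote(decisions))
```

With an odd number of classifiers, a tie can only occur when all of them disagree, which is the case the tie-breaking assumption covers.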
ISBN: 0819423564 (print)
Two prevalent approaches for video coding are hybrid motion-compensated DCT coding (MC/DCT) and subband coding. Hybrid MC/DCT coding has been adopted in present standards for low bit rate digital video compression such as ITU-T Recommendations H.261 and H.263. One problem with hybrid MC/DCT coding is that blocking artifacts in the reconstructed video sequences are prominent at low bit rates due to block segmentation of the image. Unlike block transform coding, subband coding does not suffer from these 'blocking' effects. A significant issue for the subband video coder is to fully exploit the temporal redundancy prevailing in video images for efficient video coding. More recent studies have addressed this problem using the three-dimensional (3-D) subband framework. In this study, a packet wavelet processing scheme is implemented to exploit temporal redundancy in video sequences. A bit allocation strategy is proposed and applied to the coding of the temporal subbands, which is performed in an embedded fashion. The coding performance of the resulting 3-D wavelet subband video coder is compared with the H.261 coder at a bit rate of 384 kbps and CIF resolution, and with the H.263 coder at 64 kbps and QCIF resolution. Test sequences are selected to cover a reasonable range of scene contents.
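The idea of a temporal wavelet packet decomposition can be sketched as follows. This is an illustration only: the abstract does not name the wavelet filter, so the Haar filter is assumed here, frames are represented as flat lists of pixel values, and the function names `haar_temporal_split` and `temporal_packet` are hypothetical. Spatial decomposition, bit allocation and embedded coding are not modelled.

```python
import math


def haar_temporal_split(frames):
    """Split an even-length frame sequence into low and high temporal bands.

    Each frame is a flat list of pixel values. Assumes the Haar filter,
    which the source does not specify.
    """
    if len(frames) % 2 != 0:
        raise ValueError("number of frames must be even")
    s = 1.0 / math.sqrt(2.0)
    low, high = [], []
    # Pair up consecutive frames: (0, 1), (2, 3), ...
    for a, b in zip(frames[0::2], frames[1::2]):
        low.append([(x + y) * s for x, y in zip(a, b)])   # temporal average
        high.append([(x - y) * s for x, y in zip(a, b)])  # temporal difference
    return low, high


def temporal_packet(frames, levels):
    """Full packet tree: split every band again at every level.

    Returns 2**levels bands in tree order (not sorted by frequency).
    """
    bands = [frames]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_temporal_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands


# Four identical two-pixel frames stand in for a static scene.
static_scene = [[1.0, 2.0] for _ in range(4)]
for i, band in enumerate(temporal_packet(static_scene, levels=2)):
    energy = sum(v * v for frame in band for v in frame)
    print(f"band {i}: energy = {energy:.3f}")
```

For a static scene all of the energy lands in the first (lowest) temporal band and the remaining bands are zero, which is the temporal redundancy a subband coder can exploit when allocating bits.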