Search Results - Inner Mongolia University Library

Speaker identification using multimodal neural networks and wavelet analysis

IET Biometrics, 2015, vol. 4, no. 1, pp. 18-28

Authors: Almaadeed, Noor; Aggoun, Amar; Amira, Abbes. Affiliations: Brunel Univ, Dept Comp Engn, Uxbridge UB8 3PH, Middx, England; Qatar Univ, Coll Engn, Dept Comp Sci & Engn, Doha, Qatar; Univ Bedfordshire, Dept Comp Sci & Technol, Luton LU1 3JU, Beds, England; Univ West Scotland, Dept Engn & Comp Sci, Paisley, Renfrew, Scotland

Rapid technological progress in recent years has led to a tremendous rise in the use of biometric authentication systems. The objective of this research is to investigate the problem of identifying a speaker from their voice regardless of the spoken content. In this study, the authors designed and implemented a novel text-independent multimodal speaker identification system based on wavelet analysis and neural networks. The wavelet analysis comprises discrete wavelet transform, wavelet packet transform, wavelet sub-band coding and Mel-frequency cepstral coefficients (MFCCs). The learning module comprises general regressive, probabilistic and radial basis function neural networks, which form decisions through a majority voting scheme. The system was found to be competitive: it improved the identification rate by 15% compared with the classical MFCC, and it reduced the identification time by 40% compared with the back-propagation neural network, Gaussian mixture model and principal component analysis. Performance tests conducted using the GRID database corpora have shown that this approach has faster identification time and greater accuracy than traditional approaches, and that it is applicable to real-time, text-independent speaker identification systems.

Keywords: Gaussian processes; audio databases; backpropagation; biometrics (access control); cepstral analysis; discrete wavelet transforms; mixture models; principal component analysis; radial basis function networks; speaker recognition; text analysis; GRID database corpora; Gaussian mixture model; MFCC; Mel-frequency cepstral coefficients; back-propagation neural network; biometric authentication systems; discrete wavelet transform; general regressive neural networks; learning module; majority voting scheme; multimodal neural networks; probabilistic neural networks; radial basis function neural networks; text-independent multimodal speaker identification system; wavelet analysis; wavelet packet transform; wavelet subband coding; Biometric Identification; Mel frequency cepstral coefficient; wavelet packets; principal components analysis; back propagation; probabilistic neural network; text processing
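The abstract above says the three neural networks form decisions through a majority voting scheme but gives no further detail. The following is a minimal sketch of one way such a vote could be combined, not the authors' implementation. The function name `majority_vote`, the speaker labels, and the tie-breaking rule (favour the earliest classifier in the list) are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label chosen by the most classifiers.

    `predictions` holds one predicted label per classifier.
    Tie-breaking is an assumption: the label from the earliest
    classifier among the tied labels wins.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    counts = Counter(predictions)
    top = max(counts.values())
    # Walk the predictions in order so ties resolve deterministically.
    for label in predictions:
        if counts[label] == top:
            return label


# Hypothetical outputs of three classifiers for one utterance.
votes = ["speaker_07", "speaker_07", "speaker_12"]
print(majority_vote(votes))
```

With three voters, a label wins outright whenever at least two classifiers agree; only a three-way disagreement falls back to the tie-breaking rule.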


Performance analysis of 3-D subband coding for low bit rate video


Conference on Digital Compression Technologies and Systems for Video Communications

Authors: Mainguy, A; Wang, LM. Affiliation: Commun Res Ctr, Ottawa, ON K2H 8S2, Canada

ISBN: (print) 0819423564

Two prevalent approaches for video coding are hybrid motion compensated DCT coding (MC/DCT) and subband coding. Hybrid MC/DCT coding has been adopted in present standards for low bit rate digital video compression such as ITU-T Recommendations H.261 and H.263. One problem with hybrid MC/DCT coding is that blocking artifacts in the reconstructed video sequences are prominent at low bit rates due to block segmentation of the image. Unlike block transform coding, subband coding does not suffer from these 'blocking' effects. A significant issue for the subband video coder is to fully exploit the temporal redundancy prevailing in video images for efficient video coding. More recent studies have addressed this problem using the three-dimensional (3-D) subband framework. In this study, a packet wavelet processing scheme is implemented to exploit temporal redundancy in video sequences. A bit allocation strategy is proposed and applied to the coding of the temporal subbands performed in an embedded fashion. The coding performance of the resulting 3-D wavelet subband video coder is compared with the H.261 coder at a bit rate of 384 kbps and CIF resolution, and with the H.263 coder at 64 kbps and QCIF resolution. Test sequences are selected to cover a reasonable range of scene contents.

Keywords: video compression wavelet subband coding low bit rate video coding
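The abstract above describes exploiting temporal redundancy by decomposing a video sequence into temporal subbands. It does not name the filters used, so the sketch below assumes a one-level Haar filter pair purely for illustration. The function names `temporal_haar` and `inverse_temporal_haar` and the tiny two-pixel frames are hypothetical.

```python
import math

_S = 1.0 / math.sqrt(2.0)  # orthonormal Haar scaling factor


def temporal_haar(frames):
    """Split consecutive frame pairs into temporal low and high subbands.

    Each frame is a flat list of pixel values. The Haar filter is an
    assumption; the source does not specify its filter bank.
    """
    if len(frames) % 2 != 0:
        raise ValueError("an even number of frames is required")
    low, high = [], []
    for a, b in zip(frames[0::2], frames[1::2]):
        low.append([(x + y) * _S for x, y in zip(a, b)])   # temporal average
        high.append([(x - y) * _S for x, y in zip(a, b)])  # temporal difference
    return low, high


def inverse_temporal_haar(low, high):
    """Rebuild the original frame sequence from the two subbands."""
    frames = []
    for l, h in zip(low, high):
        frames.append([(x + y) * _S for x, y in zip(l, h)])
        frames.append([(x - y) * _S for x, y in zip(l, h)])
    return frames


# Hypothetical 2-pixel frames: the first pair is static, the second changes.
frames = [[1.0, 2.0], [1.0, 2.0], [3.0, 4.0], [5.0, 4.0]]
low, high = temporal_haar(frames)
print(high[0])  # all zeros: a static pair carries no temporal detail
```

For a static pair of frames the high subband is zero everywhere, which is the redundancy a bit allocation strategy can exploit by spending few or no bits there.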

