Keyword spotting in continuous speech is a challenging problem with relevance to applications such as audio indexing and music retrieval. In this work, the problem of keyword spotting is addressed by exploiting the complementary information present in the spectral and prosodic features of the speech signal. A thorough analysis of this complementary information is performed on a large Hindi-language database developed for this purpose. Phonetic and prosodic distribution analyses are performed toward this end, using canonical correlation and the Student's t-distance function. Motivated by these analyses, novel methods for spectral and prosodic information fusion that optimize a combined error function are proposed. The fusion methods are developed at both the feature and the model level. Improved syllable sequence prediction and keyword spotting performance are obtained with these methods compared to conventional keyword spotting methods. Additionally, to enable comparison with state-of-the-art deep-learning-based methods, a novel method for improved syllable sequence prediction using deep denoising autoencoders is proposed. The performance of the proposed methods is evaluated for keyword spotting using a syllable sliding protocol over a large Hindi database. Reasonable performance improvements are observed in the experimental results on syllable sequence prediction, keyword spotting, and audio retrieval.
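The canonical-correlation analysis mentioned in the abstract measures how strongly two feature sets (here, spectral and prosodic) co-vary. A minimal numpy sketch of that step is below; the feature matrices and dimensions are illustrative assumptions, not the paper's actual pipeline.

```python
import numpy as np

def canonical_correlations(X, Y, eps=1e-8):
    """Canonical correlations between feature sets X (n x p) and Y (n x q).

    X could hold per-frame spectral features and Y prosodic features;
    the returned values (in [0, 1]) quantify their shared information.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Regularized covariance and cross-covariance estimates.
    Cxx = X.T @ X / (n - 1) + eps * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + eps * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)
    # Whiten each set; the singular values of the whitened
    # cross-covariance are the canonical correlations.
    Wx = np.linalg.inv(np.linalg.cholesky(Cxx))
    Wy = np.linalg.inv(np.linalg.cholesky(Cyy))
    return np.linalg.svd(Wx @ Cxy @ Wy.T, compute_uv=False)
```

When Y is (approximately) a linear transform of X, the leading canonical correlation approaches 1, indicating largely redundant rather than complementary features.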
Background noise is a critical issue for hearing aid users; a common solution to this problem is speech enhancement (SE). Recently, a novel SE approach based on deep learning, called the deep denoising autoencoder (DDAE), has been proposed. Previous studies show that the DDAE SE approach provides superior noise suppression and produces less distortion in the processed speech than classical SE approaches. Motivated by these results, we propose a multi-objective learning-based DDAE (M-DDAE) SE approach in this study, and evaluate its speech quality and intelligibility improvements using seven typical hearing-loss audiograms. Our objective evaluations show that the M-DDAE approach achieves significantly better results than the DDAE approach under most test conditions. The proposed M-DDAE SE approach can therefore potentially be used to further improve the listening performance of hearing aid devices in noisy conditions.
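The core DDAE idea is a network trained to map noisy spectral frames to their clean counterparts. The toy numpy sketch below trains a single-hidden-layer denoising autoencoder on synthetic data; real DDAEs stack several layers and operate on log-power spectra, and all sizes and data here are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "clean" spectral frames and noisy observations of them.
clean = rng.normal(size=(256, 16))
noisy = clean + 0.3 * rng.normal(size=clean.shape)

d, h = 16, 32                                   # frame dim, hidden dim
W1 = rng.normal(scale=0.1, size=(d, h)); b1 = np.zeros(h)
W2 = rng.normal(scale=0.1, size=(h, d)); b2 = np.zeros(d)

def forward(x):
    z = np.tanh(x @ W1 + b1)        # encoder
    return z, z @ W2 + b2           # decoder (linear output)

losses, lr = [], 0.05
for _ in range(200):
    z, out = forward(noisy)
    err = out - clean               # error against the CLEAN target
    losses.append(float(np.mean(err ** 2)))
    # Backpropagation by hand for the two layers.
    gW2 = z.T @ err / len(noisy); gb2 = err.mean(axis=0)
    dz = (err @ W2.T) * (1 - z ** 2)            # through tanh
    gW1 = noisy.T @ dz / len(noisy); gb1 = dz.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1
```

The M-DDAE of the paper extends this single reconstruction loss to a weighted sum over several learning targets; the sketch keeps only the basic denoising objective.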
ISBN: (Print) 9781538646595
Reverberation, generally caused by sound reflections from walls, ceilings, and floors, can severely degrade the performance of acoustic applications. Owing to a complicated combination of attenuation and time-delay effects, the reverberation property is difficult to characterize, and it remains a challenging task to effectively recover anechoic speech signals from reverberant ones. In the present study, we propose a novel integrated deep and ensemble learning algorithm (IDEA) for speech dereverberation. The IDEA consists of offline and online phases. In the offline phase, we train multiple dereverberation models, each aiming to precisely dereverberate speech signals in a particular acoustic environment; a unified fusion function is then estimated to integrate the information from the multiple dereverberation models. In the online phase, an input utterance is first processed by each of the dereverberation models, and the outputs of all models are integrated accordingly to generate the final anechoic signal. We evaluated the IDEA in designed acoustic environments, including both matched and mismatched conditions between the training and testing data. Experimental results confirm that the proposed IDEA outperforms a single deep-neural-network-based dereverberation model with the same model architecture and training data.
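The online phase described above (every per-environment model processes the input, then a fusion function combines the outputs) can be sketched as follows. The "models" here are trivial stand-in gains and the fusion weights are fixed by hand; in the IDEA both are DNN-based and learned offline.

```python
import numpy as np

# Stand-ins for per-environment dereverberation models: each maps a
# (frames x bins) spectrogram to a dereverberated estimate.
def make_model(gain):
    return lambda frames: frames * gain

models = [make_model(g) for g in (0.6, 0.9, 1.2)]

# Fusion function: here a simple convex combination of model outputs,
# with weights that the offline phase would estimate from data.
fusion_weights = np.array([0.2, 0.5, 0.3])

def dereverb(frames):
    # Online phase: run every model, then fuse all outputs into the
    # final anechoic estimate.
    outputs = np.stack([m(frames) for m in models])   # (K, T, F)
    return np.tensordot(fusion_weights, outputs, axes=1)
```

The fusion step is what lets the ensemble cover mismatched test conditions: no single model must match the unseen room, only the weighted combination.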
ISBN: (Print) 9781479999880
This paper compares unsupervised sequence training techniques for deep neural network (DNN) acoustic models for broadcast transcription. Recent progress in the digital archiving of broadcast content has made it easier to access large amounts of speech data. Such archived data are helpful for acoustic/language modeling in live-broadcast captioning based on automatic speech recognition (ASR). In Japanese broadcasts, however, archived programs, e.g., sports news, do not always have the closed captions typically used as references. Thus, unsupervised adaptation techniques are needed for performance improvement even when a DNN is used as the acoustic model. In this paper, we compare three unsupervised sequence adaptation techniques: maximum a posteriori (MAP), entropy minimization, and Bayes risk minimization. Experimental results for transcribing sports news programs show that the best ASR performance is achieved by Bayes risk minimization, which reflects information about expected errors, while comparable results are obtained with MAP, the simplest of the unsupervised sequence adaptation techniques.
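Of the three criteria compared, entropy minimization is the simplest to state: adapt the model so that its state posteriors on unlabeled audio become sharper. A small numpy sketch of the quantity being minimized, with made-up posterior matrices for illustration:

```python
import numpy as np

def frame_entropy(posteriors, eps=1e-12):
    """Average per-frame entropy of DNN state posteriors (frames x states).

    Entropy-minimization adaptation updates the model parameters to
    drive this quantity down on unlabeled data, i.e. toward confident
    (low-entropy) predictions; MAP and Bayes risk minimization instead
    use prior-anchored and expected-error-weighted objectives.
    """
    p = np.clip(posteriors, eps, 1.0)
    return float(-(p * np.log(p)).sum(axis=1).mean())

# Confident vs. maximally uncertain posteriors over 3 states.
sharp = np.array([[0.98, 0.01, 0.01],
                  [0.01, 0.98, 0.01]])
flat = np.full((2, 3), 1.0 / 3.0)
```

The uniform posterior attains the maximum entropy log(3), while confident posteriors score much lower, which is what makes the criterion usable without reference transcripts.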