Automatic Speech Recognition System with Output-Gate Projected Gated Recurrent Unit

Authors: Cheng, Gaofeng; Zhang, Pengyuan; Xu, Ji

Author affiliations: Univ Chinese Acad Sci, Sch Elect Elect & Commun Engn, Beijing, Peoples R China; Univ Chinese Acad Sci, Beijing, Peoples R China; Chinese Acad Sci, Inst Acoust, Key Lab Speech Acoust & Content Understanding, Beijing, Peoples R China

Publication: IEICE TRANSACTIONS ON INFORMATION AND SYSTEMS

Year, volume, issue: 2019, Vol. E102D, No. 2

Pages: 355-363

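Subject classification: 0809 [Engineering: Electronic Science and Technology (engineering or science degree may be conferred)]; 08 [Engineering]; 0835 [Engineering: Software Engineering]; 0812 [Engineering: Computer Science and Technology (engineering or science degree may be conferred)]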

Funding: National Key Research and Development Plan [2016YFB0801203, 2016YFB0801200]; National Natural Science Foundation of China [11590770-4, U1536117, 11504406, 11461141004]; Key Science and Technology Project of the Xinjiang Uygur Autonomous Region [2016A03007-1]; Pre-research Project for Equipment of General Information System [JZX2017-0994/Y306]

Subjects: GRU LSTM neural network language model speech recognition

Abstract: The long short-term memory recurrent neural network (LSTM) has achieved tremendous success for automatic speech recognition (ASR). However, the complicated gating mechanism of LSTM introduces a massive computational cost and limits the application of LSTM in some scenarios. In this paper, we describe our work on accelerating the decoding speed and improving the decoding accuracy. First, we propose an architecture, which is called Projected Gated Recurrent Unit (PGRU), for ASR tasks, and show that the PGRU can consistently outperform the standard GRU. Second, to improve the PGRU generalization, particularly on large-scale ASR tasks, we propose the Output-gate PGRU (OPGRU). In addition, the time delay neural network (TDNN) and normalization methods are found beneficial for OPGRU. In this paper, we apply the OPGRU for both the acoustic model and recurrent neural network language model (RNN-LM). Finally, we evaluate the PGRU on the total Eval2000 / RT03 test sets, and the proposed OPGRU single ASR system achieves 0.9% / 0.9% absolute (8.2% / 8.6% relative) reduction in word error rate (WER) compared to our previous best LSTM single ASR system. Furthermore, the OPGRU ASR system achieves significant speed-up on both acoustic model and language model rescoring.
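
Note on the WER figures: the abstract quotes both absolute and relative reductions, and the link between the two can be sketched in a few lines of Python. This is only an illustration under stated assumptions. The function names are hypothetical, and the baseline WERs are not given in this record; they are back-computed from the quoted (rounded) numbers, so they are approximate.

```python
def relative_reduction(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction: the absolute drop divided by the baseline WER."""
    return (baseline_wer - new_wer) / baseline_wer


def implied_baseline(absolute_drop: float, relative_drop: float) -> float:
    """Baseline WER implied by an absolute drop and a relative drop.

    Assumption: both inputs are rounded figures, so the result is approximate.
    """
    return absolute_drop / relative_drop


# Figures quoted in the abstract (percentage points, then fractions).
eval2000_baseline = implied_baseline(0.9, 0.082)  # roughly 11% WER
rt03_baseline = implied_baseline(0.9, 0.086)      # roughly 10.5% WER

print(f"Implied Eval2000 baseline WER: {eval2000_baseline:.1f}%")
print(f"Implied RT03 baseline WER: {rt03_baseline:.1f}%")
```

The same 0.9-point absolute drop gives a slightly larger relative reduction on RT03 because the implied baseline there is lower.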
