ISBN: (Print) 9781538646595
There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard attention offers benefits over soft attention such as decreased computational cost, but training hard attention models can be difficult because of the discrete latent variables they introduce. Previous work used REINFORCE to approach these issues; however, it suffers from high-variance gradient estimates, resulting in slow convergence. In this paper, we tackle the problem of learning hard attention for a sequential task using variational inference methods, specifically the recently introduced Variational Inference for Monte Carlo Objectives (VIMCO) and Neural Variational Inference (NVIL). Furthermore, we propose a novel baseline that adapts VIMCO to this setting. We demonstrate our method on a phoneme recognition task in clean and noisy environments and show that our method outperforms REINFORCE, with the difference being greater for a more complicated task.
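The abstract gives no implementation details, and the paper's novel baseline is not reproduced here; as a rough, hypothetical sketch (function and variable names are ours), the snippet below computes the per-sample learning signals of the standard VIMCO leave-one-out baseline from K importance-weighted samples, the quantity that would scale the score-function gradient of a hard-attention distribution.

```python
import numpy as np

def logmeanexp(x):
    m = x.max()
    return m + np.log(np.mean(np.exp(x - m)))

def vimco_learning_signals(log_w):
    """Per-sample learning signals for a VIMCO-style leave-one-out baseline.

    log_w: shape (K,), log importance weights log p(x, h_k) - log q(h_k | x).
    Returns the multi-sample bound and the signal that multiplies
    grad log q(h_k | x) for each sampled alignment h_k.
    """
    K = log_w.shape[0]
    bound = logmeanexp(log_w)                 # log (1/K) sum_k w_k
    signals = np.empty(K)
    for k in range(K):
        others = np.delete(log_w, k)
        log_w_loo = log_w.copy()
        log_w_loo[k] = others.mean()          # geometric mean of the other weights
        signals[k] = bound - logmeanexp(log_w_loo)
    return bound, signals

# Example: 5 sampled attention alignments with random log-weights.
bound, signals = vimco_learning_signals(np.random.randn(5))
```

Compared with a single-sample REINFORCE estimator, each signal here is centered by the remaining samples, which is the mechanism behind the variance reduction the abstract refers to.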
ISBN: (Print) 9783030017156
Task-oriented dialog systems usually face the challenge of querying knowledge bases. However, it usually cannot be explicitly modeled due to the lack of annotation. In this paper, we introduce an explicit KB retrieval component (KB retriever) into the seq2seq dialogue framework. We first use the KB retriever to get the most relevant entry according to the dialogue history and KB, and then apply the copying mechanism to retrieve entities from the retrieved KB in decoding. Moreover, the KB retriever is trained with distant supervision, which does not need any annotation effort. Experiments on the Stanford Multi-turn Task-oriented Dialogue Dataset show that our framework significantly outperforms other sequence-to-sequence based baseline models on both automatic and human evaluation.
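The paper's exact architecture is not given here; the following is only a schematic sketch (all names and the inner-product retrieval scorer are assumptions) of the two-step idea: pick the single most relevant KB entry given a dialogue-history encoding, then mix a copy distribution over that entry's entities into the decoder's output distribution.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def retrieve_then_copy(dialog_vec, kb_row_vecs, row_entity_ids, gen_logits, copy_logits, p_gen):
    """Schematic: (1) a KB retriever scores each KB row against the dialogue-history
    encoding; (2) a copy mechanism spreads (1 - p_gen) probability mass over the
    vocabulary ids of the retrieved row's entities."""
    row_scores = kb_row_vecs @ dialog_vec            # one score per KB row
    best_row = int(np.argmax(row_scores))
    vocab_dist = softmax(gen_logits)                 # ordinary generation distribution
    copy_dist = np.zeros_like(vocab_dist)
    entities = row_entity_ids[best_row]              # vocab ids of that row's entities
    for ent_id, w in zip(entities, softmax(copy_logits[: len(entities)])):
        copy_dist[ent_id] += w
    return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

# Toy example: 3 KB rows, a vocabulary of size 10, 2 entities per row.
dist = retrieve_then_copy(
    dialog_vec=np.random.randn(4),
    kb_row_vecs=np.random.randn(3, 4),
    row_entity_ids=[[7, 2], [5, 9], [1, 3]],
    gen_logits=np.random.randn(10),
    copy_logits=np.random.randn(2),
    p_gen=0.8,
)
```

The returned mixture stays a valid distribution over the decoder vocabulary, so it can replace the usual softmax output at each decoding step.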
ISBN: (Print) 9781450349185
We present AliMe Assist, an intelligent assistant designed for creating an innovative online shopping experience in E-commerce. Based on question answering (QA), AliMe Assist offers assistance service, customer service, and chatting service. It is able to take voice and text input, incorporate context into QA, and support multi-round interaction. Currently, it serves millions of customer questions per day and is able to address 85% of them. In this paper, we demonstrate the system, present the underlying techniques, and share our experience in dealing with real-world QA in the E-commerce field.
ISBN: (Print) 9781510848764
A text-to-speech synthesis system typically consists of multiple stages, such as a text analysis frontend, an acoustic model and an audio synthesis module. Building these components often requires extensive domain expertise and may contain brittle design choices. In this paper, we present Tacotron, an end-to-end generative text-to-speech model that synthesizes speech directly from characters. Given <text, audio> pairs, the model can be trained completely from scratch with random initialization. We present several key techniques to make the sequence-to-sequence framework perform well for this challenging task. Tacotron achieves a 3.82 subjective 5-scale mean opinion score on US English, outperforming a production parametric system in terms of naturalness. In addition, since Tacotron generates speech at the frame level, it's substantially faster than sample-level autoregressive methods.
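Tacotron's actual architecture (CBHG encoder, attention decoder with a reduction factor, Griffin-Lim synthesis) is not reproduced here; the toy PyTorch module below only illustrates the general character-to-spectrogram seq2seq idea under teacher forcing, and every layer choice is an assumption made for illustration.

```python
import torch
import torch.nn as nn

class TinyCharToSpec(nn.Module):
    """Toy character-to-mel-spectrogram seq2seq (illustrative only, not Tacotron)."""
    def __init__(self, n_chars, n_mels=80, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(n_chars, hidden)
        self.encoder = nn.GRU(hidden, hidden, batch_first=True)
        self.decoder = nn.GRU(n_mels, hidden, batch_first=True)
        self.attn = nn.Linear(hidden, hidden)
        self.out = nn.Linear(2 * hidden, n_mels)

    def forward(self, chars, prev_frames):
        enc, _ = self.encoder(self.embed(chars))             # (B, T_text, H)
        dec, _ = self.decoder(prev_frames)                   # (B, T_spec, H)
        scores = self.attn(dec) @ enc.transpose(1, 2)        # (B, T_spec, T_text)
        context = torch.softmax(scores, dim=-1) @ enc        # attention over the text
        return self.out(torch.cat([dec, context], dim=-1))   # predicted mel frames

# Training would regress predicted frames onto ground-truth frames (e.g. nn.L1Loss())
# using <text, audio> pairs, with teacher forcing on prev_frames.
model = TinyCharToSpec(n_chars=40)
mels = model(torch.randint(0, 40, (2, 15)), torch.zeros(2, 50, 80))
```

Because the decoder emits whole spectrogram frames rather than individual audio samples, each step covers many samples of the final waveform, which is the frame-level speed advantage the abstract mentions.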
ISBN: (Print) 9781538636756
In this paper, we study how to improve sentence compression using sequence-to-sequence models in a cross-domain setting. Hypothesizing that syntax plays an important role in cross-domain sentence compression, we adopt an existing domain adaptation method and explore three auxiliary tasks closely related to syntax. Our experimental results demonstrate the effectiveness of our model.
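The abstract does not name the three syntactic auxiliary tasks or the adaptation method; the fragment below only sketches the generic pattern such setups typically reduce to, a weighted sum of the compression loss and the auxiliary losses computed on a shared encoder (names and weights are hypothetical).

```python
def multitask_loss(compression_loss, aux_losses, aux_weights):
    """Weighted combination of the main compression objective with auxiliary
    (e.g. syntax-related) objectives that share the same encoder."""
    total = compression_loss
    for loss, weight in zip(aux_losses, aux_weights):
        total = total + weight * loss
    return total

# e.g. multitask_loss(1.3, aux_losses=[0.7, 0.4, 0.9], aux_weights=[0.1, 0.1, 0.1])
```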
With the recent development of the sequence-to-sequence framework, the generation approach for short text conversation has become popular. The sequence-to-sequence method for short text conversation often suffers from the dull response problem. The multi-resolution generation approach has been introduced to address this problem by dividing the generation process into two steps: keywords-sequence generation and response generation. However, this method still tends to generate short and common responses. In this work, a new multi-resolution generation framework is proposed. Instead of using the word-level maximum likelihood criterion, we optimize the sequence-level GLEU score of the entire generated keywords-sequence using a policy gradient approach in reinforcement learning. Experiments show that the proposed approach can generate longer and more diverse responses. Moreover, it achieves a better score in human evaluation.
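No training code accompanies the abstract; as an assumed illustration (names and the baseline choice are ours), the snippet below shows sentence-level GLEU (the minimum of overall n-gram precision and recall, n = 1..4) and the REINFORCE-style surrogate loss that would use it as a sequence-level reward for a sampled keywords-sequence.

```python
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_gleu(hyp, ref, max_n=4):
    """Sentence-level GLEU: min of clipped n-gram precision and recall over n = 1..max_n."""
    matches = hyp_total = ref_total = 0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        matches += sum((h & r).values())
        hyp_total += sum(h.values())
        ref_total += sum(r.values())
    if hyp_total == 0 or ref_total == 0:
        return 0.0
    return min(matches / hyp_total, matches / ref_total)

def policy_gradient_loss(seq_log_prob, reward, baseline):
    """REINFORCE surrogate: raise the probability of keyword sequences whose
    GLEU reward beats the baseline, lower it otherwise."""
    return -(reward - baseline) * seq_log_prob

# Example: score a sampled keywords-sequence against the reference keywords.
r = sentence_gleu(["weather", "beijing", "rain"], ["weather", "rain", "beijing"])
loss = policy_gradient_loss(seq_log_prob=-4.2, reward=r, baseline=0.3)
```

Optimizing this sequence-level reward instead of word-level maximum likelihood is what lets the keyword generator trade likelihood for longer, more diverse keyword sequences.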