ISBN (Print): 9781538646595
There has recently been significant interest in hard attention models for tasks such as object recognition, visual captioning and speech recognition. Hard attention offers benefits over soft attention, such as decreased computational cost, but training hard attention models can be difficult because of the discrete latent variables they introduce. Previous work used REINFORCE to approach these issues; however, it suffers from high-variance gradient estimates, resulting in slow convergence. In this paper, we tackle the problem of learning hard attention for a sequential task using variational inference methods, specifically the recently introduced Variational Inference for Monte Carlo Objectives (VIMCO) and Neural Variational Inference (NVIL). Furthermore, we propose a novel baseline that adapts VIMCO to this setting. We demonstrate our method on a phoneme recognition task in clean and noisy environments and show that our method outperforms REINFORCE, with the difference being greater for a more complicated task.
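The leave-one-out baseline at the heart of VIMCO can be sketched in a few lines. The toy function below (a hypothetical re-implementation, not the authors' code) computes the multi-sample bound from a set of per-sample log-weights, and for each sample a learning signal in which that sample's weight is replaced by the geometric mean of the others:

```python
import math

def vimco_learning_signals(log_ws):
    """VIMCO-style per-sample learning signals (toy sketch).

    log_ws: log-weights log w_k, one per sampled discrete latent
    (e.g. a hard alignment). Returns the multi-sample bound
    L_hat = log(1/K * sum_k w_k) and, per sample j, the signal
    L_hat - L_hat^{-j}, where L_hat^{-j} replaces w_j with the
    geometric mean of the other K-1 weights.
    """
    K = len(log_ws)

    def logmeanexp(xs):
        m = max(xs)  # shift for numerical stability
        return m + math.log(sum(math.exp(x - m) for x in xs) / len(xs))

    L_hat = logmeanexp(log_ws)
    signals = []
    for j in range(K):
        others = [log_ws[k] for k in range(K) if k != j]
        log_w_hat = sum(others) / (K - 1)  # geometric mean, in log space
        baseline = logmeanexp(others + [log_w_hat])
        signals.append(L_hat - baseline)
    return L_hat, signals
```

With equal weights every signal is zero, and a sample whose weight exceeds the others gets a positive signal, which is what makes this baseline variance-reducing without extra learned parameters.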
ISBN (Print): 9783030017156
Task-oriented dialog systems usually face the challenge of querying knowledge bases (KBs); however, this querying usually cannot be explicitly modeled due to the lack of annotations. In this paper, we introduce an explicit KB retrieval component (KB retriever) into the seq2seq dialogue generation framework. We first use the KB retriever to get the most relevant entry according to the dialogue history and the KB, and then apply the copying mechanism to retrieve entities from the retrieved KB entry during decoding. Moreover, the KB retriever is trained with distant supervision, which does not need any annotation effort. Experiments on the Stanford Multi-turn Task-oriented Dialogue Dataset show that our framework significantly outperforms other sequence-to-sequence based baseline models on both automatic and human evaluation.
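The distant-supervision idea can be illustrated with a toy labeler: with no retrieval annotations available, the KB entry whose values overlap the gold response the most is taken as the pseudo-gold retrieval target. The function below is a hypothetical sketch, not the paper's implementation:

```python
def distant_label(kb_entries, gold_response):
    """Pick the index of the KB entry best supported by the gold response.

    kb_entries: list of dicts mapping column names to entity strings.
    gold_response: the reference response text. The entry whose values
    appear most often in the response becomes the pseudo-gold target
    for training the KB retriever -- no retrieval annotation needed.
    """
    def overlap(entry):
        # count entry values that occur verbatim in the response
        return sum(1 for v in entry.values() if v in gold_response)

    return max(range(len(kb_entries)), key=lambda i: overlap(kb_entries[i]))
```

At generation time, a copy mechanism would then restrict entity copying to the single retrieved entry rather than the whole KB, which is what keeps the generated entities consistent with one KB row.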
ISBN (Print): 9783030323806
Sentence compression is the task of compressing sentences containing redundant information into short semantic expressions, simplifying the text structure while retaining the important meaning. Neural network-based models are limited by the size of the context window and do not perform well on long-distance dependencies. To solve this problem, we introduce a version of the graph convolutional network (GCN) to utilize syntactic dependency relations, and explore a new way to combine GCNs with the sequence-to-sequence model (Seq2Seq) to complete the task. The model combines the advantages of both and achieves complementary effects. In addition, in order to reduce the error propagation of the parse tree, we dynamically adjust the dependency arcs to optimize the construction process of the graph. Experiments show that the model combined with the graph convolutional network is better than the original model, and performance on the Google sentence compression dataset is effectively improved.
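A single graph-convolution step over dependency arcs, of the kind that could sit on top of a Seq2Seq encoder, can be sketched as follows. This is a minimal NumPy toy assuming undirected arcs with self-loops and row-normalized aggregation, not the paper's exact formulation:

```python
import numpy as np

def gcn_layer(h, edges, w):
    """One graph-convolution step over dependency arcs (toy sketch).

    h: (n, d) token states, e.g. from a Seq2Seq encoder.
    edges: (head, dependent) dependency arcs, treated as undirected.
    w: (d, d) weight matrix.
    Each token averages itself and its syntactic neighbours, then
    applies a linear map and a ReLU.
    """
    n = h.shape[0]
    a = np.eye(n)                            # self-loops
    for head, dep in edges:                  # undirected message passing
        a[head, dep] = a[dep, head] = 1.0
    a = a / a.sum(axis=1, keepdims=True)     # row-normalise adjacency
    return np.maximum(a @ h @ w, 0.0)        # aggregate, project, ReLU
```

Because the adjacency comes from the parse rather than a fixed window, two tokens linked by a long-distance dependency exchange information in a single layer, which is the complementary strength the abstract describes.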
With the recent development of the sequence-to-sequence framework, the generation approach to short text conversation has become popular. The sequence-to-sequence method for short text conversation often suffers from the dull-response problem. The multi-resolution generation approach has been introduced to address this problem by dividing the generation process into two steps: keywords-sequence generation and response generation. However, this method still tends to generate short and common keywords-sequences. In this work, a new multi-resolution generation framework is proposed. Instead of using the word-level maximum likelihood criterion, we optimize the sequence-level GLEU score of the entire generated keywords-sequence using a policy gradient approach in reinforcement learning. Experiments show that the proposed approach can generate longer and more diverse keywords-sequences; moreover, it achieves a better score in human evaluation.
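The sequence-level GLEU reward can be sketched as the minimum of n-gram precision and recall between a sampled keywords-sequence and the reference; a policy-gradient (REINFORCE-style) update would then scale each sample's log-probability gradient by this reward minus a baseline. The scorer below is a hypothetical toy limited to bigrams, not the paper's evaluation code:

```python
from collections import Counter

def gleu(hyp, ref, max_n=2):
    """Sentence-level GLEU-style score (toy sketch).

    hyp, ref: token lists. Returns the minimum of n-gram precision
    and n-gram recall, pooled over orders 1..max_n. Usable as a
    sequence-level reward for a sampled keywords-sequence.
    """
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n])
                       for i in range(len(tokens) - n + 1))

    match = total_h = total_r = 0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        match += sum((h & r).values())   # clipped n-gram matches
        total_h += sum(h.values())
        total_r += sum(r.values())
    if total_h == 0 or total_r == 0:
        return 0.0
    return min(match / total_h, match / total_r)
```

Unlike word-level maximum likelihood, this reward scores the whole sampled sequence at once, so the policy gradient can credit longer, more diverse keywords-sequences directly.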