ISBN (print): 9781665400213
Modern-day conversational agents are trained to emulate the way humans communicate. To bond emotionally with the user, these virtual agents need to be aware of the user's affective state. Transformers are the recent state of the art in sequence-to-sequence learning, which involves training an encoder-decoder model with word embeddings from utterance-response pairs. We propose an emotion-aware transformer encoder that captures the emotional quotient of the user utterance in order to generate human-like empathetic responses. The contributions of our paper are as follows: 1) an emotion detector module trained on the input utterances determines the affective state of the user in the initial phase; 2) a novel transformer encoder is proposed that adds and normalizes the word embedding with the emotion embedding, thereby integrating the semantic and affective aspects of the input utterance; 3) the encoder and decoder stacks belong to the Transformer-XL architecture, which is the recent state of the art in language modeling. Experiments on the benchmark Facebook AI empathetic dialogue dataset confirm the efficacy of our model, which achieves higher BLEU-4 scores for the generated responses than existing methods. Emotionally intelligent virtual agents are now a reality, and the inclusion of affect as a modality in all human-machine interfaces is foreseen in the immediate future.
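The "add and normalize" step in contribution 2 can be illustrated with a minimal sketch. It assumes the emotion embedding has the same dimensionality as each word embedding and that normalization is a plain layer normalization without learned scale or shift; the function names and toy values are hypothetical and are not taken from the paper.

```python
import math

def layer_norm(vec, eps=1e-6):
    """Normalize a vector to zero mean and (approximately) unit variance."""
    mean = sum(vec) / len(vec)
    var = sum((x - mean) ** 2 for x in vec) / len(vec)
    return [(x - mean) / math.sqrt(var + eps) for x in vec]

def add_and_norm(word_emb, emotion_emb):
    """Add the emotion embedding to a word embedding, then layer-normalize."""
    summed = [w + e for w, e in zip(word_emb, emotion_emb)]
    return layer_norm(summed)

# Toy utterance of two tokens with 4-dimensional embeddings (made-up values).
word_embeddings = [[0.2, -0.1, 0.4, 0.3], [0.0, 0.5, -0.2, 0.1]]
# One emotion embedding for the whole utterance (assumption for this sketch).
emotion_embedding = [0.1, 0.1, -0.3, 0.2]

fused = [add_and_norm(w, emotion_embedding) for w in word_embeddings]
```

Each fused vector then carries both the semantic signal of the token and the affective signal of the detected emotion, which is the integration the abstract describes.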
ISBN (print): 9783030638351; 9783030638368
For time series forecasting, the weight distribution among multiple variables and the long- and short-term time dependence are both important and challenging. Traditional machine forecasting cannot automatically select the effective features of multivariable input and cannot capture the time dependence of sequences. The key to solving this problem is to capture the spatial correlations at the same time step, the spatiotemporal relationships at different time steps, and the long-term dependence of the temporal relationships between different series. In this paper, inspired by the human attention mechanism and the encoder-decoder model, we propose a DPAST-based RNN (DPAST-RNN) for long-term time series prediction. Specifically, in the first phase we use an attention mechanism to adaptively extract relevant features at each time step, and then use stacked LSTM units to extract hidden information of the time series along both the time and space dimensions. In the second phase, we use another attention mechanism to select the encoder hidden states related to the decoder hidden state at the current time step and form a context vector, which is embedded into the recurrent neural network in the decoder. Thorough empirical studies based on the VM-Power dataset we collected on OpenStack and the NASDAQ 100 Stock dataset demonstrate that DPAST-RNN can outperform state-of-the-art methods for time series prediction.
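The second-phase attention, which weights encoder hidden states against the current decoder state to form a context vector, can be sketched as follows. This is only an illustration: it assumes dot-product scoring, whereas the abstract does not say which scoring function DPAST-RNN uses, and all names and values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def context_vector(encoder_states, decoder_state):
    """Weight encoder hidden states by their relevance to the decoder state."""
    # Dot-product relevance score per encoder time step (assumed scoring).
    scores = [sum(h_j * d_j for h_j, d_j in zip(h, decoder_state))
              for h in encoder_states]
    weights = softmax(scores)
    dim = len(decoder_state)
    # Context vector: attention-weighted sum of the encoder hidden states.
    context = [sum(w * h[j] for w, h in zip(weights, encoder_states))
               for j in range(dim)]
    return weights, context

# Three encoder time steps with 2-dimensional hidden states (toy values).
encoder_states = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
decoder_state = [1.0, 1.0]
weights, context = context_vector(encoder_states, decoder_state)
```

In this toy input the third encoder state aligns best with the decoder state, so it receives the largest attention weight and dominates the context vector.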
ISBN (print): 9783030863319
Encoder-decoder models have recently made great progress on handwritten mathematical expression recognition. However, it is still a challenge for existing methods to assign attention to image features accurately. Moreover, those encoder-decoder models usually adopt RNN-based models in their decoder part, which makes them inefficient in processing long LaTeX sequences. In this paper, a transformer-based decoder is employed to replace RNN-based ones, which makes the whole model architecture very concise. Furthermore, a novel training strategy is introduced to fully exploit the potential of the transformer in bidirectional language modeling. Compared to several methods that do not use data augmentation, experiments demonstrate that our model improves the ExpRate of current state-of-the-art methods on CROHME 2014 by 2.23%. Similarly, on CROHME 2016 and CROHME 2019, we improve the ExpRate by 1.92% and 2.28%, respectively.
ISBN (print): 9783319989327; 9783319989310
Medical Concept Coding (MCD) is a crucial task in biomedical information extraction. Recent advances in neural network modeling have demonstrated its usefulness in natural language processing tasks. The modern framework of sequence-to-sequence learning, which was initially used for recurrent neural networks, has been shown to provide a powerful solution to tasks such as Named Entity Recognition or Medical Concept Coding. We address the identification of clinical concepts within the International Classification of Diseases version 10 (ICD-10) in two benchmark data sets of death certificates provided for Task 1 of the CLEF eHealth shared task 2017. The proposed architecture combines ideas from recurrent neural networks and traditional text retrieval term weighting schemes. We found that our models reach F-measures of 75% and 86% on the CepiDc corpus of French texts and on the CDC corpus of English texts, respectively. The proposed models can be employed for coding electronic medical records with ICD codes, including diagnosis and procedure codes.
ISBN (print): 9781728163956
Autonomous vehicles need the ability to predict the trajectories of surrounding vehicles in order to make rational planning decisions and to improve driving safety and ride comfort. In this paper, a new hierarchical Long Short-Term Memory (LSTM) model based on a Spatio-Temporal (ST) graph is proposed for vehicle trajectory prediction. Our ST-LSTM uses three layers of different LSTMs to capture spatial, temporal, and trajectory information, and is organized as a whole as an LSTM-based encoder-decoder model, which is capable of accurately predicting future trajectories for vehicles on the highway. Our model is trained and validated on the publicly available NGSIM US101 and I-80 datasets. In comparison to state-of-the-art methods, our method achieves more accurate trajectory predictions over a 5 s time horizon.
ISBN (print): 9781450356404
We propose a novel Frequently Asked Question (FAQ) retrieval technique with a neural query expansion model. With the growth in Question Answering systems and mobile communications, FAQ retrieval systems have become widely used in site searches and call center support. However, FAQ retrieval often suffers from lexical gaps between queries and answer documents. To bridge these gaps, we design a query expansion model based on an encoder-decoder model, a type of deep neural network. Using Q&A pair documents, the model learns which words appear in the answers to given questions, and it generates expanded queries from input queries to retrieve answer documents. We evaluate our proposed technique in a multi-domain FAQ retrieval task. Experimental results show that our technique retrieves FAQs more accurately than previous methods.
ISBN (print): 9781665473507
In recent years, the automatic generation of natural language descriptions of video has become a focus of research in deep learning and natural language processing. Video understanding has multiple applications, such as video search and indexing, but video captioning remains a sophisticated problem given the complex and diverse types of video content. Bridging video and natural language remains an open issue, both for understanding the video better and for devising methods that generate descriptions automatically. Deep learning methods have become a major direction in video processing thanks to their performance and high-speed computing capabilities. This survey discusses an end-to-end encoder-decoder network based on a deep learning approach to generate captions. In this paper we describe the model, the dataset, and the parameters used to evaluate the model.
ISBN (print): 9781728119854
This paper proposes a novel two-factor attention-based encoder-decoder model (TwoFactorencoderdecoder) for multivariate weather prediction. The proposed model learns attention weights from two factors, namely temporal information and information inferred from prior knowledge. Here, temporal information contains the change patterns hidden in observed time series data, while the information inferred from prior knowledge covers various types of meteorological observations in weather forecasting. The attention weights of the two factors are used to select the intermediate outputs of the encoder, and the selected result is then combined with the information inferred from prior knowledge to forecast the weather in a more effective way. In addition, this paper proposes a loss function for multivariate prediction. Compared with the Mean Square Error (MSE) loss function, the proposed loss function can fit small variances more accurately in multivariate prediction. Compared with attention models that use only temporal information or only the information inferred from prior knowledge, the proposed TwoFactorencoderdecoder model shows encouraging improvements in prediction accuracy on the public weather forecasting dataset: the MAPE of t2m improves by 5.42%, the MAPE of rh2m improves by 2.92%, and the MAPE of w2m improves by 1.67%, which shows the effect of the two-factor attention mechanism. Source code for the complete system will be available at https://***/YuanMLer/TFAencoderdecoder.
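For readers unfamiliar with the two metrics named above, the following sketch computes MSE and MAPE under their standard definitions. It does not reproduce the paper's proposed loss function, which the abstract does not specify; the function names and sample values are hypothetical.

```python
def mse(y_true, y_pred):
    """Mean Square Error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error, in percent. Assumes no true value is zero."""
    return 100.0 * sum(abs(t - p) / abs(t)
                       for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy observations and forecasts (made-up values, not from the dataset).
observed = [100.0, 200.0]
forecast = [110.0, 180.0]

mse_value = mse(observed, forecast)    # dominated by the larger absolute error
mape_value = mape(observed, forecast)  # each point contributes a 10% relative error
```

Because MSE is driven by absolute error magnitudes, variables with small variance contribute little to it, which is consistent with the abstract's motivation for a different loss in the multivariate setting.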
ISBN (print): 9783030861599; 9783030861582
We present a temporal classification constraint as an auxiliary learning method for improving the recognition of Handwritten Mathematical Expressions (HMEs). Connectionist temporal classification (CTC) is used to learn the temporal alignment between the input feature sequence and the corresponding symbol label sequence. The CTC alignment is trained jointly with the encoder-decoder alignment through a combination of the CTC loss and the encoder-decoder loss, which improves the feature learning of the encoder in the encoder-decoder model. We show the effectiveness of the approach in improving symbol classification and expression recognition on the CROHME datasets.
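One common way to write such a combined objective is a weighted sum. The abstract states only that the two losses are combined, so the weighting factor $\lambda$ and the symbol names below are assumptions made for illustration, not the authors' stated formulation:

```latex
\mathcal{L}_{\text{total}} = \mathcal{L}_{\text{enc-dec}} + \lambda \, \mathcal{L}_{\text{CTC}}, \qquad \lambda \ge 0
```

Here $\mathcal{L}_{\text{enc-dec}}$ denotes the encoder-decoder loss and $\mathcal{L}_{\text{CTC}}$ the CTC loss computed on the encoder's feature sequence; with $\lambda = 0$ the objective reduces to the plain encoder-decoder loss.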
ISBN (print): 9798350389968
In recent years, coding metasurfaces have become a research hotspot in the field of metasurfaces. In the traditional reverse design of coding metasurfaces, genetic algorithms (GA) combined with forward prediction networks are commonly used for design optimization, but the search process requires a substantial amount of time, especially for large coding metasurfaces. To overcome the time cost of genetic algorithms and to meet diverse design needs, we propose a multi-branched encoder-decoder solution. In addition, we use Conditional Generative Adversarial Networks (cGAN) to optimize the model. While solving the non-unique mapping problem, our method removes the need for a complicated search process and can directly provide coding metasurface designs that meet specific targets, greatly improving design efficiency. The method has been verified in our closed-loop simulation, providing an effective solution for the reverse design of coding metasurfaces.