ISBN (print): 9781728198293
In the Internet of Things (IoT), each object requires a unique identifier to identify itself and index its detailed profile, supporting mutual recognition among multiple objects. However, existing IoT identifiers belong to different identification schemes and are heterogeneous, which creates a great challenge for applications that need to resolve them. To address this challenge, we propose an algorithm to automatically recognize the heterogeneous identification schemes used by various IoT identifiers, based on a sequence-to-sequence (seq2seq) model consisting of an encoder and a decoder. The encoder uses one Long Short-Term Memory (LSTM) network to map the identifier sequence to a vector of fixed dimensionality, and the decoder uses another LSTM to unfold the vector into a target sequence representing the identification scheme of the identifier. To evaluate our algorithm, we create a new dataset named ID-20 with 20 categories of IoT identifiers and conduct experiments on it. The results demonstrate the superiority of our algorithm over other state-of-the-art methods, with an identifier recognition accuracy of up to 94.57%.
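For illustration, a minimal PyTorch sketch of the seq2seq recognizer this abstract describes: one LSTM encodes the identifier into a fixed-size vector, and another LSTM unfolds it into a scheme-name sequence. The vocabulary sizes, hidden dimensionality, and character-level tokenization are assumptions for the example, not details from the paper.

    import torch
    import torch.nn as nn

    class Seq2SeqRecognizer(nn.Module):
        """LSTM encoder maps an identifier (as a character-ID sequence)
        to a fixed vector; LSTM decoder unfolds that vector into a
        sequence naming the identification scheme."""
        def __init__(self, src_vocab=128, tgt_vocab=64, hidden=256):
            super().__init__()
            self.src_emb = nn.Embedding(src_vocab, hidden)
            self.tgt_emb = nn.Embedding(tgt_vocab, hidden)
            self.encoder = nn.LSTM(hidden, hidden, batch_first=True)
            self.decoder = nn.LSTM(hidden, hidden, batch_first=True)
            self.out = nn.Linear(hidden, tgt_vocab)

        def forward(self, identifier_ids, target_ids):
            _, state = self.encoder(self.src_emb(identifier_ids))       # fixed-size (h, c)
            dec_out, _ = self.decoder(self.tgt_emb(target_ids), state)  # teacher forcing
            return self.out(dec_out)                                    # logits over scheme tokens

    model = Seq2SeqRecognizer()
    ids = torch.randint(0, 128, (2, 20))  # a batch of 2 identifiers
    tgt = torch.randint(0, 64, (2, 5))    # scheme-name token sequences
    print(model(ids, tgt).shape)          # torch.Size([2, 5, 64])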
Automatic text summarization is the process of using machine power to process large bodies of text and produce a brief, refined summary. Text summarization can be classified into two classes: extractive and abstractive. The essence of the extractive approach is a selection problem, which picks significant sentences from the input text according to various evaluation measures, while the abstractive approach requires a deep semantic and discourse understanding of the text to generate a concise rephrased summary. Deep learning provides a feasible framework for building an abstractive summarizer. Recurrent Neural Network (RNN) based sequence-to-sequence learning has made remarkable progress on various natural language processing tasks. The proposed model principally consists of two parts, an encoder and a decoder, each of which is an RNN. The ROUGE evaluation metric is used to assess the similarity between the actual and the generated summary.
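As a concrete view of the evaluation step mentioned above, here is a minimal ROUGE-1 F1 computation in plain Python. This is a simplification: the published ROUGE toolkit also covers stemming, longer n-grams, and the ROUGE-L variant.

    from collections import Counter

    def rouge1_f1(reference: str, candidate: str) -> float:
        """Unigram-overlap ROUGE-1 F1 between a reference summary and a
        generated one (whitespace tokenization for simplicity)."""
        ref, cand = Counter(reference.split()), Counter(candidate.split())
        overlap = sum((ref & cand).values())  # clipped unigram matches
        if overlap == 0:
            return 0.0
        recall = overlap / sum(ref.values())
        precision = overlap / sum(cand.values())
        return 2 * precision * recall / (precision + recall)

    print(rouge1_f1("the cat sat on the mat", "the cat lay on the mat"))  # ~0.833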
ISBN (print): 9781728180847
Most NLP research fields (translation, classification, dialogue systems, etc.) have been revolutionized by the rise of deep learning methods, which rely on new dense, low-dimensional feature representations. In this article we present the basic training techniques for word embeddings as well as recent work on abstractive neural summarizers. We also introduce our trained French word embeddings, subsequently used as the embedding layer to implement our baseline French neural summarizer for the headline generation task, using the RNN (Recurrent Neural Network) encoder-decoder architecture.
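A sketch of how pretrained word vectors become the embedding layer of such a summarizer, in PyTorch. The tiny two-word "French vocabulary" and the randomly generated 300-dimensional vectors are placeholders for the trained embeddings the article refers to.

    import numpy as np
    import torch
    import torch.nn as nn

    # Placeholder pretrained French word vectors: {word: 300-d vector}.
    pretrained = {"le": np.random.rand(300), "chat": np.random.rand(300)}
    vocab = {w: i for i, w in enumerate(pretrained)}

    weights = torch.tensor(np.stack([pretrained[w] for w in vocab]),
                           dtype=torch.float32)
    # freeze=False lets the encoder-decoder fine-tune the embeddings.
    embedding = nn.Embedding.from_pretrained(weights, freeze=False)

    tokens = torch.tensor([vocab["le"], vocab["chat"]])
    print(embedding(tokens).shape)  # torch.Size([2, 300])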
ISBN (digital): 9781728169262
ISBN (print): 9781728169262
Question Generation (QG) aims to automatically construct questions from given text and has recently received wide attention. Mainstream methods are still based on fixed-order sequence generation with the Seq2Seq model, and few consider the influence of generation order on the result. In this paper, we present a novel Reinforced Attention Decoder (RAD) neural network for the QA-SRL task. First, our model draws on the idea of a reinforcement learning algorithm with a baseline, using the accuracy of each slot and sentence as the reward and updating the policy network to predict the optimal generation order. Second, we apply an attention mechanism on the baseline to obtain more relevant information about the entire sentence. Additional experiments show that distilling knowledge from the RAD (teacher) model to guide training of the RAD-Reborn (student) model achieves better performance. Extensive experiments on QA-SRL Bank 2.0 show that our model outperforms previous systems on all evaluation metrics. In particular, the EM (Exact Match) metric increases significantly, by over 3%.
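A schematic of the "REINFORCE with baseline" update this approach builds on, in PyTorch. The linear policy network, feature dimensions, and reward values are placeholders; in the paper the reward is the per-slot and per-sentence accuracy and the policy predicts the generation order.

    import torch
    import torch.nn as nn

    policy = nn.Linear(128, 10)  # placeholder policy network
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    state = torch.randn(4, 128)  # batch of sentence encodings
    dist = torch.distributions.Categorical(logits=policy(state))
    action = dist.sample()       # sampled generation-order decision

    # Placeholder rewards (e.g. slot accuracy); subtracting the batch
    # mean as a baseline reduces the variance of the gradient.
    reward = torch.tensor([1.0, 0.0, 1.0, 1.0])
    advantage = reward - reward.mean()

    loss = -(dist.log_prob(action) * advantage).mean()
    opt.zero_grad(); loss.backward(); opt.step()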
ISBN (print): 9781728163956
One of the key challenges in no-reference video quality assessment (NR-VQA) is the absence of the reference video to measure the similarity or difference between the distorted video and the original one. In this paper, an encoder-decoder model is proposed to predict pixel-by-pixel similarity maps from the distorted video. The model takes multiple frames as input since correlated pixels of adjacent frames can be exploited to recover the similarity map of the middle frame of the distorted video clip. In addition, to further exploit the temporal perception mechanism of the human visual system (HVS), which is relevant to the perceptual video distortion measurement, visual persistence and temporal memory effects are considered in the spatio-temporal pooling network design. Experimental results demonstrate that our proposed method outperforms state-of-the-art NR-VQA metrics.
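A minimal sketch of such a frame-stacked encoder-decoder in PyTorch: it takes T adjacent distorted frames and predicts a per-pixel similarity map for the middle frame. Layer counts, channel widths, and the sigmoid output range are assumptions for the example, and the spatio-temporal pooling network is omitted.

    import torch
    import torch.nn as nn

    class SimMapNet(nn.Module):
        """Encoder-decoder from T stacked grayscale frames to one
        similarity map for the middle frame."""
        def __init__(self, t_frames=5):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(t_frames, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1),
                nn.Sigmoid())  # per-pixel similarity in [0, 1]

        def forward(self, frames):  # frames: (B, T, H, W)
            return self.decoder(self.encoder(frames))

    net = SimMapNet()
    clip = torch.rand(1, 5, 64, 64)  # 5 adjacent distorted frames
    print(net(clip).shape)           # torch.Size([1, 1, 64, 64])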
ISBN (print): 9781509066315
In this paper we present an end-to-end speech recognition model with Transformer encoders that can be used in a streaming speech recognition system. Transformer computation blocks based on self-attention are used to encode both audio and label sequences independently. The activations from both audio and label encoders are combined with a feed-forward layer to compute a probability distribution over the label space for every combination of acoustic frame position and label history. This is similar to the Recurrent Neural Network Transducer (RNN-T) model, which uses RNNs for information encoding instead of Transformer encoders. The model is trained with the RNN-T loss, which is well suited to streaming decoding. We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy. We also show that the full-attention version of our model beats the state-of-the-art accuracy on the LibriSpeech benchmarks. Our results also show that we can bridge the gap between the full-attention and limited-attention versions of our model by attending to a limited number of future frames.
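The limited-context idea can be illustrated with a plain attention mask; the window sizes below are arbitrary, not the paper's configuration.

    import torch

    def limited_context_mask(seq_len, left=10, right=2):
        """Boolean self-attention mask: position i may attend to
        positions [i-left, i+right]. True marks disallowed pairs,
        matching torch.nn.MultiheadAttention's attn_mask convention."""
        i = torch.arange(seq_len).unsqueeze(1)
        j = torch.arange(seq_len).unsqueeze(0)
        return (j < i - left) | (j > i + right)

    mask = limited_context_mask(6, left=2, right=1)
    print(mask.int())  # position 3 attends only to positions 1..4

Bounding the left context keeps per-frame attention cost constant for streaming, and the small right window corresponds to the "limited number of future frames" used to close the accuracy gap.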
Convolutional Neural Networks are among the most commonly used methods for automatic prostate segmentation. However, few studies focus on the segmentation of the two main zones of the prostate: the central gland and the peripheral zone. This work proposes and evaluates two models for 2D semantic segmentation of these two zones. The first model (Model-A) uses an encoder-decoder architecture based on a global U-net and a local U-net. The global U-net segments the whole prostate, whereas the local U-net segments the central gland. The peripheral zone is obtained by subtracting the central gland from the whole prostate. The second model (Model-B) uses an encoder-classifier architecture based on the VGG16 network. Model-B performs segmentation by classifying each pixel of a Magnetic Resonance Image (MRI) into three categories: background, central gland, and peripheral zone. Both models are tested using MRIs from the NCI-ISBI 2013 Challenge dataset. The experimental results show superior segmentation performance for Model-A, the encoder-decoder architecture (DSC = 96.79% ± 0.15% and IoU = 93.79% ± 0.29%), compared to Model-B, the encoder-classifier architecture (DSC = 92.50% ± 1.19% and IoU = 86.13% ± 2.02%).
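Model-A's subtraction rule is simple enough to state in a few lines of Python with NumPy; the 4x4 binary masks below are toy stand-ins for the two U-net outputs.

    import numpy as np

    # Toy outputs: global U-net -> whole prostate, local U-net -> central gland.
    whole_prostate = np.array([[0, 1, 1, 0],
                               [1, 1, 1, 1],
                               [1, 1, 1, 1],
                               [0, 1, 1, 0]], dtype=bool)
    central_gland = np.array([[0, 0, 0, 0],
                              [0, 1, 1, 0],
                              [0, 1, 1, 0],
                              [0, 0, 0, 0]], dtype=bool)

    # Peripheral zone = whole prostate minus central gland.
    peripheral_zone = whole_prostate & ~central_gland
    print(peripheral_zone.astype(int))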
ISBN (print): 9781509066315
In this paper, we propose a neural search algorithm to select the most likely hypothesis using a sequence of acoustic representations and multiple hypotheses as input. The algorithm provides a sequence-level score for each audio-hypothesis pair that is obtained by integrating information from multiple sources, such as the input acoustic representations, N-best hypotheses, additional 1st-pass statistics, and unpaired textual information through an external language model. These scores are then used to map the search problem of identifying the most likely hypothesis to a sequence classification problem. The definition of the proposed algorithm is broad enough to allow its use as an alternative to beam search in the 1st pass or as a 2nd-pass rescoring step. This algorithm achieves up to 12% relative reductions in Word Error Rate (WER) across several languages over state-of-the-art baselines with relatively few additional parameters. We also propose the use of a binary classifier gating function that can learn to trigger the 2nd-pass neural search model when the 1-best hypothesis is not the oracle hypothesis, thereby avoiding extra computation.
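A sketch of the gating idea in PyTorch: a small binary classifier looks at 1st-pass statistics and decides whether to trigger the 2nd-pass neural search. The four input features and the 0.5 threshold are assumptions for the example.

    import torch
    import torch.nn as nn

    # Placeholder gate over 1st-pass features (e.g. score margin,
    # confidence statistics); outputs P(1-best is not the oracle).
    gate = nn.Sequential(nn.Linear(4, 16), nn.ReLU(),
                         nn.Linear(16, 1), nn.Sigmoid())

    first_pass_feats = torch.tensor([[0.2, 1.5, 0.9, 3.0]])
    if gate(first_pass_feats).item() > 0.5:
        print("trigger 2nd-pass neural search")  # rescore the N-best list
    else:
        print("keep 1-best hypothesis")          # skip the extra computation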
ISBN (print): 9781450377737
We present SL-ReDu, a recently commenced innovative project that aims to exploit deep-learning progress to advance the state-of-the-art in video-based automatic recognition of Greek Sign Language (GSL), while focusing on the use-case of GSL education as a second language. We first briefly overview the project goals, focal areas, and timeline. We then present our initial deep learning-based approach for GSL recognition that employs efficient visual tracking of the signer hands, convolutional neural networks for feature extraction, and attention-based encoder-decoder sequence modeling for sign prediction. Finally, we report experimental results for small-vocabulary, isolated GSL recognition on the single-signer "Polytropon" corpus. To our knowledge, this work constitutes the first application of deep-learning techniques to GSL.
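As a sketch of the attention step in such an encoder-decoder, the snippet below weights per-frame CNN features by their relevance to the current decoder state; all dimensions and the dot-product scoring are illustrative assumptions, not the project's exact design.

    import torch
    import torch.nn as nn

    frames = torch.randn(1, 40, 512)  # CNN features for 40 video frames
    dec_state = torch.randn(1, 512)   # current decoder hidden state

    attn = nn.Linear(512, 512)        # learned projection for scoring
    scores = torch.bmm(frames, attn(dec_state).unsqueeze(2))  # (1, 40, 1)
    weights = torch.softmax(scores, dim=1)   # attention over frames
    context = (weights * frames).sum(dim=1)  # (1, 512) attended context
    print(context.shape)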
The abstractive dialogue summarization task has recently been gaining a lot of attention from researchers. Unlike news articles and documents with well-structured text, dialogue differs in that it often involves two or more interlocutors exchanging information with each other, and it has an inherent hierarchical structure based on the sequence of utterances by different speakers. This paper proposes a simple but effective hybrid approach that consists of two modules and uses transfer learning, leveraging pretrained language models (PLMs) to generate an abstractive summary. The first module highlights important utterances, capturing the utterance-level relationship by adapting an auto-encoding model such as BERT using an unsupervised or supervised method. The second module then generates a concise abstractive summary by adapting encoder-decoder models such as T5, BART, and PEGASUS. Experimental results on benchmark datasets show that our approach achieves state-of-the-art performance by adapting to dialogue scenarios and can also be helpful in low-resource settings for domain adaptation.
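A minimal sketch of the two-module flow using the Hugging Face transformers library (downloads a pretrained model on first run). The length-based utterance filter is only a stand-in for the BERT-based highlighting module, and the dialogue is invented.

    from transformers import pipeline

    dialogue = [
        "Anna: Are we still on for dinner tonight?",
        "Ben: Yes! 7pm at the Italian place.",
        "Anna: Great, see you there.",
    ]

    # Module 1 (stand-in): keep the more informative utterances; the
    # paper adapts a BERT-style auto-encoding model for this step.
    highlighted = [u for u in dialogue if len(u.split()) > 5]

    # Module 2: abstractive summary from a pretrained encoder-decoder
    # (BART here; T5 or PEGASUS plug in the same way).
    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
    summary = summarizer(" ".join(highlighted), max_length=30, min_length=5)
    print(summary[0]["summary_text"])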